
Self-hosted: make the local embedding model configurable (hard-coded bge-base-en-v1.5 breaks non-English recall) #1104

@klapom


Problem

The self-hosted server hard-codes the local embedding model to Xenova/bge-base-en-v1.5. The strings in the binary show only pool/batch/thread knobs (SUPERMEMORY_LOCAL_EMBEDDING_POOL_SIZE, …_BATCH_SIZE, …_WASM_THREADS, …_IDLE_TIMEOUT_MS) and no model option.

bge-base-en-v1.5 is English-only. For non-English deployments this makes recall effectively unusable:

  • German memory "P78-Smoke: Klaus' Lieblings-Testfrucht ist die Physalis (Geheimcode PHYSALIS-88)." (status done, fresh v0.0.2 store)
  • German query, English query, and even the verbatim token PHYSALIS-88 → {"results":[],"total":0}; lowering the threshold as far as 0.01 has no effect
  • Identical setup with English content + English query → found with score 0.83

So on the same healthy store: English in/out works, German content is unreachable. (We are running a fully-local German-language agent platform — every memory the agents write is German.)

Proposal

Add an env override, e.g.:

SUPERMEMORY_LOCAL_EMBEDDING_MODEL=Xenova/bge-m3        # or any transformers.js-compatible model id
SUPERMEMORY_LOCAL_EMBEDDING_DIMENSIONS=1024            # if it cannot be inferred
  • Multilingual candidates that work with transformers.js/ONNX today: Xenova/bge-m3, Xenova/multilingual-e5-base, Xenova/paraphrase-multilingual-MiniLM-L12-v2.
  • The model cache directory mechanism already exists (<data>/models/Xenova/…), so this is mostly plumbing the model id + dimension through worker init and the vector schema.
  • Docs should note that changing the model requires re-ingestion, since embeddings from different models are not comparable. A reindex command would cover that (same ask as #1103, "Self-hosted v0.0.2: search always returns total:0 on a store upgraded from v0.0.1 (fresh store works)").
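
To make the plumbing concrete, here is a minimal TypeScript sketch of how the two proposed variables could be resolved and how a model change could be detected. Everything in it is an assumption for illustration: the function names `resolveEmbeddingConfig` and `needsReindex`, the `StoredEmbeddingInfo` record, and the idea that the store remembers which model produced its vectors are hypothetical and not taken from the Supermemory codebase. Only the two env variable names and the default model id come from this issue.

```typescript
// Hypothetical sketch: these names are NOT from the Supermemory codebase.

/** Resolved settings for the local embedding worker. */
interface EmbeddingConfig {
  modelId: string;
  /** undefined means "infer the dimension from the model at load time". */
  dimensions: number | undefined;
}

/** Assumed metadata a store would keep about the vectors it already holds. */
interface StoredEmbeddingInfo {
  modelId: string;
  dimensions: number;
}

type Env = Record<string, string | undefined>;

// The model the issue reports as currently hard-coded.
const DEFAULT_MODEL_ID = "Xenova/bge-base-en-v1.5";

/** Read the proposed overrides, falling back to today's behaviour when unset. */
function resolveEmbeddingConfig(env: Env): EmbeddingConfig {
  const rawModel = env["SUPERMEMORY_LOCAL_EMBEDDING_MODEL"]?.trim();
  const modelId = rawModel ? rawModel : DEFAULT_MODEL_ID;

  const rawDims = env["SUPERMEMORY_LOCAL_EMBEDDING_DIMENSIONS"]?.trim();
  if (!rawDims) {
    return { modelId, dimensions: undefined };
  }
  // Accept only plain positive integers such as "1024".
  if (!/^[1-9]\d*$/.test(rawDims)) {
    throw new Error(
      `SUPERMEMORY_LOCAL_EMBEDDING_DIMENSIONS must be a positive integer, got "${rawDims}"`
    );
  }
  return { modelId, dimensions: Number(rawDims) };
}

/**
 * Decide whether existing vectors can still be searched with the configured
 * model. Embeddings from different models are not comparable, so any change
 * of model id (or of an explicitly configured dimension) requires re-ingestion.
 */
function needsReindex(
  stored: StoredEmbeddingInfo | undefined,
  configured: EmbeddingConfig
): boolean {
  if (stored === undefined) {
    return false; // fresh store, nothing to re-embed
  }
  if (stored.modelId !== configured.modelId) {
    return true;
  }
  return (
    configured.dimensions !== undefined &&
    configured.dimensions !== stored.dimensions
  );
}
```

In a Node process this would presumably be called as `resolveEmbeddingConfig(process.env)` during worker init. Refusing to start, or at least logging a warning, when `needsReindex` returns true would keep a model switch from silently producing empty results.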

Workarounds we considered (and why they're poor)

  • Translating memory content to English at write time: corrupts the canonical memory text permanently.
  • Storing bilingual content: doubles size and pollutes extraction.

Happy to test an arm64 build with a multilingual model; we have a 100% German repro on hand.
