{"id":633,"date":"2026-03-21T22:52:57","date_gmt":"2026-03-21T14:52:57","guid":{"rendered":"https:\/\/pa.yingzhi8.cn\/index.php\/2026\/03\/21\/reference-memory-config\/"},"modified":"2026-03-21T23:30:57","modified_gmt":"2026-03-21T15:30:57","slug":"reference-memory-config","status":"publish","type":"post","link":"https:\/\/pa.yingzhi8.cn\/index.php\/2026\/03\/21\/reference-memory-config\/","title":{"rendered":"\u5185\u5b58\u914d\u7f6e\u53c2\u8003"},"content":{"rendered":"<h1>Memory configuration reference<\/h1>\n<p>This page covers the full configuration surface for OpenClaw memory search. For<br \/>\nthe conceptual overview (file layout, memory tools, when to write memory, and the<br \/>\nautomatic flush), see <a href=\"\/concepts\/memory\">Memory<\/a>.<\/p>\n<h2>Memory search defaults<\/h2>\n<ul>\n<li>Enabled by default.<\/li>\n<li>Watches memory files for changes (debounced).<\/li>\n<li>Configure memory search under <code>agents.defaults.memorySearch<\/code> (not top-level<br \/>\n  <code>memorySearch<\/code>).<\/li>\n<li>Uses remote embeddings by default. If <code>memorySearch.provider<\/code> is not set, OpenClaw auto-selects:<br \/>\n  1. <code>local<\/code> if a <code>memorySearch.local.modelPath<\/code> is configured and the file exists.<br \/>\n  2. <code>openai<\/code> if an OpenAI key can be resolved.<br \/>\n  3. <code>gemini<\/code> if a Gemini key can be resolved.<br \/>\n  4. <code>voyage<\/code> if a Voyage key can be resolved.<br \/>\n  5. <code>mistral<\/code> if a Mistral key can be resolved.<br \/>\n  6. Otherwise memory search stays disabled until configured.<\/li>\n<li>Local mode uses node-llama-cpp and may require <code>pnpm approve-builds<\/code>.<\/li>\n<li>Uses sqlite-vec (when available) to accelerate vector search inside SQLite.<\/li>\n<li><code>memorySearch.provider = \"ollama\"<\/code> is also supported for local\/self-hosted<br \/>\n  Ollama embeddings (<code>\/api\/embeddings<\/code>), but it is not auto-selected.<\/li>\n<\/ul>\n<p>Remote embeddings <strong>require<\/strong> an API key for the embedding provider. OpenClaw<br \/>\nresolves keys from auth profiles, <code>models.providers.*.apiKey<\/code>, or environment<br \/>\nvariables. Codex OAuth only covers chat\/completions and does <strong>not<\/strong> satisfy<br \/>\nembeddings for memory search. For Gemini, use <code>GEMINI_API_KEY<\/code> or<br \/>\n<code>models.providers.google.apiKey<\/code>. For Voyage, use <code>VOYAGE_API_KEY<\/code> or<br \/>\n<code>models.providers.voyage.apiKey<\/code>. For Mistral, use <code>MISTRAL_API_KEY<\/code> or<br \/>\n<code>models.providers.mistral.apiKey<\/code>. Ollama typically does not require a real API<br \/>\nkey (a placeholder like <code>OLLAMA_API_KEY=ollama-local<\/code> is enough when needed by<br \/>\nlocal policy).<br \/>\nWhen using a custom OpenAI-compatible endpoint,<br \/>\nset <code>memorySearch.remote.apiKey<\/code> (and optional <code>memorySearch.remote.headers<\/code>).<\/p>\n<h2>QMD backend (experimental)<\/h2>\n<p>Set <code>memory.backend = \"qmd\"<\/code> to swap the built-in SQLite indexer for<br \/>\n<a href=\"https:\/\/github.com\/tobi\/qmd\">QMD<\/a>: a local-first search sidecar that combines<br \/>\nBM25 + vectors + reranking. Markdown stays the source of truth; OpenClaw shells<br \/>\nout to QMD for retrieval. Key points:<\/p>\n<h3>Prerequisites<\/h3>\n<ul>\n<li>Disabled by default. 
## QMD backend (experimental)

Set `memory.backend = "qmd"` to swap the built-in SQLite indexer for
[QMD](https://github.com/tobi/qmd): a local-first search sidecar that combines
BM25 + vectors + reranking. Markdown stays the source of truth; OpenClaw shells
out to QMD for retrieval. Key points:

### Prerequisites

* Disabled by default. Opt in per config (`memory.backend = "qmd"`).
* Install the QMD CLI separately (`bun install -g https://github.com/tobi/qmd` or grab
  a release) and make sure the `qmd` binary is on the gateway's `PATH`.
* QMD needs an SQLite build that allows extensions (`brew install sqlite` on
  macOS).
* QMD runs fully locally via Bun + `node-llama-cpp` and auto-downloads GGUF
  models from HuggingFace on first use (no separate Ollama daemon required).
* The gateway runs QMD in a self-contained XDG home under
  `~/.openclaw/agents/<agentId>/qmd/` by setting `XDG_CONFIG_HOME` and
  `XDG_CACHE_HOME`.
* OS support: macOS and Linux work out of the box once Bun + SQLite are
  installed. Windows is best supported via WSL2.
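A minimal sketch of the opt-in, using only settings documented below (the `command` override is optional and the path shown is a placeholder; a fuller example follows later in this section):

```json5
memory: {
  backend: "qmd",
  qmd: {
    command: "/usr/local/bin/qmd"  // placeholder; omit when `qmd` is already on the gateway's PATH
  }
}
```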
### How the sidecar runs

* The gateway writes a self-contained QMD home under
  `~/.openclaw/agents/<agentId>/qmd/` (config + cache + sqlite DB).
* Collections are created via `qmd collection add` from `memory.qmd.paths`
  (plus default workspace memory files), then `qmd update` + `qmd embed` run
  on boot and on a configurable interval (`memory.qmd.update.interval`,
  default 5m).
* The gateway now initializes the QMD manager on startup, so periodic update
  timers are armed even before the first `memory_search` call.
* Boot refresh now runs in the background by default so chat startup is not
  blocked; set `memory.qmd.update.waitForBootSync = true` to keep the previous
  blocking behavior.
* Searches run via `memory.qmd.searchMode` (default `qmd search --json`; also
  supports `vsearch` and `query`). If the selected mode rejects flags on your
  QMD build, OpenClaw retries with `qmd query`. If QMD fails or the binary is
  missing, OpenClaw automatically falls back to the builtin SQLite manager so
  memory tools keep working.
* OpenClaw does not expose QMD embed batch-size tuning today; batch behavior is
  controlled by QMD itself.
* **First search may be slow**: QMD may download local GGUF models (reranker/query
  expansion) on the first `qmd query` run.
* OpenClaw sets `XDG_CONFIG_HOME`/`XDG_CACHE_HOME` automatically when it runs QMD.
* If you want to pre-download models manually (and warm the same index OpenClaw
  uses), run a one-off query with the agent's XDG dirs.

  OpenClaw's QMD state lives under your **state dir** (defaults to `~/.openclaw`).
  You can point `qmd` at the exact same index by exporting the same XDG vars
  OpenClaw uses:

  ```bash
  # Pick the same state dir OpenClaw uses
  STATE_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}"

  export XDG_CONFIG_HOME="$STATE_DIR/agents/main/qmd/xdg-config"
  export XDG_CACHE_HOME="$STATE_DIR/agents/main/qmd/xdg-cache"

  # (Optional) force an index refresh + embeddings
  qmd update
  qmd embed

  # Warm up / trigger first-time model downloads
  qmd query "test" -c memory-root --json >/dev/null 2>&1
  ```
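For instance, a sketch (using only settings named above) that forces the most compatible search mode and keeps the old blocking boot refresh; the interval value is illustrative:

```json5
memory: {
  qmd: {
    searchMode: "query",       // retried mode; "search" (default) and "vsearch" also supported
    update: {
      interval: "15m",         // refresh cadence (default 5m); illustrative value
      waitForBootSync: true    // block chat startup until the boot refresh completes
    }
  }
}
```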
### Config surface (`memory.qmd.*`)

* `command` (default `qmd`): override the executable path.
* `searchMode` (default `search`): pick which QMD command backs
  `memory_search` (`search`, `vsearch`, `query`).
* `includeDefaultMemory` (default `true`): auto-index `MEMORY.md` + `memory/**/*.md`.
* `paths[]`: add extra directories/files (`path`, optional `pattern`, optional
  stable `name`).
* `sessions`: opt into session JSONL indexing (`enabled`, `retentionDays`,
  `exportDir`).
* `update`: controls refresh cadence and maintenance execution
  (`interval`, `debounceMs`, `onBoot`, `waitForBootSync`, `embedInterval`,
  `commandTimeoutMs`, `updateTimeoutMs`, `embedTimeoutMs`).
* `limits`: clamp recall payload (`maxResults`, `maxSnippetChars`,
  `maxInjectedChars`, `timeoutMs`).
* `scope`: same schema as [`session.sendPolicy`](/gateway/configuration-reference#session).
  Default is DM-only (`deny` all, `allow` direct chats); loosen it to surface QMD
  hits in groups/channels.
* `match.keyPrefix` matches the **normalized** session key (lowercased, with any
  leading `agent:<id>:` stripped). Example: `discord:channel:`.
* `match.rawKeyPrefix` matches the **raw** session key (lowercased), including
  `agent:<id>:`. Example: `agent:main:discord:`.
* Legacy: `match.keyPrefix: "agent:..."` is still treated as a raw-key prefix,
  but prefer `rawKeyPrefix` for clarity.
* When `scope` denies a search, OpenClaw logs a warning with the derived
  `channel`/`chatType` so empty results are easier to debug.
* Snippets sourced outside the workspace show up as
  `qmd/<collection>/<relative-path>` in `memory_search` results; `memory_get`
  understands that prefix and reads from the configured QMD collection root.
* When `memory.qmd.sessions.enabled = true`, OpenClaw exports sanitized session
  transcripts (User/Assistant turns) into a dedicated QMD collection under
  `~/.openclaw/agents/<id>/qmd/sessions/`, so `memory_search` can recall recent
  conversations without touching the builtin SQLite index.
* `memory_search` snippets now include a `Source: <path#line>` footer when
  `memory.citations` is `auto`/`on`; set `memory.citations = "off"` to keep
  the path metadata internal (the agent still receives the path for
  `memory_get`, but the snippet text omits the footer and the system prompt
  warns the agent not to cite it).

### QMD example

```json5
memory: {
  backend: "qmd",
  citations: "auto",
  qmd: {
    includeDefaultMemory: true,
    update: { interval: "5m", debounceMs: 15000 },
    limits: { maxResults: 6, timeoutMs: 4000 },
    scope: {
      default: "deny",
      rules: [
        { action: "allow", match: { chatType: "direct" } },
        // Normalized session-key prefix (strips `agent:<id>:`).
        { action: "deny", match: { keyPrefix: "discord:channel:" } },
        // Raw session-key prefix (includes `agent:<id>:`).
        { action: "deny", match: { rawKeyPrefix: "agent:main:discord:" } },
      ]
    },
    paths: [
      { name: "docs", path: "~/notes", pattern: "**/*.md" }
    ]
  }
}
```
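The example above omits session export. A sketch of the `sessions` knobs from the config-surface list (the values are placeholders, not defaults from the codebase):

```json5
memory: {
  qmd: {
    sessions: {
      enabled: true,                      // export sanitized transcripts into a QMD collection
      retentionDays: 30,                  // placeholder value
      exportDir: "~/qmd-session-exports"  // placeholder path
    }
  }
}
```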
### Citations and fallback

* `memory.citations` applies regardless of backend (`auto`/`on`/`off`).
* When `qmd` runs, we tag `status().backend = "qmd"` so diagnostics show which
  engine served the results. If the QMD subprocess exits or its JSON output can't be
  parsed, the search manager logs a warning and falls back to the builtin provider
  (existing Markdown embeddings) until QMD recovers.
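For example, to keep the path metadata internal (per the `memory.citations` values above), a one-line sketch:

```json5
memory: {
  citations: "off"  // snippet text omits the Source footer; the agent still receives paths for memory_get
}
```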
## Additional memory paths

If you want to index Markdown files outside the default workspace layout, add
explicit paths:

```json5
agents: {
  defaults: {
    memorySearch: {
      extraPaths: ["../team-docs", "/srv/shared-notes/overview.md"]
    }
  }
}
```

Notes:

* Paths can be absolute or workspace-relative.
* Directories are scanned recursively for `.md` files.
* By default, only Markdown files are indexed.
* If `memorySearch.multimodal.enabled = true`, OpenClaw also indexes supported image/audio files under `extraPaths` only. Default memory roots (`MEMORY.md`, `memory.md`, `memory/**/*.md`) stay Markdown-only.
* Symlinks are ignored (files or directories).

## Multimodal memory files (Gemini image + audio)

OpenClaw can index image and audio files from `memorySearch.extraPaths` when using Gemini Embedding 2:

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "gemini",
      model: "gemini-embedding-2-preview",
      extraPaths: ["assets/reference", "voice-notes"],
      multimodal: {
        enabled: true,
        modalities: ["image", "audio"], // or ["all"]
        maxFileBytes: 10000000
      },
      remote: {
        apiKey: "YOUR_GEMINI_API_KEY"
      }
    }
  }
}
```

Notes:

* Multimodal memory is currently supported only for `gemini-embedding-2-preview`.
* Multimodal indexing applies only to files discovered through `memorySearch.extraPaths`.
* Supported modalities in this phase: image and audio.
* `memorySearch.fallback` must stay `"none"` while multimodal memory is enabled.
* Matching image/audio file bytes are uploaded to the configured Gemini embedding endpoint during indexing.
* Supported image extensions: `.jpg`, `.jpeg`, `.png`, `.webp`, `.gif`, `.heic`, `.heif`.
* Supported audio extensions: `.mp3`, `.wav`, `.ogg`, `.opus`, `.m4a`, `.aac`, `.flac`.
* Search queries remain text, but Gemini can compare those text queries against indexed image/audio embeddings.
* `memory_get` still reads Markdown only; binary files are searchable but not returned as raw file contents.

## Gemini embeddings (native)

Set the provider to `gemini` to use the Gemini embeddings API directly:

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "gemini",
      model: "gemini-embedding-001",
      remote: {
        apiKey: "YOUR_GEMINI_API_KEY"
      }
    }
  }
}
```

Notes:

* `remote.baseUrl` is optional (defaults to the Gemini API base URL).
* `remote.headers` lets you add extra headers if needed.
* Default model: `gemini-embedding-001`.
* `gemini-embedding-2-preview` is also supported: 8192 token limit and configurable dimensions (768 / 1536 / 3072, default 3072).

### Gemini Embedding 2 (preview)

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "gemini",
      model: "gemini-embedding-2-preview",
      outputDimensionality: 3072,  // optional: 768, 1536, or 3072 (default)
      remote: {
        apiKey: "YOUR_GEMINI_API_KEY"
      }
    }
  }
}
```

> **Re-index required:** Switching from `gemini-embedding-001` (768 dimensions)
> to `gemini-embedding-2-preview` (3072 dimensions) changes the vector size. The same is true if you
> change `outputDimensionality` between 768, 1536, and 3072.
> OpenClaw will automatically reindex when it detects a model or dimension change.

## Custom OpenAI-compatible endpoint

If you want to use a custom OpenAI-compatible endpoint (OpenRouter, vLLM, or a proxy),
you can use the `remote` configuration with the OpenAI provider:

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://api.example.com/v1/",
        apiKey: "YOUR_OPENAI_COMPAT_API_KEY",
        headers: { "X-Custom-Header": "value" }
      }
    }
  }
}
```

If you don't want to set an API key, use `memorySearch.provider = "local"` or set
`memorySearch.fallback = "none"`.

### Fallbacks

* `memorySearch.fallback` can be `openai`, `gemini`, `voyage`, `mistral`, `ollama`, `local`, or `none`.
* The fallback provider is only used when the primary embedding provider fails.
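A minimal sketch of a remote-to-remote fallback chain, using values from the list above:

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      fallback: "gemini"  // consulted only when OpenAI embedding calls fail
    }
  }
}
```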
### Batch indexing (OpenAI + Gemini + Voyage)

* Disabled by default. Set `agents.defaults.memorySearch.remote.batch.enabled = true` to enable for large-corpus indexing (OpenAI, Gemini, and Voyage).
* Default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed.
* Set `remote.batch.concurrency` to control how many batch jobs we submit in parallel (default: 2).
* Batch mode applies when `memorySearch.provider = "openai"`, `"gemini"`, or `"voyage"` and uses the corresponding API key.
* Gemini batch jobs use the async embeddings batch endpoint and require Gemini Batch API availability.

Why OpenAI batch is fast and cheap:

* For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously.
* OpenAI offers discounted pricing for Batch API workloads, so large indexing runs are usually cheaper than sending the same requests synchronously.
* See the OpenAI Batch API docs and pricing for details:
  [https://platform.openai.com/docs/api-reference/batch](https://platform.openai.com/docs/api-reference/batch) and
  [https://platform.openai.com/pricing](https://platform.openai.com/pricing).

Config example:

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      fallback: "openai",
      remote: {
        batch: { enabled: true, concurrency: 2 }
      },
      sync: { watch: true }
    }
  }
}
```

## How the memory tools work

* `memory_search` semantically searches Markdown chunks (~400 token target, 80-token overlap) from `MEMORY.md` + `memory/**/*.md`. It returns snippet text (capped ~700 chars), file path, line range, score, provider/model, and whether we fell back from local to remote embeddings. No full file payload is returned.
* `memory_get` reads a specific memory Markdown file (workspace-relative), optionally from a starting line and for N lines. Paths outside `MEMORY.md` / `memory/` are rejected.
* Both tools are enabled only when `memorySearch.enabled` resolves true for the agent.

## What gets indexed (and when)

* File type: Markdown only (`MEMORY.md`, `memory/**/*.md`).
* Index storage: per-agent SQLite at `~/.openclaw/memory/<agentId>.sqlite` (configurable via `agents.defaults.memorySearch.store.path`, supports the `{agentId}` token).
* Freshness: a watcher on `MEMORY.md` + `memory/` marks the index dirty (debounce 1.5s). Sync is scheduled on session start, on search, or on an interval and runs asynchronously. Session transcripts use delta thresholds to trigger background sync.
* Reindex triggers: the index stores the embedding **provider/model + endpoint fingerprint + chunking params**. If any of those change, OpenClaw automatically resets and reindexes the entire store.
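If you need to relocate the index, a sketch using the `store.path` setting and `{agentId}` token noted above (the directory is a placeholder):

```json5
agents: {
  defaults: {
    memorySearch: {
      store: {
        // {agentId} is substituted per agent; the directory is a placeholder.
        path: "/data/openclaw-memory/{agentId}.sqlite"
      }
    }
  }
}
```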
## Hybrid search (BM25 + vector)

When enabled, OpenClaw combines:

* **Vector similarity** (semantic match, wording can differ)
* **BM25 keyword relevance** (exact tokens like IDs, env vars, code symbols)

If full-text search is unavailable on your platform, OpenClaw falls back to vector-only search.

### Why hybrid

Vector search is great at "this means the same thing":

* "Mac Studio gateway host" vs "the machine running the gateway"
* "debounce file updates" vs "avoid indexing on every write"

But it can be weak at exact, high-signal tokens:

* IDs (`a828e60`, `b3b9895a...`)
* code symbols (`memorySearch.query.hybrid`)
* error strings ("sqlite-vec unavailable")

BM25 (full-text) is the opposite: strong at exact tokens, weaker at paraphrases.
Hybrid search is the pragmatic middle ground: **use both retrieval signals** so you get
good results for both "natural language" queries and "needle in a haystack" queries.

### How we merge results (the current design)

Implementation sketch:

1. Retrieve a candidate pool from both sides:
   * **Vector**: top `maxResults * candidateMultiplier` by cosine similarity.
   * **BM25**: top `maxResults * candidateMultiplier` by FTS5 BM25 rank (lower is better).
2. Convert BM25 rank into a 0..1-ish score:
   * `textScore = 1 / (1 + max(0, bm25Rank))`
3. Union candidates by chunk id and compute a weighted score:
   * `finalScore = vectorWeight * vectorScore + textWeight * textScore`

Notes:

* `vectorWeight` + `textWeight` is normalized to 1.0 in config resolution, so weights behave as percentages.
* If embeddings are unavailable (or the provider returns a zero-vector), we still run BM25 and return keyword matches.
* If FTS5 can't be created, we keep vector-only search (no hard failure).

This isn't "IR-theory perfect", but it's simple, fast, and tends to improve recall/precision on real notes.
If we want to get fancier later, common next steps are Reciprocal Rank Fusion (RRF) or score normalization
(min/max or z-score) before mixing.
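A worked example of the formulas above, assuming weights `vectorWeight = 0.7` / `textWeight = 0.3` (as in the configuration example later on this page) and a chunk with `vectorScore = 0.8` at BM25 rank 2 (illustrative numbers, not values from the codebase):

```
textScore  = 1 / (1 + max(0, 2))      = 0.333
finalScore = 0.7 * 0.8 + 0.3 * 0.333  = 0.56 + 0.10 = 0.66
```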
### Post-processing pipeline

After merging vector and keyword scores, two optional post-processing stages
refine the result list before it reaches the agent:

```
Vector + Keyword -> Weighted Merge -> Temporal Decay -> Sort -> MMR -> Top-K Results
```

Both stages are **off by default** and can be enabled independently.

### MMR re-ranking (diversity)

When hybrid search returns results, multiple chunks may contain similar or overlapping content.
For example, searching for "home network setup" might return five nearly identical snippets
from different daily notes that all mention the same router configuration.

**MMR (Maximal Marginal Relevance)** re-ranks the results to balance relevance with diversity,
ensuring the top results cover different aspects of the query instead of repeating the same information.

How it works:

1. Results are scored by their original relevance (vector + BM25 weighted score).
2. MMR iteratively selects results that maximize: `lambda * relevance - (1 - lambda) * max_similarity_to_selected`.
3. Similarity between results is measured using Jaccard text similarity on tokenized content.

The `lambda` parameter controls the trade-off:

* `lambda = 1.0` -- pure relevance (no diversity penalty)
* `lambda = 0.0` -- maximum diversity (ignores relevance)
* Default: `0.7` (balanced, slight relevance bias)

**Example -- query: "home network setup"**

Given these memory files:

```
memory/2026-02-10.md  -> "Configured Omada router, set VLAN 10 for IoT devices"
memory/2026-02-08.md  -> "Configured Omada router, moved IoT to VLAN 10"
memory/2026-02-05.md  -> "Set up AdGuard DNS on 192.168.10.2"
memory/network.md     -> "Router: Omada ER605, AdGuard: 192.168.10.2, VLAN 10: IoT"
```

Without MMR -- top 3 results:

```
1. memory/2026-02-10.md  (score: 0.92)  <- router + VLAN
2. memory/2026-02-08.md  (score: 0.89)  <- router + VLAN (near-duplicate!)
3. memory/network.md     (score: 0.85)  <- reference doc
```

With MMR (lambda=0.7) -- top 3 results:

```
1. memory/2026-02-10.md  (score: 0.92)  <- router + VLAN
2. memory/network.md     (score: 0.85)  <- reference doc (diverse!)
3. memory/2026-02-05.md  (score: 0.78)  <- AdGuard DNS (diverse!)
```

The near-duplicate from Feb 8 drops out, and the agent gets three distinct pieces of information.
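The arithmetic behind that swap, with `lambda = 0.7` and assumed Jaccard similarities of 0.9 (near-duplicate) and 0.3 (reference doc) to the already-selected Feb 10 note (the similarity values are illustrative):

```
memory/2026-02-08.md:  0.7 * 0.89 - 0.3 * 0.9 = 0.623 - 0.270 = 0.353
memory/network.md:     0.7 * 0.85 - 0.3 * 0.3 = 0.595 - 0.090 = 0.505   <- selected next
```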
**When to enable:** If you notice `memory_search` returning redundant or near-duplicate snippets,
especially with daily notes that often repeat similar information across days.

### Temporal decay (recency boost)

Agents with daily notes accumulate hundreds of dated files over time. Without decay,
a well-worded note from six months ago can outrank yesterday's update on the same topic.

**Temporal decay** applies an exponential multiplier to scores based on the age of each result,
so recent memories naturally rank higher while old ones fade:

```
decayedScore = score * e^(-lambda * ageInDays)
```

where `lambda = ln(2) / halfLifeDays`.

With the default half-life of 30 days:

* Today's notes: **100%** of original score
* 7 days ago: **~85%**
* 30 days ago: **50%**
* 90 days ago: **12.5%**
* 180 days ago: **~1.6%**

**Evergreen files are never decayed:**

* `MEMORY.md` (root memory file)
* Non-dated files in `memory/` (e.g., `memory/projects.md`, `memory/network.md`)
* These contain durable reference information that should always rank normally.

**Dated daily files** (`memory/YYYY-MM-DD.md`) use the date extracted from the filename.
Other sources (e.g., session transcripts) fall back to file modification time (`mtime`).

**Example -- query: "what's Rod's work schedule?"**

Given these memory files (today is Feb 10):

```
memory/2025-09-15.md  -> "Rod works Mon-Fri, standup at 10am, pairing at 2pm"  (148 days old)
memory/2026-02-10.md  -> "Rod has standup at 14:15, 1:1 with Zeb at 14:45"     (today)
memory/2026-02-03.md  -> "Rod started new team, standup moved to 14:15"        (7 days old)
```

Without decay:

```
1. memory/2025-09-15.md  (score: 0.91)  <- best semantic match, but stale!
2. memory/2026-02-10.md  (score: 0.82)
3. memory/2026-02-03.md  (score: 0.80)
```

With decay (halfLife=30):

```
1. memory/2026-02-10.md  (score: 0.82 * 1.00 = 0.82)  <- today, no decay
2. memory/2026-02-03.md  (score: 0.80 * 0.85 = 0.68)  <- 7 days, mild decay
3. memory/2025-09-15.md  (score: 0.91 * 0.03 = 0.03)  <- 148 days, nearly gone
```

The stale September note drops to the bottom despite having the best raw semantic match.

**When to enable:** If your agent has months of daily notes and you find that old,
stale information outranks recent context.
A half-life of 30 days works well for
daily-note-heavy workflows; increase it (e.g., 90 days) if you reference older notes frequently.

### Hybrid search configuration

Both features are configured under `memorySearch.query.hybrid`:

```json5
agents: {
  defaults: {
    memorySearch: {
      query: {
        hybrid: {
          enabled: true,
          vectorWeight: 0.7,
          textWeight: 0.3,
          candidateMultiplier: 4,
          // Diversity: reduce redundant results
          mmr: {
            enabled: true,    // default: false
            lambda: 0.7       // 0 = max diversity, 1 = max relevance
          },
          // Recency: boost newer memories
          temporalDecay: {
            enabled: true,    // default: false
            halfLifeDays: 30  // score halves every 30 days
          }
        }
      }
    }
  }
}
```

You can enable either feature independently:

* **MMR only** -- useful when you have many similar notes but age doesn't matter.
* **Temporal decay only** -- useful when recency matters but your results are already diverse.
* **Both** -- recommended for agents with large, long-running daily note histories.
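For example, a decay-only sketch for workflows that lean on older notes, per the half-life guidance above (MMR stays at its off default):

```json5
agents: {
  defaults: {
    memorySearch: {
      query: {
        hybrid: {
          enabled: true,
          temporalDecay: {
            enabled: true,
            halfLifeDays: 90  // longer half-life when old notes are referenced frequently
          }
        }
      }
    }
  }
}
```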
## Embedding cache

OpenClaw can cache **chunk embeddings** in SQLite so reindexing and frequent updates (especially session transcripts) don't re-embed unchanged text.

Config:

```json5
agents: {
  defaults: {
    memorySearch: {
      cache: {
        enabled: true,
        maxEntries: 50000
      }
    }
  }
}
```

## Session memory search (experimental)

You can optionally index **session transcripts** and surface them via `memory_search`.
This is gated behind an experimental flag.

```json5
agents: {
  defaults: {
    memorySearch: {
      experimental: { sessionMemory: true },
      sources: ["memory", "sessions"]
    }
  }
}
```

Notes:

* Session indexing is **opt-in** (off by default).
* Session updates are debounced and **indexed asynchronously** once they cross delta thresholds (best-effort).
* `memory_search` never blocks on indexing; results can be slightly stale until background sync finishes.
* Results still include snippets only; `memory_get` remains limited to memory files.
* Session indexing is isolated per agent (only that agent's session logs are indexed).
* Session logs live on disk (`~/.openclaw/agents/<agentId>/sessions/*.jsonl`). Any process/user with filesystem access can read them, so treat disk access as the trust boundary. For stricter isolation, run agents under separate OS users or hosts.

Delta thresholds (defaults shown):

```json5
agents: {
  defaults: {
    memorySearch: {
      sync: {
        sessions: {
          deltaBytes: 100000,   // ~100 KB
          deltaMessages: 50     // JSONL lines
        }
      }
    }
  }
}
```

## SQLite vector acceleration (sqlite-vec)

When the sqlite-vec extension is available, OpenClaw stores embeddings in a
SQLite virtual table (`vec0`) and performs vector distance queries in the
database. This keeps search fast without loading every embedding into JS.

Configuration (optional):

```json5
agents: {
  defaults: {
    memorySearch: {
      store: {
        vector: {
          enabled: true,
          extensionPath: "/path/to/sqlite-vec"
        }
      }
    }
  }
}
```

Notes:

* `enabled` defaults to true; when disabled, search falls back to in-process
  cosine similarity over stored embeddings.
* If the sqlite-vec extension is missing or fails to load, OpenClaw logs the
  error and continues with the JS fallback (no vector table).
* `extensionPath` overrides the bundled sqlite-vec path (useful for custom builds
  or non-standard install locations).

## Local embedding auto-download

* Default local embedding model: `hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf` (~0.6 GB).
* When `memorySearch.provider = "local"`, `node-llama-cpp` resolves `modelPath`; if the GGUF is missing it **auto-downloads** to the cache (or `local.modelCacheDir` if set), then loads it. Downloads resume on retry.
* Native build requirement: run `pnpm approve-builds`, pick `node-llama-cpp`, then `pnpm rebuild node-llama-cpp`.
* Fallback: if local setup fails and `memorySearch.fallback = "openai"`, we automatically switch to remote embeddings (`openai/text-embedding-3-small` unless overridden) and record the reason.
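A sketch of a local setup that pins the download cache and keeps the documented OpenAI fallback (the cache path is a placeholder):

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "local",
      local: {
        // GGUF auto-downloads here on first use; downloads resume on retry.
        modelCacheDir: "~/.cache/openclaw-models"  // placeholder path
      },
      fallback: "openai"  // switch to remote embeddings if the native build fails
    }
  }
}
```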
## Custom OpenAI-compatible endpoint example

```json5
agents: {
  defaults: {
    memorySearch: {
      provider: "openai",
      model: "text-embedding-3-small",
      remote: {
        baseUrl: "https://api.example.com/v1/",
        apiKey: "YOUR_REMOTE_API_KEY",
        headers: {
          "X-Organization": "org-id",
          "X-Project": "project-id"
        }
      }
    }
  }
}
```

Notes:

* `remote.*` takes precedence over `models.providers.openai.*`.
* `remote.headers` merge with OpenAI headers; remote wins on key conflicts. Omit `remote.headers` to use the OpenAI defaults.