Embeddings generated locally with Xenova Transformers (ONNX) or llama.cpp (GGUF), or fetched from a remote embedding API, let AI clients recall memories by meaning, not just keywords.
Keyword search misses the memory you wrote three months ago using different words. Semantic recall finds it by meaning, which is how your AI tools think about retrieval anyway.
ONNX embedding models run locally via Xenova Transformers
GGUF embedding models run via llama.cpp for higher quality on capable machines
Remote embedding API option if you want to offload embedding computation from your machine
Semantic recall exposed to AI clients through the MCP `vault_search` tool
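As a rough illustration of what a connected client might send: MCP tool calls are JSON-RPC 2.0 requests that use the `tools/call` method. The sketch below builds such a request for `vault_search`. The argument names `query` and `limit` are assumptions made for illustration, not AIVault's documented input schema.

```python
import json


def build_vault_search_request(query: str, limit: int = 5, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request for the `vault_search` tool.

    The `query` and `limit` argument names are hypothetical; the tool's
    published input schema is the authority on the real ones.
    """
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "vault_search",
            "arguments": {"query": query, "limit": limit},
        },
    }
    return json.dumps(request)


# Example: a natural-language query rather than a keyword list.
print(build_vault_search_request("notes about migrating the billing database"))
```

In practice an MCP client library would construct and send this message for you; the point is only that the query travels as plain natural-language text.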
Open search in the app or have a connected AI client query the vault.
Enter a natural-language description of the memory you want, not just keywords.
Review results ranked by conceptual similarity and open the closest match.
Rephrase to steer toward a different angle of the same topic when needed.
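Ranking by conceptual similarity is commonly done by comparing embedding vectors with cosine similarity. The sketch below assumes that approach (the page does not say which metric AIVault uses) and uses made-up three-dimensional vectors in place of real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, entries):
    """Return (score, title) pairs sorted from most to least similar."""
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in entries]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy vectors, invented for illustration; real embeddings have hundreds of dimensions.
entries = [
    ("Postgres upgrade checklist", [0.9, 0.1, 0.0]),
    ("Team offsite ideas", [0.0, 0.2, 0.9]),
    ("Database migration retro", [0.8, 0.3, 0.1]),
]
query_vec = [1.0, 0.2, 0.0]  # stands in for the embedded query text

for score, title in rank_by_similarity(query_vec, entries):
    print(f"{score:.3f}  {title}")
```

Rephrasing a query changes its vector, which is why a different wording can surface a different angle of the same topic.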
A GPU is not required. The default ONNX model runs on CPU; GGUF models can use Metal on macOS or CUDA on Linux if you've set up llama.cpp accordingly.
When you switch embedding models, AIVault triggers a background backfill that re-embeds every entry with the new model. The vault stays usable while it runs; search results draw on a mix of old and new embeddings until the backfill completes.
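A backfill of this kind can be pictured as a loop that re-embeds stale entries in small batches, so other work can proceed between batches. This is a minimal sketch under stated assumptions, not AIVault's implementation: the `Entry` fields, the `backfill` function, the batch size, and the stand-in `fake_embed` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List


@dataclass
class Entry:
    entry_id: str
    text: str
    vector: List[float]
    model: str  # name of the model that produced `vector`


def backfill(
    entries: List[Entry],
    new_model: str,
    embed: Callable[[str], List[float]],
    batch_size: int = 2,
) -> Iterator[float]:
    """Re-embed entries not yet on `new_model`, yielding progress in [0, 1] per batch."""
    stale = [e for e in entries if e.model != new_model]
    total = len(stale)
    if total == 0:
        yield 1.0
        return
    for start in range(0, total, batch_size):
        for entry in stale[start:start + batch_size]:
            entry.vector = embed(entry.text)
            entry.model = new_model
        yield min(start + batch_size, total) / total


def fake_embed(text: str) -> List[float]:
    """Stand-in for a real embedding model (hypothetical)."""
    return [float(len(text)), float(text.count(" "))]


entries = [Entry(str(i), f"memory number {i}", [0.0], "old-model") for i in range(5)]
for progress in backfill(entries, "new-model", fake_embed):
    print(f"backfill {progress:.0%} complete")
```

Between batches, some entries carry new-model vectors and others still carry old-model vectors, which is the mixed state described above.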
Dashboard view shows a vault summary plus a live activity feed of every read, write, and classification across all connected AI tools. Deep links via `aivault://` open the app from anywhere.
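For readers wiring up their own links, a custom-scheme URL such as `aivault://` can be taken apart with standard URL parsing. The link shape used below (`aivault://entry/<id>?tab=...`) is invented for illustration; the page only states that the `aivault://` scheme exists.

```python
from urllib.parse import parse_qs, urlparse


def parse_deep_link(url: str) -> dict:
    """Split an aivault:// link into target, path, and query parameters.

    The target/path/params layout is a hypothetical example, not a
    documented AIVault link format.
    """
    parts = urlparse(url)
    if parts.scheme != "aivault":
        raise ValueError(f"not an aivault:// link: {url!r}")
    return {
        "target": parts.netloc,
        "path": parts.path.lstrip("/"),
        "params": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


print(parse_deep_link("aivault://entry/abc123?tab=history"))
```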
Every entry detail header shows 'Last used X ago by Source · Nx this week' so you know which AI tools are touching a memory and how often.
Text search uses ranked matching so exact terms, titles, sources, and memory bodies are still easy to audit.
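One simple way to picture ranked matching is a weighted count of exact term hits per field. The sketch below assumes that scheme; the field weights and the `text_score` and `ranked_text_search` helpers are hypothetical and are not taken from AIVault.

```python
import re

# Hypothetical weights: a hit in the title counts more than one in the body.
FIELD_WEIGHTS = {"title": 3.0, "source": 2.0, "body": 1.0}


def tokenize(text: str) -> list:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def text_score(query: str, entry: dict) -> float:
    """Sum weighted exact-term hits across the title, source, and body fields."""
    terms = tokenize(query)
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = tokenize(entry.get(field, ""))
        score += weight * sum(words.count(term) for term in terms)
    return score


def ranked_text_search(query: str, entries: list) -> list:
    """Return entries with at least one hit, highest score first."""
    scored = [(text_score(query, entry), entry) for entry in entries]
    hits = [pair for pair in scored if pair[0] > 0]
    return [entry for _, entry in sorted(hits, key=lambda pair: pair[0], reverse=True)]


entries = [
    {"title": "Weekly notes", "source": "ChatGPT", "body": "Discussed the migration timeline."},
    {"title": "Billing migration plan", "source": "Claude", "body": "Steps for the migration."},
    {"title": "Lunch ideas", "source": "Claude", "body": "Tacos or ramen."},
]
for entry in ranked_text_search("migration", entries):
    print(entry["title"])
```

Because every point of the score traces back to a literal term in a named field, a result's position is easy to explain, which is the auditing benefit the text describes.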
Start free, import real conversations, and reuse your memory across every AI agent you already use.