Open source · Self-hosted · Free forever

Stop living the
same session twice

A self-hosted MCP server that gives Claude Code persistent memory across sessions, projects, and machines. Everything runs on your hardware, and it's free.

The problem

Every Claude Code session starts from zero. It doesn't know your project. It doesn't remember what failed last week. It has no idea you spent three hours last Tuesday figuring out why onnxruntime crashes on Alpine, only to find something that actually works.

So you explain everything again. Claude suggests the same broken library again. Same alarm, same song. It's Groundhog Day, you're Bill Murray, and Claude is Punxsutawney Phil.

$ claude
claude> use onnxruntime for the embeddings

... 45 minutes of debugging later ...

SIGILL: illegal instruction (Alpine musl libc)

You fixed this exact problem last Tuesday.
Claude doesn't know that.

$ claude # with OmniMem
claude> use onnxruntime for the embeddings

WARNING: previously abandoned approach
onnxruntime - SIGILL crash on Alpine musl libc (effort: 4/5)
Switched to sentence-transformers instead.

See it in action

A single recall() query pulls relevant context from everywhere OmniMem knows about. Personal preferences, ingested articles, past conversations, project context. It's all ranked together and merged into one useful response.
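Conceptually, the merge step looks something like this sketch: take ranked hits from each source and fold them into one list ordered by score. The function name, data shapes, and scores here are illustrative, not OmniMem's actual internals.

```python
# Hypothetical sketch of how recall() might merge hits from several
# memory sources into one ranked response. Shapes and names are
# assumptions for illustration only.

def merge_recall(hits_by_source: dict[str, list[tuple[str, float]]],
                 top_k: int = 3) -> list[tuple[str, str, float]]:
    """Flatten (text, score) hits from each source and rank by score."""
    merged = [
        (source, text, score)
        for source, hits in hits_by_source.items()
        for text, score in hits
    ]
    merged.sort(key=lambda hit: hit[2], reverse=True)
    return merged[:top_k]

hits = {
    "episodic":  [("Prefers Obsidian for notes", 0.81)],
    "knowledge": [("Article: 5 markdown editors compared", 0.74)],
    "project":   [("Current stack: Python + Docker", 0.42)],
}
print(merge_recall(hits, top_k=2))
```

The point is that no single source wins by default: an episodic preference can outrank a freshly ingested article, or vice versa, depending on relevance.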

[Screenshot: OmniMem recall in action, showing multi-source memory retrieval in Claude Code]

Real output from a recall query about markdown editors, annotated to show where each piece of knowledge came from.

Episodic memory

Picked up from your conversations. Your preferences, decisions, and things you mentioned in passing.

Knowledge base

Pulled from RSS articles that were auto-summarised and embedded. They surface when they're relevant to what you're asking.

External references

Gathered from links and posts you've shared. GitHub threads, blog posts, anything worth holding onto.

Project context

Drawn from what OmniMem knows matters to you, like your preferred tools and platforms.

What makes OmniMem different

This isn't just another key-value store with an MCP wrapper. OmniMem tries to model how memory actually works. Things fade over time, they sometimes contradict each other, and the hard-won stuff earns its place.

The Graveyard

Every dead end gets logged: what you tried, what type it was, why it failed, and how much time you burned on it. Before Claude suggests a library or pattern, it checks the graveyard first. You won't waste another afternoon on something you've already ruled out.

Experience scoring

Something that worked first time is handy. But something that took four attempts, two abandoned libraries, and a weird platform workaround to crack? That's gold. The harder it was to figure out, the more prominently it surfaces next time.

Contradiction detection

If a new memory disagrees with something already stored, OmniMem catches it and warns you. There's a fast check on every write, and an optional deeper analysis powered by Claude that you can run on demand. Conflicting memories get linked together so you can sort them out.
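A toy version of the fast tier might look like the sketch below, with a trivial keyword heuristic standing in for the real write-time check; the deep tier is where the optional Claude-powered analysis would run on demand. Everything here is illustrative.

```python
# Toy sketch of the fast, write-time tier of contradiction detection.
# The NEGATIONS table and first-word heuristic are made up for this
# example; OmniMem's actual check is not shown here.

NEGATIONS = {("use", "avoid"), ("avoid", "use")}

def fast_contradiction(new: str, old: str) -> bool:
    """Cheap heuristic: shared subject words plus opposing leading verbs."""
    n, o = new.lower().split(), old.lower().split()
    return bool(set(n) & set(o)) and (n[0], o[0]) in NEGATIONS

stored = "avoid onnxruntime on Alpine"
print(fast_contradiction("use onnxruntime for embeddings", stored))
```

A real system would do this semantically, but the shape is the same: a cheap gate on every write, a costly analysis only when you ask for it.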

Semantic deduplication

When you store something new, it gets compared against what's already there. If it's too similar to an existing memory, you'll get a heads-up instead of a duplicate. You can also run find_duplicates to scan everything and clean up in bulk.
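The write-time check amounts to an embedding-similarity comparison against stored memories. The sketch below uses tiny hand-made vectors and a made-up threshold; OmniMem uses real local embeddings and Valkey vector search.

```python
# Minimal write-time deduplication sketch: embed the new memory,
# compare against stored embeddings, flag anything above a
# similarity threshold. Vectors and threshold are illustrative.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def find_near_duplicate(new_vec, stored, threshold=0.92):
    """Return the text of the closest stored memory if it's too similar."""
    best = max(stored, key=lambda s: cosine(new_vec, s["vec"]), default=None)
    if best and cosine(new_vec, best["vec"]) >= threshold:
        return best["text"]
    return None

stored = [{"text": "Use sentence-transformers on Alpine",
           "vec": [0.9, 0.1, 0.4]}]
print(find_near_duplicate([0.89, 0.12, 0.41], stored))
```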

Three memory namespaces

Episodic for your decisions, bugs, and patterns. Project for your stack, goals, and current state. Knowledge for RSS articles auto-summarised by Claude Haiku. All three get searched together whenever you recall something.

One-call briefing

One call to briefing() and Claude gets everything it needs: project context, experience stats, stale memories, new articles, contradictions, and anything that might need reinstating. No more three-step warm-up at the start of every session.

Memory is not binary

Most memory systems either remember something or delete it. OmniMem has a proper lifecycle. When you say "forget about X" you usually mean stop bringing it up, not wipe it from existence.

ACTIVE         1.0x weight
DEPRIORITISED  0.2x weight
ARCHIVED       0.0x weight
DELETED        gone

Deprioritised memories aren't gone for good. You can attach reinstate hints, and if a future query matches one, the memory comes back with a note explaining why it was pushed down in the first place. You can also mute entire topics across all your sessions.
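The lifecycle and reinstate behaviour can be sketched as a weight lookup plus a hint match. The state names and weights come from the table above; the functions and the substring hint matching are assumptions for illustration.

```python
# Sketch of lifecycle weighting and reinstatement. Deprioritised
# memories are down-weighted, not deleted, and a matching reinstate
# hint flips one back to active. Function shapes are illustrative.

SURFACE_SCORE = {"active": 1.0, "deprioritised": 0.2, "archived": 0.0}

def effective_score(similarity: float, state: str) -> float:
    """Scale raw similarity by the memory's lifecycle weight."""
    return similarity * SURFACE_SCORE[state]

def maybe_reinstate(memory: dict, query: str) -> bool:
    """Reinstate a deprioritised memory if one of its hints matches."""
    if memory["state"] != "deprioritised":
        return False
    if any(hint in query.lower() for hint in memory.get("reinstate_hints", [])):
        memory["state"] = "active"
        return True
    return False

mem = {"text": "We dropped GraphQL for REST", "state": "deprioritised",
       "reinstate_hints": ["graphql"]}
print(effective_score(0.8, mem["state"]))  # heavily down-weighted while deprioritised
maybe_reinstate(mem, "why did we stop using GraphQL?")
print(mem["state"])
```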

score = similarity × surface_score × recency × experience_weight

Four factors decide what comes back. Semantic similarity on its own isn't enough. Lifecycle state, age, and how hard it was to figure out all play a role in the final ranking.

Effort  Meaning               Weight boost
1       Worked first time     1.0x
2       Minor friction        1.1x
3       Multiple iterations   1.25x
4       Significant struggle  1.5x
5       Battle-hardened       1.8x
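Putting the formula and the effort table together, here is a worked sketch of the ranking. The similarity and recency numbers are made up; the weights follow the lifecycle and effort tables on this page.

```python
# Worked example of: score = similarity x surface_score x recency
# x experience_weight. Input numbers are invented for illustration.

EXPERIENCE_WEIGHT = {1: 1.0, 2: 1.1, 3: 1.25, 4: 1.5, 5: 1.8}
SURFACE_SCORE = {"active": 1.0, "deprioritised": 0.2, "archived": 0.0}

def rank_score(similarity: float, state: str,
               recency: float, effort: int) -> float:
    return (similarity * SURFACE_SCORE[state]
            * recency * EXPERIENCE_WEIGHT[effort])

# A hard-won memory can outrank a slightly more similar easy one:
hard_won = rank_score(0.80, "active", recency=0.9, effort=4)
easy     = rank_score(0.85, "active", recency=0.9, effort=1)
print(hard_won > easy)
```

With these numbers the effort-4 memory scores about 1.08 against roughly 0.77 for the effort-1 memory, despite the lower raw similarity.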

How it compares

Claude Code's built-in memory uses flat files with no semantic search. Most third-party MCP memory servers just store and retrieve. OmniMem goes quite a bit further.

Capability                    Claude built-in   Typical MCP memory   OmniMem
Semantic vector search        No                Yes                  Yes
Memory lifecycle states       No                No                   Yes, 4 states
Abandoned approach warnings   No                No                   Yes, graveyard
Experience scoring            No                No                   Yes, effort 1-5
Contradiction detection       No                No                   Yes, 2-tier
Semantic deduplication        No                Partial              Yes, write + batch
Topic suppression             No                No                   Yes
RSS knowledge ingestion       No                No                   Yes, auto-summarised
Reinstate hints               No                No                   Yes
Self-hosted / no SaaS         Local files       Varies               Yes, Docker
Multi-machine sync            No                Varies               Yes, via proxy

Architecture

Three Docker containers, local embeddings, and nothing leaves your machine.

          Claude Code (any machine)
                    |
                    | SSE / MCP protocol
                    v
+-------------------------------------------+
|            OmniMem MCP Server             |
|       Python · FastMCP · Debian slim      |
|                                           |
|  remember   recall   deprioritise         |
|  record_experience   warn_if_abandoned    |
|  find_duplicates     check_contradictions |
|  briefing   dump_to_file   health         |
+---------+-----------------+---------------+
          |                 |
          v                 v
+-----------------+   +------------------+
|     Valkey      |   |    RSS Worker    |
|    + search     |<--|                  |
|                 |   |   feedparser     |
|  idx:episodic   |   |   APScheduler    |
|  idx:project    |   |   Claude Haiku   |
|  idx:knowledge  |   +------------------+
+-----------------+

Recall pipeline:
query → abandoned fast-path → embed → vector search
  → filter archived/deleted → filter suppressed topics
  → apply surface_score → apply recency decay
  → apply experience_weight → check reinstate eligibility
  → surface contradictions → merge, re-rank, return top_k
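The filtering and re-weighting stages of that pipeline can be sketched as a single pass over candidate memories. Field names and the suppressed-topics mechanism below are illustrative; the real server runs this against Valkey vector indices.

```python
# Illustrative sketch of the recall pipeline's filter and re-rank
# stages. All field names are assumptions, not OmniMem's schema.

def recall_pipeline(candidates, suppressed_topics, top_k=5):
    ranked = []
    for mem in candidates:
        if mem["state"] in ("archived", "deleted"):
            continue                      # filter archived/deleted
        if mem["topic"] in suppressed_topics:
            continue                      # filter suppressed topics
        score = (mem["similarity"]        # vector search score
                 * mem["surface_score"]   # lifecycle weight
                 * mem["recency"]         # decay with age
                 * mem["experience"])     # effort boost
        ranked.append((score, mem["text"]))
    ranked.sort(reverse=True)             # merge and re-rank
    return [text for _, text in ranked[:top_k]]

candidates = [
    {"text": "SIGILL fix: sentence-transformers", "state": "active",
     "topic": "embeddings", "similarity": 0.8, "surface_score": 1.0,
     "recency": 0.9, "experience": 1.5},
    {"text": "Old embedding notes", "state": "archived",
     "topic": "embeddings", "similarity": 0.9, "surface_score": 0.0,
     "recency": 1.0, "experience": 1.0},
]
print(recall_pipeline(candidates, suppressed_topics={"crypto"}))
```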

Up and running in two minutes

01

Clone and configure

Pick a strong Valkey password. If you want RSS summaries and smarter contradiction detection, add your Anthropic API key too.

git clone https://codeberg.org/ric_harvey/omnimem.git
cd omnimem
cp .env.example .env
# edit .env: set VALKEY_PASSWORD and ANTHROPIC_API_KEY
02

Start the containers

This spins up three containers: Valkey with vector search, the MCP server, and the RSS worker.

docker compose up -d
03

Connect Claude Code

Point Claude Code at OmniMem in your MCP config, then drop the included CLAUDE.md into your project.

// ~/.claude.json or .mcp.json
{
  "mcpServers": {
    "omnimem": {
      "type": "sse",
      "url": "http://localhost:8765/sse"
    }
  }
}
Pro tip

Add this to your ~/.claude/CLAUDE.md to tell Claude Code to use OmniMem for all memory instead of its built-in file-based system. This works globally across every project.

## Memory
Always use the omnimem MCP tools (remember, recall, briefing,
etc.) for all memory storage instead of the file-based memory
system. Check the memory before returning an answer to see if
we have anything relevant that may help. When brainstorming
new projects or ideas check the memory for ingested news
stories that may be of interest for us to investigate.

The repo also includes a more detailed claude_config/CLAUDE.md you can drop into individual projects. It teaches Claude when to call briefing(), how to record experience scores, when to check the graveyard, and how to handle deprioritisation.