// Home / Introduction page
/**
 * Home / Introduction page for the MemGC docs site.
 *
 * Purely presentational: renders the standard docs shell (Topbar, Sidebar,
 * TOC, search overlay, tweaks panel) around static introduction content.
 * All page state (theme, search, mobile menu, tweaks visibility) comes from
 * the shared `useShell()` hook — this component owns no state of its own.
 *
 * @returns {JSX.Element} the full page layout, mounted once at module load.
 */
function Home() {
  // Shared shell state/handlers (theme, search + mobile-menu open flags, tweaks panel).
  const shell = useShell();
  // Entries for the right-hand table of contents. `id` must match a heading's
  // `id` attribute below; `level: 3` renders the entry as a nested sub-item.
  const tocItems = [
    { id: 'what', label: 'What is MemGC?' },
    { id: 'who', label: 'Who is it for?' },
    { id: 'how', label: 'How it works' },
    { id: 'decay', label: 'Calibrated decay', level: 3 },
    { id: 'hybrid', label: 'Recall', level: 3 },
    { id: 'audit-link', label: 'Audit & lineage', level: 3 },
    { id: 'injection', label: 'Injection safety', level: 3 },
    { id: 'storage', label: 'Storage backends', level: 3 },
    { id: 'different', label: 'What makes it different?' },
    { id: 'start', label: 'Where to start' },
    { id: 'principles', label: 'Design principles' },
  ];

  return (
    <div className="app">
      {/* Global top navigation; opens the search overlay and (on mobile) the sidebar. */}
      <Topbar section="docs" theme={shell.theme} setTheme={shell.setTheme} onSearch={() => shell.setSearchOpen(true)} onMenuToggle={() => shell.setMobileMenuOpen(true)} />
      <div className="main">
        <Sidebar activeId="introduction" mobileOpen={shell.mobileMenuOpen} onMobileClose={() => shell.setMobileMenuOpen(false)} />
        <article className="content">
          {/* Breadcrumb trail. */}
          <div className="crumbs">
            <a href="index.html">Docs</a>
            <span className="sep">/</span>
            <span>Introduction</span>
          </div>

          {/* Hero: eyebrow, headline, and lede. */}
          <div className="eyebrow">Documentation · v0.0.2</div>
          <h1 className="h1">Self-hostable AI memory for <em>production agents.</em></h1>
          <p className="lede">
            Your agents remember things. Over time they remember too much, get confused, and cost a fortune.
            MemGC throws away the old stuff intelligently — so your agents stay sharp, fast, and cheap.
            Runs on your own server.
          </p>

          {/* Primary call-to-action row: quickstart, API reference, release badge. */}
          <div style={{display:'flex', gap:10, marginBottom:36, flexWrap:'wrap'}}>
            <a href="quickstart.html" className="btn-primary">
              <Icon name="rocket" size={13} stroke={2}/> Start in 2 minutes
            </a>
            <a href="api.html" className="btn-ghost" style={{border:'1px solid var(--line)'}}>
              <Icon name="code" size={13} stroke={2} style={{marginRight:6}}/> Read the API reference
            </a>
            <span className="tag-pill">
              <span style={{width:6,height:6,borderRadius:'50%',background:'oklch(0.62 0.12 150)'}}/>
              v0.0.2 live on PyPI
            </span>
          </div>

          <h2 id="what" className="h2">What is MemGC?</h2>
          <p>
            MemGC is a memory layer for AI agents that takes <em>forgetting</em> seriously. Most memory
            systems focus on what to remember — MemGC focuses on what to <strong>throw away</strong>,
            mathematically and predictably, so your agent's context never bloats into hallucination
            territory or runaway token bills.
          </p>
          <p>
            It's a Rust core with Python bindings, ships as a 4.8 MB wheel, and runs entirely on your
            own server. Your data never touches our infrastructure — there isn't any.
          </p>

          {/* Four-stage pipeline diagram: extract → retrieve → consolidate → sweep.
              Built from inline-styled boxes rather than a dedicated component;
              the consolidate() box is accent-highlighted. */}
          <h3 style={{marginTop:40, marginBottom:16, fontSize:15, fontWeight:600, color:'var(--ink-2)'}}>How a turn flows through the four core operations</h3>
          <figure style={{margin:'0 0 36px', padding:'32px 24px', background:'var(--bg-elev)', border:'1px solid var(--line)', borderRadius:12}}>
            <div style={{display:'grid', gridTemplateColumns:'1fr auto 1fr auto 1fr auto 1fr', gap:8, alignItems:'center', fontFamily:'var(--font-mono)', fontSize:11.5}}>
              <div style={{padding:'18px 12px', border:'1px solid var(--line)', borderRadius:8, background:'var(--bg)', textAlign:'center'}}>
                <div style={{fontSize:18, marginBottom:6}}>📥</div>
                <strong>extract()</strong>
                <div style={{marginTop:6, color:'var(--ink-3)', fontSize:10.5}}>turn → atomic Memory</div>
              </div>
              <div style={{color:'var(--ink-4)', fontSize:16, textAlign:'center'}}>→</div>
              <div style={{padding:'18px 12px', border:'1px solid var(--line)', borderRadius:8, background:'var(--bg)', textAlign:'center'}}>
                <div style={{fontSize:18, marginBottom:6}}>🔍</div>
                <strong>retrieve()</strong>
                <div style={{marginTop:6, color:'var(--ink-3)', fontSize:10.5}}>vector + BM25 hybrid</div>
              </div>
              <div style={{color:'var(--ink-4)', fontSize:16, textAlign:'center'}}>→</div>
              <div style={{padding:'18px 12px', border:'1px solid var(--accent)', borderRadius:8, background:'var(--accent-soft)', color:'var(--accent-ink)', textAlign:'center'}}>
                <div style={{fontSize:18, marginBottom:6}}>📜</div>
                <strong>consolidate()</strong>
                <div style={{marginTop:6, color:'var(--ink-3)', fontSize:10.5}}>8k log → 140-tok YAML</div>
              </div>
              <div style={{color:'var(--ink-4)', fontSize:16, textAlign:'center'}}>→</div>
              <div style={{padding:'18px 12px', border:'1px solid var(--line)', borderRadius:8, background:'var(--bg)', textAlign:'center'}}>
                <div style={{fontSize:18, marginBottom:6}}>🗑️</div>
                <strong>sweep()</strong>
                <div style={{marginTop:6, color:'var(--ink-3)', fontSize:10.5}}>score-based GC</div>
              </div>
            </div>
            <figcaption style={{textAlign:'center', marginTop:20, fontSize:12.5, color:'var(--ink-3)', fontStyle:'italic', fontFamily:'var(--font-serif)'}}>
              Extract distills, retrieve finds, consolidate compresses, sweep forgets — together they keep memory bounded.
            </figcaption>
          </figure>

          <h2 id="who" className="h2">Who is it for?</h2>
          <p>
            Engineering teams running AI agents in production. Anyone whose agent's context window
            keeps growing, whose token bill keeps climbing, or whose compliance review asks the
            question "show me what your agent knew on Tuesday at 3pm." If you've ever been told
            <em style={{color:'var(--ink-3)', fontStyle:'italic'}}> "just summarize the history"</em> and
            felt that's not a real answer — MemGC is for you.
          </p>

          <h2 id="how" className="h2">How it works</h2>
          <p>
            Five capabilities make MemGC different from a generic memory store:
          </p>

          <h3 id="decay" className="h3">Calibrated decay</h3>
          <p>
            Every memory has a six-component score that determines its survival on the next sweep:
            <strong> frequency</strong> (recall count), <strong>relevance</strong> (importance),{' '}
            <strong>diversity</strong> (distinct query contexts), <strong>recency</strong>{' '}
            (half-life decay), <strong>consolidation</strong> (spaced-repetition days), and{' '}
            <strong>conceptual</strong> (tag coverage). Weights sum to 1.0 by default; rows whose
            weighted score falls below the survival threshold get deleted on the next sweep. Math
            decides what stays — not an LLM judge.
          </p>

          <h3 id="hybrid" className="h3">Recall</h3>
          <p>
            <code>recall()</code> is the single retrieval entry point. The bound LLM rewrites the
            query into 2-3 angles, then for each rewrite memgc fans out vector cosine + BM25
            lexical concurrently via <code>tokio::join!</code>. The per-query result lists are
            fused via N-list RRF, blended with raw cosine for a final re-rank, and (optionally)
            spliced with the consolidated profile at rank 1. A row hit by either channel still
            ranks; an exact-keyword match rescues a semantic miss, and vice versa. Borrowed in
            spirit from Mem0's <code>utils/scoring.py</code>.
          </p>

          <h3 id="audit-link" className="h3">Audit & lineage</h3>
          <p>
            Every <code>Memory</code> row carries a <code>(status, version, lineage_id)</code>{' '}
            triple. Updates produce a new row with <code>version + 1</code> and the prior row flipped
            to <code>archived</code>; <code>history(lineage_id)</code> returns the full chain. Plus
            an opt-in persistent SQL operations log:{' '}
            <code>with_persistent_audit_log()</code> records every <code>extract</code> /{' '}
            <code>retrieve</code> / <code>consolidate</code> / <code>sweep</code> to the{' '}
            <code>memgc_audit_log</code> table. SOC 2 / GDPR friendly.{' '}
            <a className="inline" href="audit.html">Read more →</a>
          </p>

          <h3 id="injection" className="h3">Injection safety</h3>
          <p>
            Six prompt-injection defenses no other memory library ships:
          </p>
          <ul>
            <li><strong>Six-pattern reject list on <code>extract()</code></strong> — blocks the openclaw paper's known injection shapes before storage.</li>
            <li><strong><code>&lt;relevant-memories&gt;</code> wrap on <code>consolidate()</code></strong> with closing-tag escape, so retrieved content can't break out into the system prompt.</li>
            <li><strong>Query sanitizer on <code>retrieve()</code></strong> — strips prepended system-prompt noise from over-long queries (a 200-char cap that recovered 89% R@10 on mempalace's audit).</li>
            <li><strong>SHA-1 verbatim dedup</strong> via <code>find_by_hash</code> + a partial unique index — race-safe under concurrent writers.</li>
            <li><strong>Middle-truncation pre-pass</strong> on consolidation payloads above 50k tokens, with UTF-8 codepoint awareness for CJK and emoji.</li>
            <li><strong>Audit-by-default lineage</strong> — every change is recoverable.</li>
          </ul>

          <h3 id="storage" className="h3">Storage backends</h3>
          <p>
            Three first-class adapters share one trait:
          </p>
          <ul>
            <li><strong><code>SqliteStorage</code></strong> — local file or in-memory; auto-migrates; the production-ready zero-deps default.</li>
            <li><strong><code>PostgresStorage</code></strong> — pgvector with native cosine distance, ivfflat index (lists=100 default, hnsw on the migration path), runtime-templated <code>vector(N)</code> column for the caller's embedder.</li>
            <li><strong><code>InMemoryStorage</code></strong> — for tests and quick demos; supports brute-force cosine vector search so the LongMemEval harness doesn't need a Postgres container.</li>
          </ul>
          <p>
            All three pass the same ~70 contract tests, byte-identical. Swap by changing one line.
          </p>

          <h2 id="different" className="h2">What makes it different?</h2>
          <ul>
            <li>
              <strong>The only Rust-core memory library.</strong> Mem0, MemPalace, Letta, MemOS — all Python.
              Rust gets you a 50× faster cold start, a single-binary deploy, and pgvector at
              production scale without re-architecture.
            </li>
            <li>
              <strong>Calibrated decay you can tune.</strong> The 6-component score is documented,
              the half-life is documented, the weights are caller-tunable. No LLM-as-judge for "is
              this still relevant" — you get math.
            </li>
            <li>
              <strong>Audit-trail-by-default.</strong> Every row has a version chain.
              <code>history(lineage_id)</code> answers "what did the agent know at any past
              moment." Add the persistent SQL audit log if you need a tamper-resistant operations
              record.
            </li>
            <li>
              <strong>Six injection defenses</strong> — none of Mem0 / MemPalace / Letta / MemOS
              ship these.
            </li>
            <li>
              <strong>BYO everything</strong> — storage, LLM, embedder. Three of each ship
              in-tree (Anthropic / OpenAI / Azure for LLM; OpenAI / Azure / Voyage for embedder),
              but the trait is open and the bring-your-own path is documented.
            </li>
            <li>
              <strong>LongMemEval R@5 = 0.984</strong> on the 450-question held-out split. Pure
              retrieval, no LLM rerank, single locked-config run, full disclosure.{' '}
              <a className="inline" href="longmemeval.html">See the methodology →</a>
            </li>
          </ul>

          {/* "Where to start" card grid: internal doc pages plus two external links
              (GitHub / PyPI), which open in a new tab with rel="noreferrer". */}
          <h2 id="start" className="h2">Where to start</h2>
          <p>Pick the path that matches how you learn.</p>

          <div className="card-grid">
            <a className="card" href="quickstart.html">
              <div className="card-ico"><Icon name="rocket" size={16}/></div>
              <h3>Quickstart</h3>
              <p><code>uv add memgc</code>, open a handle, extract your first memory, run hybrid retrieval. ~5 minutes.</p>
              <span className="arr"><Icon name="arrowR" size={14} stroke={2}/></span>
            </a>
            <a className="card" href="api.html">
              <div className="card-ico"><Icon name="code" size={16}/></div>
              <h3>API reference</h3>
              <p>The <code>MemGC</code> handle, the <code>Memory</code> row, the <code>Storage</code> + <code>Embedder</code> traits — every public type.</p>
              <span className="arr"><Icon name="arrowR" size={14} stroke={2}/></span>
            </a>
            <a className="card" href="longmemeval.html">
              <div className="card-ico"><Icon name="sparkle" size={16}/></div>
              <h3>LongMemEval R@5 = 0.984</h3>
              <p>The benchmark numbers, what was tuned, what was measured, full reproduction recipe.</p>
              <span className="arr"><Icon name="arrowR" size={14} stroke={2}/></span>
            </a>
            <a className="card" href="audit.html">
              <div className="card-ico"><Icon name="shield" size={16}/></div>
              <h3>Audit & lineage</h3>
              <p>Per-row version chains plus the opt-in operations log. SOC 2 / GDPR / forensic-ready.</p>
              <span className="arr"><Icon name="arrowR" size={14} stroke={2}/></span>
            </a>
            <a className="card" href="https://github.com/carrickcheah/memgc" target="_blank" rel="noreferrer">
              <div className="card-ico"><Icon name="ext" size={16}/></div>
              <h3>GitHub</h3>
              <p>Read the source, file an issue, send a PR. Apache 2.0 across the workspace.</p>
              <span className="arr"><Icon name="arrowR" size={14} stroke={2}/></span>
            </a>
            <a className="card" href="https://pypi.org/project/memgc/" target="_blank" rel="noreferrer">
              <div className="card-ico"><Icon name="ext" size={16}/></div>
              <h3>PyPI</h3>
              <p>v0.0.2 live now. <code>uv add memgc</code> works on macOS / Linux / Windows.</p>
              <span className="arr"><Icon name="arrowR" size={14} stroke={2}/></span>
            </a>
          </div>

          <h2 id="principles" className="h2">Design principles</h2>
          <p>
            Three ideas shape every product decision. When something in the API feels unusual,
            <em style={{color:'var(--ink-3)', fontStyle:'italic'}}> "why did they do it like that?"</em> — these are the answer.
          </p>

          <h3 className="h3">Math, not vibes</h3>
          <p>
            Forgetting is a math problem, not a judgment problem. MemGC computes a deterministic
            6-component score for every row and deletes whatever falls below threshold. No LLM
            judge says "is this still relevant" — judges hallucinate, drift, and cost money. A
            documented formula doesn't.
          </p>

          <Callout type="note" title="Why this matters">
            Auditors, regulators, and your own future debugging self will thank you for a
            forgetting policy that runs identically every time. <code>(weights, half_life)</code>{' '}
            in, deleted set out.
          </Callout>

          <h3 className="h3">Stateless by construction</h3>
          <p>
            MemGC stores nothing of its own. Your data lives in <em>your</em> SQLite file or{' '}
            <em>your</em> Postgres database. MemGC manages a single table inside it
            (<code>memgc_memories</code>) and never touches the rest. Drop the database, the
            entire MemGC state is gone — there's no separate cache, no daemon, no remote
            endpoint phoning home.
          </p>

          <h3 className="h3">BYO, three of each</h3>
          <p>
            Three storage backends (<code>SqliteStorage</code>, <code>PostgresStorage</code>,{' '}
            <code>InMemoryStorage</code>), three LLM clients (<code>AnthropicClient</code>,{' '}
            <code>OpenAIClient</code>, <code>AzureOpenAIClient</code>), three embedders
            (<code>OpenAIEmbedder</code>, <code>AzureOpenAIEmbedder</code>, <code>VoyageEmbedder</code>).
            Plus a mock for each. Custom backends implement the trait directly — no per-vendor
            integration glue lives in core.
          </p>

          {/* Page footer: feedback widget and next-page navigation. */}
          <Feedback />
          <PageFoot next={{ label: 'Quickstart', href: 'quickstart.html' }} />
        </article>
        <TOC items={tocItems} />
      </div>
      {/* Modal layers, controlled by shell state. */}
      <SearchOverlay open={shell.searchOpen} onClose={() => shell.setSearchOpen(false)} />
      <TweaksPanel visible={shell.tweaksVisible} theme={shell.theme} setTheme={shell.setTheme} />
    </div>
  );
}
ReactDOM.createRoot(document.getElementById('root')).render(<Home />);
