A local-first, team-ready MCP server that provides a durable memory system for LLM-based coding workflows. It's optimized for JavaScript/TypeScript development (web and mobile), but works for any stack. Memory items can be stored globally, locally per project, or committed to the repo for team sharing — with fast search, ranking, and per-scope tuning.
- Three Storage Backends: Choose between file (JSON), video (QR+MP4 compression), or markdown (Obsidian-compatible)
- Revolutionary Video Storage: 50-100x compression through QR code + video encoding while maintaining sub-100ms search
- Markdown/Obsidian Integration: Native markdown format with YAML frontmatter and wikilinks for Obsidian vaults
- Local Embeddings & HNSW Search: Offline semantic search with transformers.js and O(log n) vector indexing - no external APIs required
- Automatic Backend Selection: Intelligent detection of capabilities with seamless migration between backends
- Flexible Storage Architecture: Switch between file, video, and markdown storage at any time
- New: Automatic Memory Discovery: MCP prompts check relevant memories before tasks (inspired by Claude's memory tool)
- New: Incremental Editing: Patch, append, and merge operations for efficient memory updates
- New: TTL Auto-Pruning: Automatic cleanup of expired memories with configurable time-to-live
- Unified Memory model: snippet, pattern, config, insight, runbook, fact, note
- Scopes: global (personal), local (per-project, uncommitted), committed (project/.llm-memory)
- Intelligent Confidence Scoring: Automatic quality assessment based on usage patterns, feedback, and time-based decay
- Fast search: BM25 scoring + boosts (scope, pin, recency, confidence) with phrase/title bonuses
- Hybrid Search: Combine keyword-based BM25 with semantic vector similarity for best-of-both-worlds retrieval
- User Feedback System: Record helpful/not helpful feedback to improve confidence scoring
- Optimized Journal System: Content-based hashing reduces journal storage by 81-95% with automatic migration
- Tuning via config.json per scope (field weights, bm25, boosts, confidence parameters)
- Atomic writes, journaling, and rebuildable index/catalog
- Secret redaction on ingestion (common API key patterns)
- MCP tools for authoring, curation, linking, and project management
Prerequisites:
- Node.js 18+
- pnpm 9+ (install with
npm install -g pnpm) - FFmpeg (optional): For video storage compression capabilities
git clone <repository-url>
cd llm-memory-mcp
pnpm install
pnpm run buildFor optimal storage efficiency with 50-100x compression, install FFmpeg:
macOS:
# Using Homebrew
brew install ffmpeg
# Using MacPorts
sudo port install ffmpegLinux (Ubuntu/Debian):
# Ubuntu/Debian
sudo apt update
sudo apt install ffmpeg
# Fedora/RHEL
sudo dnf install ffmpeg
# Arch Linux
sudo pacman -S ffmpegWindows:
# Using Chocolatey
choco install ffmpeg
# Using Scoop
scoop install ffmpegThe system automatically detects FFmpeg availability and enables video storage compression when available. Without FFmpeg, the system gracefully falls back to optimized file storage.
- Start the server
pnpm start- Configure in your MCP client
-
Claude Code
- Settings → Extensions → MCP Servers
- Name:
llm-memory - Command:
node - Args:
["/absolute/path/to/llm-memory-mcp/dist/index.js"]
-
Cursor
- Settings → Extensions → MCP
- Server name:
llm-memory - Command:
node - Arguments:
/absolute/path/to/llm-memory-mcp/dist/index.js
-
Codex CLI
codex config set mcp.servers.llm-mem.command "node"
codex config set mcp.servers.llm-mem.args "['/absolute/path/to/llm-memory-mcp/dist/index.js']"This repository includes a specialized agent (agents/dev-memory-manager.md) designed for intelligent development knowledge curation with Claude Code. The agent automatically captures critical context before conversation compacting, preserves development progress across sessions, and maintains a living knowledge base.
The dev-memory-manager agent provides:
- Context Preservation: Automatically saves work-in-progress before conversation limits are reached
- Session Continuity: Reconstructs previous conversation context when returning to ongoing work
- Knowledge Curation: Captures reusable patterns, insights, and technical decisions
- Progress Tracking: Maintains state of multi-session features and debugging journeys
- Smart Retrieval: Proactively surfaces relevant stored knowledge for current tasks
- Copy the agent file to your Claude Code agents directory:
# On macOS/Linux
cp agents/dev-memory-manager.md ~/.claude/agents/
# On Windows
copy agents\dev-memory-manager.md %USERPROFILE%\.claude\agents\-
Configure the LLM Memory MCP server (as shown in Quick Start above)
-
Restart Claude Code to load the new agent
The agent activates automatically when you:
- Approach context limits during complex development work
- Reference previous sessions or continue ongoing projects
- Start new features that might benefit from stored patterns
- Encounter problems that seem familiar or previously solved
Manual activation examples:
# Preserve context before conversation compacting
Use the dev-memory-manager agent to save our authentication implementation progress
# Retrieve previous session context
Use the dev-memory-manager agent to get our payment integration context from yesterday
# Capture a complete solution
Use the dev-memory-manager agent to store this debugging journey and solution
Context Preservation (Priority)
- Saves current work state, variables, file modifications
- Records decision history and alternatives considered
- Preserves debugging steps and current hypotheses
- Links to related conversations and commits
Knowledge Types Captured
session: Work-in-progress and conversation statesnippet: Reusable code blocks with clear utilitypattern: Architectural designs and best practicesinsight: Lessons learned and gotchasrunbook: Step-by-step proceduresjourney: Complete problem-solving narratives
Smart Storage Strategy
- Global scope: Universal patterns and personal optimizations
- Local scope: Project-specific work-in-progress
- Committed scope: Team standards and shared knowledge
- Session tags: Continuation markers and project phases
Pre-Compacting Preservation:
Long conversation about implementing OAuth → Context limit approaching → Agent automatically saves:
- Current implementation state
- Testing approach and results
- Next planned steps
- Links to related documentation
Session Continuity:
New conversation → "Continue payment integration work" → Agent retrieves:
- Previous session progress
- Code state and file modifications
- Current blockers and decisions made
- Relevant patterns and insights
Knowledge Evolution:
Debugging session → Solution found → Agent captures:
- Complete problem description
- All attempted solutions
- Final working solution with explanation
- Links to related issues and patterns
- Let the agent work proactively - It monitors context automatically
- Reference previous work clearly - Use project names and feature identifiers
- Confirm important captures - Review what the agent stores for critical work
- Use continuation markers - The agent tags work with
wip,blocked,next-session - Trust the retrieval - The agent knows what context you might be missing
The agent respects your LLM Memory MCP server configuration:
- Scope preferences: Set in your MCP server config
- Search tuning: Configurable per-scope ranking weights
- Storage layout: Follows your project's memory organization
No additional configuration needed - the agent adapts to your existing memory setup.
Automatically capture development knowledge from your git commits to build a searchable knowledge base of your coding patterns, solutions, and insights.
1. Tag commits with #kb to capture knowledge:
git commit -m "Implement JWT authentication with refresh tokens #kb #security"2. System automatically captures:
- Commit message and metadata
- Code changes (diff)
- Affected files and symbols
- Additional context tags
3. Process captured events:
{ "name": "autolearn.processQueue", "arguments": {} }4. Knowledge becomes searchable:
{
"name": "memory.query",
"arguments": {
"q": "JWT authentication",
"scope": "project",
"k": 10
}
}The auto-learning system consists of three integrated components:
1. Git Hooks (automatically installed)
commit-msg: Detects#kbtags in commit messagespost-commit: Captures commit details to queue file
2. Event Queue (.llm-memory/autolearn-queue.ndjson)
- Stores captured events until processed
- Survives server restarts
- Prevents data loss
3. Materialization (converts events to memories)
- Classifies commits by type (fix → insight, refactor → pattern)
- Extracts code snippets and context
- Creates searchable MemoryItems
Check System Status:
{ "name": "autolearn.status", "arguments": {} }Returns:
- Hook installation status
- Queue size
- System configuration
Initialize Auto-Learning:
{ "name": "autolearn.init", "arguments": { "autoInstall": true } }Installs git hooks and Claude Code hooks/agents.
Process Event Queue:
{ "name": "autolearn.processQueue", "arguments": {} }Processes all queued events and creates memories.
Capture Specific Commit:
{ "name": "autolearn.captureCommit", "arguments": { "commitHash": "HEAD" } }Manually capture a commit (useful for retroactive capture).
Install Globally:
{ "name": "autolearn.installGlobally", "arguments": {} }Install hooks and agents in your global Claude Code directory (~/.claude/).
Capture Bug Fix:
git commit -m "Fix race condition in authentication middleware #kb #bug #async"Creates an insight memory with:
- Title: "Fix race condition in authentication middleware"
- Tags: kb, bug, async, fix
- Code: Affected code from diff
- Files: Modified files
- Symbols: Extracted function/class names
Capture Pattern:
git commit -m "Refactor API client with retry logic pattern #kb #pattern #resilience"Creates a pattern memory documenting the retry pattern.
Capture Configuration:
git commit -m "Add ESLint config for TypeScript strict mode #kb #config #typescript"Creates a config memory with the configuration template.
When you connect the MCP server to Claude Code (or other MCP clients), the system automatically:
- Detects your project via git repository detection
- Initializes auto-learning with hook installation
- Logs status showing what was installed
- Ready to capture - just use
#kbin commits
No manual setup required! The system works out of the box.
The dev-memory-manager agent integrates with auto-learning to:
- Check for queued events on session start
- Process and present captured knowledge
- Suggest adding
#kbtags to important commits - Ensure hooks are installed and working
This creates a seamless workflow where you focus on coding and committing, and the system automatically builds your knowledge base.
Auto-learning respects the standard memory configuration system. Configure via project.config.set:
{
"name": "project.config.set",
"arguments": {
"scope": "local",
"config": {
"version": "1",
"autolearn": {
"enabled": true,
"captureTypes": ["commit", "fix", "refactor", "pattern"],
"gitHooks": {
"enabled": true,
"tagPattern": "#kb",
"captureDiffs": true,
"maxDiffSize": 10000
},
"filters": {
"minLinesChanged": 5,
"includePatterns": ["**/*.ts", "**/*.js"],
"excludePatterns": ["**/node_modules/**", "**/dist/**"]
},
"storage": {
"scope": "local",
"defaultType": "snippet"
}
}
}
}
}When to Use #kb Tags:
- ✅ Implementing new features or patterns
- ✅ Fixing complex bugs with reusable solutions
- ✅ Adding configurations or templates
- ✅ Refactoring with architectural insights
- ✅ Creating utilities or helper functions
When NOT to Use #kb Tags:
- ❌ Trivial changes (typos, formatting)
- ❌ WIP/temporary commits
- ❌ Merge commits or rebases
- ❌ Commits with sensitive information
Tagging Strategy:
# Include descriptive context tags
git commit -m "Add rate limiting middleware #kb #security #express #middleware"
# Use type indicators
git commit -m "Fix memory leak in WebSocket handler #kb #bug #websocket"
# Reference related systems
git commit -m "Refactor authentication flow #kb #pattern #auth #jwt"Hooks not triggering?
# Check hook installation
ls -la .git/hooks/ | grep -E '(commit-msg|post-commit)'
# Verify executable permissions
chmod +x .git/hooks/commit-msg .git/hooks/post-commit
# Check for marker file (created after #kb commit)
ls -la .git/llm-memory-autolearn.tmpQueue not processing?
# Check queue contents
cat .llm-memory/autolearn-queue.ndjson
# Check system status
echo '{"name":"autolearn.status","arguments":{}}' | node dist/index.js
# Manually process queue
echo '{"name":"autolearn.processQueue","arguments":{}}' | node dist/index.jsAgents not active?
# Check agent installation
ls -la ~/.claude/agents/ | grep dev-memory-manager
# Check project-level agents
ls -la .claude/agents/
# Restart Claude Code to reload agentsFor more detailed documentation, see docs/AUTO_LEARNING.md.
The LLM Memory MCP Server supports three storage backends, each optimized for different use cases:
Perfect for: Knowledge management, team wikis, Obsidian users, human-readable storage
Stores memories as individual markdown files with YAML frontmatter, fully compatible with Obsidian and other markdown tools. Each memory is a standalone .md file with:
- YAML frontmatter containing metadata (id, type, tags, confidence, etc.)
- Markdown body with title, description, and code blocks
- Wikilinks for linking related memories (
[[memory-id-title]]) - Context sections showing repository, file, and tool information
Benefits:
- ✅ Human-readable and editable in any text editor
- ✅ Full Obsidian integration with graph view, backlinks, and wikilinks
- ✅ Version control friendly (git diff works naturally)
- ✅ Easy to share, review, and collaborate on
- ✅ No external dependencies required
Storage Structure:
_LLM_memories/
react-project/ # Project-based subfolder (from repoId)
01ABC-react-hooks.md
01DEF-typescript-patterns.md
nodejs-api/ # Different project
01GHI-express-middleware.md
_global/ # Memories without specific project
01JKL-git-workflow.md
Memories are automatically organized by project using the repoId from their context. This makes it easy to:
- Navigate memories by project in Obsidian's file explorer
- Use Obsidian's folder-based features (tags, filters, views)
- Keep project knowledge isolated and organized
- Find related memories within the same project
Perfect for: Large codebases, storage-constrained environments, archival
Revolutionary video-based storage system that achieves 50-100x compression ratios while maintaining sub-100ms search performance. Uses QR code encoding combined with video compression to dramatically reduce storage requirements.
Content → QR Code Encoding → Video Frame → H.264/H.265 Compression → Ultra-Compact Storage
1KB → 2.4x comp → Frame → 50-80x total → ~20 bytes
Key Technologies:
- QR Code Pipeline: Text content encoded into QR codes with error correction
- Video Compression: QR frames stored as video using advanced codecs (H.264/H.265)
- Frame Indexing: Binary index (.mvi files) for instant frame location
- Content Deduplication: SHA-256 hash addressing prevents duplicate storage
- Intelligent Caching: Multi-tier cache system for frequently accessed content
Storage Efficiency by Content Type:
┌────────────────┬──────────────┬──────────────┬──────────────┐
│ Content Type │ Original │ Video (H264) │ Video (H265) │
├────────────────┼──────────────┼──────────────┼──────────────┤
│ Code Snippets │ 1x │ 47x │ 62x │
│ Documentation │ 1x │ 53x │ 71x │
│ JSON Config │ 1x │ 78x │ 94x │
│ Mixed Content │ 1x │ 51x │ 68x │
│ Average │ 1x │ 57x │ 74x │
└────────────────┴──────────────┴──────────────┴──────────────┘
Perfect for: Maximum compatibility, no dependencies, debugging
Traditional JSON-based storage with optimized journaling and content-based hashing. Each memory is stored as a separate JSON file with automatic journal compaction.
Benefits:
- ✅ No external dependencies
- ✅ Fast and reliable
- ✅ Optimized journal with 81-95% compression via SHA-256 hashing
- ✅ Works everywhere Node.js runs
Storage Structure:
items/
01ABC.json
01DEF.json
journal-optimized.ndjson
catalog.json
The system intelligently detects available storage capabilities and selects the optimal backend:
Detection Priority:
- Config Check - Explicit
storage.backendsetting in config.json - Markdown Detection - Presence of
memories/directory with.mdfiles - Video Detection - Presence of
segments/directory with video files - FFmpeg Check - Native FFmpeg availability (for video encoding)
- Fallback - File storage (always available)
FFmpeg Detection:
// Automatic detection on startup
if (await hasNativeFFmpeg()) {
useVideoStorage = true;
encoderType = 'native';
} else if (await hasWasmSupport()) {
useVideoStorage = true;
encoderType = 'wasm';
} else {
useVideoStorage = false;
encoderType = 'file';
}Search Performance (1M memory items):
┌────────────────┬─────────┬─────────┬─────────┬──────────┐
│ Operation │ P50 │ P95 │ P99 │ Max │
├────────────────┼─────────┼─────────┼─────────┼──────────┤
│ Video Decode │ 8ms │ 19ms │ 31ms │ 58ms │
│ Hybrid Search │ 23ms │ 54ms │ 86ms │ 167ms │
│ Context Pack │ 45ms │ 98ms │ 156ms │ 298ms │
└────────────────┴─────────┴─────────┴─────────┴──────────┘
Cache Performance:
- Payload Cache Hit Rate: 78-85%
- Frame Cache Hit Rate: 68-74%
- QR Decode Success Rate: 99.7%
Automatic Configuration: The system automatically selects the optimal storage backend and configures compression settings. No manual configuration required.
Manual Configuration (Advanced):
{
"storage": {
"backend": "video",
"videoOptions": {
"codec": "h264",
"crf": 26,
"preset": "medium",
"errorCorrection": "M"
}
}
}Configuration Options:
backend:"auto"(default),"file","video","markdown"codec:"h264"(default),"h265"(video only)crf: Quality setting (18-28, lower = higher quality) (video only)preset: Encoding speed ("fast","medium","slow") (video only)errorCorrection: QR error correction ("L","M","Q","H") (video only)
The system provides seamless migration between all three storage backends (file ↔ video ↔ markdown):
Check Migration Status:
{ "name": "mig.status", "arguments": { "scope": "local", "backend": "video" } }Migrate to Video Storage:
{ "name": "mig.storage.backend", "arguments": {
"sourceBackend": "file",
"targetBackend": "video",
"scope": "local",
"validateAfterMigration": true
}}Migration Features:
- Zero Downtime: Migrations occur in background
- Integrity Validation: Automatic verification after migration
- Rollback Capability: Restore to previous backend if needed
- Progress Tracking: Real-time migration status
FFmpeg Not Found:
# Verify FFmpeg installation
ffmpeg -version
# Check PATH configuration
which ffmpeg
# Test video encoding capability
echo '{"name": "maint.verify", "arguments": {"scope": "local"}}' | node dist/index.jsPerformance Issues:
- Slow Encoding: Install native FFmpeg instead of relying on WASM
- High Memory Usage: Reduce cache sizes in configuration
- Decode Failures: Check QR error correction settings
Storage Issues:
# Check storage backend status
echo '{"name": "mig.status", "arguments": {"scope": "local"}}' | node dist/index.js
# Validate video storage integrity
echo '{"name": "mig.validate", "arguments": {"scope": "local", "backend": "video"}}' | node dist/index.js
# Get detailed storage metrics
echo '{"name": "maint.verify", "arguments": {"scope": "all"}}' | node dist/index.jsDebug Mode:
# Enable debug logging
DEBUG="llm-memory:video" pnpm start
# Test with specific backend
LLM_MEMORY_FORCE_BACKEND=file pnpm start
LLM_MEMORY_FORCE_BACKEND=video pnpm start- global: personal memory across projects (
~/.llm-memory/global) - local: per-project (uncommitted) memory (
~/.llm-memory/projects/<repoId>) - committed: shared memory committed in repo (
<project>/.llm-memory)
File Storage Layout (Traditional):
<scope-root>/
items/ # one JSON per MemoryItem
index/
inverted.json # inverted index
lengths.json # document lengths
meta.json # index metadata
catalog.json # id -> MemoryItemSummary
jour.ndjson # legacy append-only change log (auto-migrated)
journal-optimized.ndjson # optimized journal with SHA-256 hashes (95% smaller)
locks/ # advisory lock files
tmp/ # atomic write staging
config.json # per-scope configuration
Video Storage Layout (Compressed):
<scope-root>/
segments/
consolidated.mp4 # video file containing QR-encoded content
consolidated-index.json # frame-to-content mapping
index/
inverted.json # BM25 search index
vec.bin # vector embeddings (optional)
meta.json # index metadata
catalog.json # id -> MemoryItemSummary with frame references
tmp/ # atomic write staging
config.json # per-scope configuration (includes storage backend)
snapshot-meta.json # integrity verification metadata
Markdown Storage Layout:
<scope-root>/
_LLM_memories/ # Root memories folder
project-a/ # Project-specific subfolders (based on repoId)
01ABC-component.md
01DEF-util.md
project-b/
01GHI-api.md
_global/ # Memories without specific project
01JKL-pattern.md
.memory/ # Hidden metadata directory
catalog.json # id → MemoryItemSummary
config.json # per-scope configuration
index/ # Search indexes
inverted.json
vectors.bin
meta.json
Storage Backend Auto-Selection:
- System automatically detects storage backend based on directory structure
config.jsoncontainsstorage.backendfield indicating active backend- Seamless migration between all three backends using migration tools
Initialize committed scope in current project:
{ "name": "proj.initCommitted", "arguments": {} }When using markdown storage backend, memories are fully compatible with Obsidian, enabling powerful knowledge management features:
- Enable markdown storage:
{ "name": "project.config.set", "arguments": {
"scope": "local",
"config": { "version": "1", "storage": { "backend": "markdown" } }
}}-
Open your memory folder in Obsidian:
- Global:
~/.llm-memory/global/_LLM_memories/ - Local:
~/.llm-memory/projects/<project-hash>/_LLM_memories/ - Committed:
<project>/.llm-memory/_LLM_memories/
Project Organization: Memories are automatically organized into subfolders based on their project (
repoId):_LLM_memories/react-app/- Memories from your React project_LLM_memories/api-server/- Memories from your API project_LLM_memories/_global/- Memories without a specific project
- Global:
-
Features you get:
- 📊 Graph View - Visualize connections between memories
- 🔗 Wikilinks - Click
[[memory-id-title]]to navigate - ⬅️ Backlinks - See which memories reference the current one
- 🔍 Full-text Search - Use Obsidian's powerful search
- 🏷️ Tags - Filter and organize with #tags
- ✍️ Edit Anywhere - Modify memories in Obsidian or your IDE
Each memory is a markdown file with YAML frontmatter:
---
id: 01JDF97ZMB000000000000001
type: pattern
scope: global
title: React Hooks Best Practices
language: typescript
tags: [react, hooks, best-practices]
confidence: 0.85
pinned: false
createdAt: 2025-10-11T13:10:00.000Z
updatedAt: 2025-10-11T13:10:00.000Z
version: 1
---
# React Hooks Best Practices
Essential patterns for using React Hooks effectively.
## Code
\`\`\`typescript
// Your code here
\`\`\`
## Related Memories
- [[01ABC-typescript-generics]]
- [[01DEF-react-performance]]
## Context
- **Tool**: Claude Code
- **Framework**: ReactFor programmatic access to your Obsidian vault, install the Local REST API plugin:
- Install the plugin in Obsidian
- Enable HTTPS in plugin settings
- Generate an API key
- Use the REST API to read/write memories programmatically
This enables powerful workflows like:
- Sync memories to Obsidian in real-time
- Create memories from Obsidian notes
- Automate knowledge capture from development sessions
- mem.upsert — Create/update items
- mem.get — Fetch by id
- mem.delete — Delete by id
- mem.list — List summaries (scope: global|local|committed|project|all)
- mem.query — Ranked search with filters and top-k
- mem.contextPack — IDE-ready context pack (see Context Packs below)
- mem.link — Link items (refines|duplicates|depends|fixes|relates)
- mem.pin / mem.unpin — Pin/unpin for ranking
- mem.tag — Add/remove tags
- mem.feedback — Record helpful/not helpful feedback for confidence scoring
- mem.use — Record usage/access events for confidence scoring
- mem.patch — Apply surgical text replacements without full rewrite
- mem.append — Add content to existing memories incrementally
- mem.merge — Combine multiple memories intelligently with deduplication
- mem.renew — Extend TTL for valuable memories
- vec.set — Set/update an item embedding (for hybrid search)
- vec.remove — Remove an item embedding
- vec.importBulk — Bulk import vectors (same dimension enforced)
- vec.importJsonl — Bulk import vectors from JSONL file; optional dim override
- proj.info — Project root, repoId, committed status
- proj.initCommitted — Create
.llm-memoryin repo - proj.config.get — Read
config.jsonfor a scope - proj.config.set — Write
config.jsonfor a scope - proj.sync.status — Check local vs committed memory differences
- proj.sync.merge — Merge local memories to committed scope
- maint.rebuild — Rebuild catalog/index from items on disk
- maint.replay — Replay journal; optional compaction
- maint.compact — Compact journal
- maint.compact.now — Trigger immediate compaction
- maint.compactSnapshot — One-click compaction + snapshot
- maint.snapshot — Write snapshot meta (lastTs + checksum)
- maint.verify — Verify current checksum vs snapshot and state-ok markers
- maint.prune — Remove expired memories based on TTL (with dry-run option)
- jour.stats — Get journal statistics and optimization status
- jour.migrate — Migrate legacy journal to optimized format
- jour.verify — Verify integrity using optimized journal hashes
- mig.status — Check migration status and storage metrics
- mig.storage.backend — Migrate between file and video storage backends
- mig.scope — Migrate filtered memories between scopes (global/local/committed)
- mig.validate — Validate migration integrity and consistency
- check-memory — Auto-discover relevant memories before starting tasks (inspired by Claude's memory tool)
Resources
- kb://project/info — Project info + recent items
- kb://health — Minimal health/status
- kb://context/pack — Build a context pack; supports URI query args
Key fields (see src/types/Memory.ts):
- type: snippet | pattern | config | insight | runbook | fact | note
- scope: global | local | committed
- title, text, code, language
- facets: tags[], files[], symbols[]
- context: repoId, branch, commit, file, range, tool, etc.
- quality: confidence, reuseCount, pinned, ttlDays, helpfulCount, notHelpfulCount, decayedUsage, lastAccessedAt, lastUsedAt, lastFeedbackAt
- security: sensitivity (public/team/private), secretHashRefs
The quality.confidence field (0-1) is automatically calculated using:
- Feedback signals: User helpful/not helpful votes with Bayesian smoothing
- Usage patterns: Access frequency with exponential decay (14-day half-life)
- Recency: Time since last access with decay (7-day half-life)
- Context matching: Relevance to current project/query context
- Base prior: Starting confidence for new items (default 0.5)
Confidence scores directly influence search ranking, with higher confidence items receiving boost multipliers.
Recommended usage for JS/TS projects:
- Use
type: 'snippet', setlanguage: 'typescript'or'javascript'. - Attach
filesandsymbolsfor better retrieval. - Use
patternfor recurring designs;configfor templates;insight/factfor distilled learnings. - Pin high-value items; store team standards in committed scope.
Create a snippet (local scope):
{
"name": "mem.upsert",
"arguments": {
"type": "snippet",
"scope": "local",
"title": "React Error Boundary",
"language": "typescript",
"code": "class ErrorBoundary extends React.Component { /* ... */ }",
"tags": ["react", "error-handling"],
"files": ["src/components/ErrorBoundary.tsx"],
"symbols": ["ErrorBoundary"]
}
}Query snippets/patterns for React:
{
"name": "mem.query",
"arguments": {
"q": "react",
"scope": "project",
"k": 10,
"filters": { "type": ["snippet", "pattern"] }
}
}Pin an important pattern:
{ "name": "mem.pin", "arguments": { "id": "01H..." } }Link related items:
{ "name": "mem.link", "arguments": { "from": "01A...", "to": "01B...", "rel": "refines" } }Record positive feedback for confidence scoring:
{ "name": "mem.feedback", "arguments": { "id": "01H...", "helpful": true, "scope": "local" } }Record usage event for confidence scoring:
{ "name": "mem.use", "arguments": { "id": "01H...", "scope": "local" } }Check storage backend and migration status:
{ "name": "mig.status", "arguments": { "scope": "local", "backend": "video" } }Migrate from file to markdown storage (Obsidian-compatible):
{ "name": "migration.storage.backend", "arguments": {
"sourceBackend": "file",
"targetBackend": "markdown",
"scope": "local",
"validateAfterMigration": true
}}Migrate from file to video storage (ultra-compressed):
{ "name": "mig.storage.backend", "arguments": {
"sourceBackend": "file",
"targetBackend": "video",
"scope": "local",
"validateAfterMigration": true
}}Migrate from markdown to video storage:
{ "name": "mig.validate", "arguments": { "scope": "local", "backend": "video" } }Rebuild catalog and index for project scopes:
{ "name": "maint.rebuild", "arguments": { "scope": "project" } }Claude can now proactively check for relevant memories before starting tasks:
// Claude invokes the check-memory prompt
{
"name": "check-memory",
"arguments": {
"task": "Implement JWT token rotation",
"files": "src/auth/jwt.ts, src/middleware/auth.ts",
"context": "feature/auth-improvements"
}
}Returns formatted markdown with relevant memories, code snippets, and confidence scores to help Claude discover existing knowledge patterns automatically.
Edit memories without full rewrites, inspired by Claude's str_replace and insert commands:
Fix a typo:
{ "name": "mem.patch", "arguments": {
"id": "01HX...",
"operations": [
{ "field": "text", "old": "authetication", "new": "authentication" }
]
}}Add new learnings:
{ "name": "mem.append", "arguments": {
"id": "01HX...",
"field": "text",
"content": "Update: Also works with OAuth2 flows",
"separator": "\n\n"
}}Combine duplicate memories:
{ "name": "mem.merge", "arguments": {
"sourceIds": ["01HX...", "01HY...", "01HZ..."],
"scope": "local",
"strategy": "deduplicate",
"deleteSource": true
}}Merge strategies:
concat— Simple concatenationdeduplicate— Remove duplicate lines (default)prioritize-first— Keep first item's contentprioritize-recent— Use most recently updated content
Video Storage Compatibility: All incremental operations work seamlessly with video storage through a read-modify-write pattern. The system reads the item (decodes frame), modifies it in memory, then writes back via upsert (creates new frame). Old frames are preserved for history/recovery.
Automatically manage memory lifecycle with time-to-live settings:
Create temporary memory:
{ "name": "mem.upsert", "arguments": {
"type": "insight",
"scope": "local",
"text": "Debugging auth flow - using test token ABC123",
"quality": { "ttlDays": 7 }
}}Preview expired memories:
{ "name": "maint.prune", "arguments": {
"scope": "local",
"dryRun": true
}}Remove expired memories:
{ "name": "maint.prune", "arguments": {
"scope": "local",
"dryRun": false
}}Extend TTL for valuable memories:
{ "name": "mem.renew", "arguments": {
"id": "01HX...",
"ttlDays": 90
}}Common TTL patterns:
- Debugging context: 7 days
- Sprint notes: 14 days
- Experimental patterns: 30 days
- Valuable insights: 90-365 days
Video Storage: Pruning removes catalog entries while preserving video frames for potential recovery.
Search uses BM25 with configurable boosts. Tune per scope via config.json and proj.config.*.
Config (subset):
interface MemoryConfig {
version: string;
ranking?: {
fieldWeights?: { title?: number; text?: number; code?: number; tag?: number };
bm25?: { k1?: number; b?: number };
scopeBonus?: { global?: number; local?: number; committed?: number };
pinBonus?: number;
recency?: { halfLifeDays?: number; scale?: number };
phrase?: { bonus?: number; exactTitleBonus?: number };
hybrid?: { enabled?: boolean; wBM25?: number; wVec?: number; model?: string };
};
contextPack?: {
order?: Array<'snippets'|'facts'|'patterns'|'configs'>;
caps?: { snippets?: number; facts?: number; patterns?: number; configs?: number };
};
maintenance?: {
compactEvery?: number; // compact after N journal appends (default: 500)
compactIntervalMs?: number; // time-based compaction (default: 24h)
snapshotIntervalMs?: number; // time-based snapshot (default: 24h)
indexFlush?: { maxOps?: number; maxMs?: number }; // index scheduler flush thresholds
};
}Recommended defaults (JS/TS):
- fieldWeights: title=5, text=2, code=1.5, tag=3
- bm25: k1=1.5, b=0.75
- scopeBonus: committed=1.5, local=1.0, global=0.5
- pinBonus: 2
- recency: halfLifeDays=14, scale=2
- phrase: bonus=2.5, exactTitleBonus=6
Set committed-scope tuning:
{
"name": "proj.config.set",
"arguments": {
"scope": "committed",
"config": {
"version": "1",
"ranking": {
"fieldWeights": { "title": 6, "text": 2, "code": 1.2, "tag": 3 },
"bm25": { "k1": 1.4, "b": 0.7 },
"scopeBonus": { "committed": 2.0, "local": 1.0, "global": 0.3 },
"pinBonus": 3,
"recency": { "halfLifeDays": 7, "scale": 2.5 },
"phrase": { "bonus": 3, "exactTitleBonus": 8 },
"hybrid": { "enabled": true, "wBM25": 0.7, "wVec": 0.3, "model": "local-emb" }
}
}
}
}After changing field weights, run maint.rebuild for the affected scope to re-apply indexing weights.
The confidence scoring algorithm can be tuned via the confidence section in config.json:
interface ConfidenceConfig {
// Bayesian prior for helpfulness (Laplace smoothing)
priorAlpha?: number; // default: 1
priorBeta?: number; // default: 1
basePrior?: number; // default: 0.5
// Time-based decay
usageHalfLifeDays?: number; // default: 14
recencyHalfLifeDays?: number; // default: 7
// Usage saturation
usageSaturationK?: number; // default: 5
// Weights for linear blend
weights?: {
feedback?: number; // default: 0.35
usage?: number; // default: 0.25
recency?: number; // default: 0.20
context?: number; // default: 0.15
base?: number; // default: 0.05
};
// Pinned behavior
pin?: {
floor?: number; // default: 0.8
multiplier?: number; // default: 1.05
};
}Example configuration:
{
"name": "proj.config.set",
"arguments": {
"scope": "committed",
"config": {
"version": "1",
"confidence": {
"usageHalfLifeDays": 21,
"recencyHalfLifeDays": 10,
"weights": {
"feedback": 0.4,
"usage": 0.3,
"recency": 0.2,
"context": 0.1
}
}
}
}
}The system includes local embedding generation using transformers.js and HNSW vector indexing for high-performance semantic search. This enables:
- Offline embedding generation - No external API calls or network dependencies
- HNSW (Hierarchical Navigable Small World) - O(log n) search complexity vs O(n) linear scan
- Hybrid search - Combine keyword-based BM25 with semantic vector similarity
- Multiple embedding models - Choose based on your needs (speed vs quality vs dimensions)
- Auto-embedding - Automatic vector generation on memory upsert (configurable)
Three pre-configured models, all running locally via transformers.js:
| Model | Dimensions | Max Tokens | Best For | Speed |
|---|---|---|---|---|
| bge-small-en-v1.5 (default) | 384 | 512 | Code and technical documentation | ⚡⚡ |
| all-MiniLM-L6-v2 | 384 | 256 | General text, fast inference | ⚡⚡⚡ |
| all-mpnet-base-v2 | 768 | 384 | Higher quality semantic matching | ⚡⚡ |
First run downloads the model (~25-90MB depending on model), then cached locally in .cache/transformers/.
1. Enable embeddings in configuration:
{ "name": "proj.config.set", "arguments": { "scope": "committed", "config": { "version": "1", "ranking": { "hybrid": { "enabled": true, "wBM25": 0.7, "wVec": 0.3, "model": "local-emb" } } } } }2. Create a memory (auto-embedding happens in background):
{ "name": "vec.set", "arguments": { "scope": "local", "id": "01ABC...", "vector": [0.1, -0.2, 0.05, ...] } }3. Manually embed specific memories:
{ "name": "mem.query", "arguments": { "q": "authentication flow", "scope": "project", "k": 20, "vector": [/* query embedding */], "filters": { "type": ["snippet", "pattern"] } } }4. Batch embed multiple memories:
{ "name": "vec.importJsonl", "arguments": { "scope": "local", "path": "/abs/path/vec.jsonl", "dim": 768 } }5. Generate embedding for raw text:
{ "name": "vec.importBulk", "arguments": { "scope": "local", "items": [{"id":"01A","vector":[0.1,0.2]},{"id":"01B","vector":[0.0,0.3]}] } }Build an IDE-ready pack of code snippets, facts, configs, and patterns, tuned for JS/TS:
- Tool:
mem.contextPack - Resource:
kb://context/pack - Useful args:
- q, scope, k
- filters (types/tags/language/files)
- snippetWindow { before, after }
- snippetLanguages: ["typescript","tsx","javascript"]
- snippetFilePatterns: ["src//*.ts","src//*.tsx"]
- tokenBudget (approx tokens; ~4 chars/token heuristic) or maxChars
Example:
{ "name": "mem.contextPack", "arguments": { "q": "react hooks", "scope": "project", "k": 12, "tokenBudget": 2000, "snippetLanguages": ["typescript","tsx"], "snippetFilePatterns": ["src/**/*.ts","src/**/*.tsx"] } }URI form:
kb://context/pack?q=react%20hooks&scope=project&k=12&tokenBudget=2000&snippetLanguages=typescript,tsx&snippetFilePatterns=src/**/*.ts,src/**/*.tsx
Per-scope order/caps are configurable in config.json under contextPack.
- Threshold-based compaction: set
maint.compactEvery(default 500). Triggers compaction after N journal appends. - Time-based compaction: set
maint.compactIntervalMs(default 24h). - Manual controls:
maint.replay— replay journal; optional compactmaint.compact— compact scope(s)maint.compact.now— immediate compactionmaint.compactSnapshot— compaction + snapshot in one stepmaint.snapshot— write snapshot meta (for fast tail replay)maint.verify— recompute checksum and compare to snapshot/state-ok
State-ok markers
- After successful compaction and startup tail replay, the server writes
index/state-ok.jsoncontaining the last verified checksum and timestamp. maint.verifyreports whether current checksum matches both snapshot and state-ok markers.
On upsert, common credential patterns are redacted from text/code and hashed into security.secretHashRefs to prevent leakage into committed mem.
pnpm install
pnpm run dev
pnpm run build
pnpm run typecheck
pnpm run lint
pnpm run test:all # end-to-end tool tests
pnpm run simulate:user # simulated JS/TS flow-
Recommended env for tests/simulation
- Use project-local storage and skip startup replay for snappy runs:
LLM_MEMORY_HOME_DIR="$(pwd)" LLM_MEMORY_SKIP_STARTUP_REPLAY=1 pnpm run test:allLLM_MEMORY_HOME_DIR="$(pwd)" LLM_MEMORY_SKIP_STARTUP_REPLAY=1 pnpm run simulate:user
- Alternatively delay replay instead of disabling:
LLM_MEMORY_STARTUP_REPLAY_MS=2000 pnpm run test:all
- Use project-local storage and skip startup replay for snappy runs:
-
Vector store dimension issues
- Bulk imports enforce a single embedding dimension. If you previously stored a different dimension, either:
- Pass a
dimoverride tovec.importBulk/vec.importJsonl, or - Clean the local vector files and re-import:
rm -f .llm-memory/index/vec.json .llm-memory/index/vec.meta.json
- Pass a
- Bulk imports enforce a single embedding dimension. If you previously stored a different dimension, either:
-
Snapshot/verify workflow
- For fast restarts, run once:
maint.compactSnapshot(project/all), thenmaint.verifyshould report ok=true. - Verify compares the current checksum against both snapshot and the last
state-okmarker.
- For fast restarts, run once:
-
Zsh glob “no matches found”
- Use
rm -fto ignore missing files, or enable NULL_GLOB temporarily:setopt NULL_GLOB.
- Use
-
“MODULE_TYPELESS_PACKAGE_JSON” warning
- Optional: add
"type": "module"to package.json or run Node with--input-type=moduleto silence the warning.
- Optional: add
Manual test:
node test-memory-tools.js— exercises mem.* tools via stdio
- The previous kb.* tools were replaced by mem.* tools.
- Offline-first; no external services required.
- For teams, prefer committed scope and stricter committed config.
- Save a reusable TypeScript pattern to committed scope
{ "name": "mem.upsert", "arguments": {
"type": "pattern",
"scope": "committed",
"title": "React Error Boundary",
"language": "typescript",
"text": "Wrap subtree with an error boundary component; log and render fallback UI.",
"code": "class ErrorBoundary extends React.Component { /* ... */ }",
"tags": ["react","error-handling","ts"],
"files": ["src/components/ErrorBoundary.tsx"],
"symbols": ["ErrorBoundary"]
} }- Search by tag across project (local + committed)
{ "name": "mem.query", "arguments": {
"scope": "project",
"k": 20,
"filters": { "tags": ["react"] }
} }- Build a context pack focused on src/utils and TS/TSX
{ "name": "mem.contextPack", "arguments": {
"q": "debounce util",
"scope": "project",
"k": 12,
"tokenBudget": 1800,
"snippetLanguages": ["typescript","tsx"],
"snippetFilePatterns": ["src/utils/**/*.ts","src/utils/**/*.tsx"]
} }- Pin a frequently used runbook
{ "name": "mem.pin", "arguments": { "id": "01H..." } }- Merge local → committed (team share) and check status
{ "name": "proj.sync.status", "arguments": {} }{ "name": "proj.sync.merge", "arguments": {} }- Guard committed scope by sensitivity (team only)
{ "name": "proj.config.set", "arguments": {
"scope": "committed",
"config": { "version": "1", "sharing": { "enabled": true, "sensitivity": "team" } }
} }- Enable hybrid search and set vectors (example)
{ "name": "proj.config.set", "arguments": {
"scope": "local",
"config": { "version": "1", "ranking": { "hybrid": { "enabled": true, "wBM25": 0.7, "wVec": 0.3 } } }
} }{ "name": "vec.set", "arguments": { "scope": "local", "id": "01ABC...", "vector": [0.1, -0.2, 0.05] } }{ "name": "mem.query", "arguments": { "q": "auth flow", "scope": "project", "k": 20, "vector": [0.08, -0.15, 0.02] } }- Compact journals when needed
{ "name": "maint.compact.now", "arguments": { "scope": "project" } }- One-click compact + snapshot
{ "name": "maint.compactSnapshot", "arguments": { "scope": "all" } }- Verify on-disk state vs snapshot/state-ok
{ "name": "maint.verify", "arguments": { "scope": "project" } }The system automatically uses an optimized journal format that reduces storage by 81-95% through content-based hashing:
- Check journal optimization status
{ "name": "jour.stats", "arguments": { "scope": "all" } }- Manually migrate legacy journals (automatic on startup)
{ "name": "jour.migrate", "arguments": { "scope": "project" } }- Verify journal integrity using hashes
{ "name": "jour.verify", "arguments": { "scope": "local" } }The confidence scoring system automatically learns from your usage patterns and feedback to improve search relevance over time:
- Automatic tracking: Every time you access a memory item, its usage count increases
- Feedback loops: Mark items as helpful/not helpful to train the scoring algorithm
- Time decay: Unused items gradually lose confidence to keep results fresh
- Context awareness: Items are ranked higher when they match your current project context
Example workflow:
// Create a useful code snippet
{ "name": "mem.upsert", "arguments": {
"type": "snippet",
"scope": "local",
"title": "React useDebounce Hook",
"code": "const useDebounce = (value, delay) => { /* implementation */ }",
"language": "typescript",
"tags": ["react", "hooks", "performance"]
}}
// Record usage when you actually use it
{ "name": "mem.use", "arguments": { "id": "01ABC...", "scope": "local" } }
// Provide feedback when it proves helpful
{ "name": "mem.feedback", "arguments": { "id": "01ABC...", "helpful": true, "scope": "local" } }
// Search will now rank this item higher in future queries
{ "name": "mem.query", "arguments": { "q": "react debounce", "scope": "project", "k": 10 } }