MemPalace in the Wild: Ancient Memory Meets Modern FinTech Engineering
A critical review of the viral open-source AI memory system — and how frontend engineers building real-time trading dashboards, ETF platforms, and FinOps tooling can actually deploy it today.
Roman Kozak · 14 min read · 1 hour ago
I’ve spent over a decade building enterprise FinTech frontends. One thing is consistent across all of them: context is everything, and context is perishable.
Every AI copilot I’ve wired into a trading dashboard suffers the same problem. New session, new AI, no memory.
The risk limits you discussed yesterday? Gone.
The anomaly you flagged last week? Forgotten.
This is why an open-source project that claims to solve AI memory — and went viral overnight — deserves a serious FinTech-lens review.
Chapter 1: What Is MemPalace? The Viral Story
On April 5, 2026, a GitHub repository appeared under the handle milla-jovovich/mempalace. Within 48 hours it had accumulated over 23,000 stars and nearly 3,000 forks, making it the #1 trending repo on GitHub.
The co-author was Milla Jovovich — yes, that Milla Jovovich. The actress spent months working on an unnamed gaming project and ran headlong into the wall every power AI user hits: context-window amnesia. New session, new AI, no memory.
She teamed up with Ben Sigman (CEO of Bitcoin Libre, a Bitcoin lending startup) to engineer the software. Jovovich designed the concept and architecture; Sigman handled the implementation, with the pair vibe-coding extensively using Claude Code. The result: a free, local, MIT-licensed AI memory system that achieves 96.6% recall on LongMemEval — higher than paid cloud alternatives.
The Core Insight — Ancient Greek orators memorized entire speeches by mentally placing ideas inside rooms of an imaginary building. Walk the building, find the idea. MemPalace applies this method of loci to AI memory: conversations become a navigable spatial hierarchy, not a flat search index.
Chapter 2: Architecture Deep Dive: Wings, Rooms & Temporal Graphs
Memory Palace Hierarchy
This is what separates MemPalace from every other vector-dump approach. The architecture is a 6-level spatial hierarchy backed by two storage engines: ChromaDB for semantic vector search and SQLite for a temporal knowledge graph.
4-Layer Retrieval Stack
On startup, only L0 (critical entity facts) and L1 (recent events) are loaded — just 170 tokens. Deeper layers fire only when needed:
- L0 — Pinned entity facts (team, preferences, active projects)
- L1 — Recent events from the last N sessions
- L2 — Semantic similarity search via ChromaDB
- L3 — Optional Haiku reranking (adds ~$0.70/year)
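To make the gating concrete, here is a toy Python model of that wake-up flow. Everything in it (layer contents, token costs, function names) is my own illustration, not MemPalace's actual API; it only shows the shape of the idea: eager layers load at startup, deeper layers fire and pay their token cost only on a miss.

```python
# Toy model of layered retrieval gating (illustrative, not the real API).
# L0/L1 load eagerly at startup; L2/L3 fire lazily only when a query misses.

EAGER_LAYERS = ("L0", "L1")

LAYER_STORE = {
    "L0": [("team", "frontend-fx", 40)],            # (key, fact, token cost)
    "L1": [("recent", "refactored blotter", 130)],
    "L2": [("signalr", "reconnect uses exponential backoff", 400)],
    "L3": [("rerank", "haiku-reranked results", 900)],
}

def wake_up() -> tuple[dict, int]:
    """Load only the eager layers; return the context and its token cost."""
    context, tokens = {}, 0
    for layer in EAGER_LAYERS:
        for key, fact, cost in LAYER_STORE[layer]:
            context[key] = fact
            tokens += cost
    return context, tokens

def retrieve(context: dict, query: str) -> tuple[str, int]:
    """Answer from eager context if possible; otherwise pay for a deeper layer."""
    if query in context:
        return context[query], 0  # served from startup context, zero extra tokens
    for layer in ("L2", "L3"):
        for key, fact, cost in LAYER_STORE[layer]:
            if key == query:
                return fact, cost  # deeper layer fired, its token cost is incurred
    return "not found", 0

ctx, startup_tokens = wake_up()
print(startup_tokens)           # → 170, matching the wake-up budget quoted above
print(retrieve(ctx, "signalr")) # L2 fires lazily and pays its own cost
```

The point of the sketch is the cost asymmetry: a session that only touches pinned facts and recent events never pays for semantic search or reranking.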
# mempalace.yaml — FinTech project config (conceptual)
palace:
  name: "northern-trust-etf"
  wings:
    - name: "etf-dashboard"
      rooms: ["components", "api", "realtime", "auth", "deployment"]
    - name: "fx-trading"
      rooms: ["signalr", "pricing", "risk", "blotter"]
    - name: "team"
      rooms: ["decisions", "architecture", "reviews"]

retrieval:
  startup_layers: ["L0", "L1"]  # 170 tokens at wake-up
  semantic_top_k: 8
  rerank_model: "claude-haiku-4-5-20251001"  # optional $0.70/yr

storage:
  mode: "raw_verbatim"  # keeps every word, no lossy summarization
  chunk_size: 800       # chars, paragraph-boundary aware
  overlap: 100

knowledge_graph:
  temporal: true
  # Note: contradiction_detection exists as fact_checker.py
  # but is not yet wired into KG operations (as of April 2026)

The Temporal Knowledge Graph
The SQLite knowledge graph is what makes MemPalace genuinely interesting for financial software. Every RDF-style triple carries valid_from and valid_to timestamps — so you can ask "what was the approved risk limit on January 15th?" and get a factually accurate, time-scoped answer.
This directly maps to the audit trail requirements of regulated financial environments. However, note that the current implementation does flat triple lookups — multi-hop graph traversal is not yet supported (tracked in Issue #27).
Chapter 3: Honest Benchmark Review: Hype vs. Reality
Let’s be direct. The community caught real problems within hours of launch, and the authors addressed them publicly in a README correction on April 7. Here’s what the numbers actually mean:
⚠️ Benchmark Caveats — Read These Before You Cite Any Numbers
🔹 The 96.6% raw score measures ChromaDB’s default embedding model on verbatim text, not MemPalace’s spatial structure specifically. The palace hierarchy (wings/rooms/halls) is not involved in this benchmark. This is the single most important fact about these numbers.
🔹 An independent reproduction on M2 Ultra (GitHub Issue #39) confirmed that enabling palace features actually degrades retrieval performance slightly. ChromaDB is doing the heavy lifting.
🔹 The 100% hybrid score was achieved by identifying 3 failing questions, engineering targeted fixes for those specific questions, and retesting on the same set. The held-out score is 98.4%. This is textbook overfitting.
🔹 AAAK compression at small scale increases token count (73 vs 66 tokens). AAAK mode scores only 84.2% on LongMemEval vs 96.6% raw.
🔹 The “+34% retrieval boost” from structure (60.9% → 94.8%) is real but is standard ChromaDB metadata filtering — not a novel architectural contribution.
🔹 Issue #27 documents multiple cases where README claims don’t match the codebase: halls aren’t used in retrieval ranking, no write gating, no input sanitization (prompt injection surface).
Retrieval Mode Performance Breakdown
This chart reveals the key insight: the 96.6% headline score comes from ChromaDB’s raw verbatim mode — the palace structure contributes a +34% improvement over flat search (standard metadata filtering), but the actual 96.6% benchmark doesn’t use the palace at all.
Annual Cost Comparison (Daily Usage, 6-Month Horizon)
FinTech-Lens Comparative Analysis
MemPalace dominates on data sovereignty, cost, and temporal query support — the axes that matter most in regulated FinTech. Mem0 and Zep win decisively on production maturity, which is expected for funded, multi-year projects vs. a one-week-old repo.
Chapter 4: Why FinTech Engineers Should Care
Despite the benchmark caveats, MemPalace addresses real problems in FinTech AI tooling.
Dashboard AI Copilots
Trading dashboards need AI that remembers your risk parameters, preferred chart intervals, and which anomalies you’ve already reviewed — across days of sessions. Without persistent memory, every AI interaction starts from zero.
Temporal Audit Trails
Financial compliance requires knowing who decided what and when. MemPalace’s temporal knowledge graph models this natively — every fact has a validity window. “What was the approved EM exposure limit on March 15th?” becomes a single query.
Data Sovereignty
Regulated environments can’t send client data to external APIs. MemPalace runs 100% locally — no API calls, no data egress. This removes the data sovereignty objection.
Note: local deployment still requires your organization’s standard software security review, change management, and operational risk assessment — “local” doesn’t mean “no compliance process.”
Multi-Agent FinOps
Multi-agent pipelines analyzing expense reports, generating compliance reports, or monitoring real-time risk need shared, persistent context. MemPalace becomes the shared memory bus your agents read from and write to — making every run smarter than the last.
Chapter 5: Practical Integration: Wiring MemPalace to Your FX Dashboard
⚠️ Important Note: The code below is a conceptual integration pattern. MemPalace’s Python API surface is evolving rapidly (it’s one week old). Verify import paths and method signatures against the current mempalace package on GitHub before using in production. The MCP server is currently stdio-based; the REST wrapper shown here is a custom addition.
Project Setup
# 1. Install MemPalace
pip install mempalace
# 2. Initialize palace for your project directory
mempalace init ~/projects/fx-trading
# 3. Mine your existing React/TS codebase into the palace
mempalace mine ./src --mode project
# 4. Mine Slack exports (architecture decisions, PR discussions)
mempalace mine ~/exports/slack-fintech.zip --mode convos
# 5. Mine your Claude Code sessions
mempalace mine ~/.claude/projects/fx-trading/ --mode convos
# 6. Search test
mempalace search "SignalR reconnect strategy"

After mining, MemPalace ingests every component, hook, utility, and decision log — organized automatically into rooms based on keyword scoring.
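That "keyword scoring" room assignment can be pictured with a toy router. The keyword lists and scoring rule below are my own guesses for illustration, not the project's actual mining logic:

```python
# Toy illustration of keyword-scored room routing for mined chunks.
# Keyword lists and the scoring rule are hypothetical, not MemPalace's.
ROOM_KEYWORDS = {
    "signalr":    ["signalr", "hub", "reconnect", "websocket"],
    "pricing":    ["rate", "spread", "quote", "pricing"],
    "risk":       ["limit", "exposure", "var", "risk"],
    "components": ["component", "hook", "props", "render"],
}

def route_chunk(text: str) -> str:
    """Assign a mined chunk to the room whose keywords score highest."""
    words = text.lower().split()
    scores = {
        room: sum(words.count(kw) for kw in keywords)
        for room, keywords in ROOM_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "components"  # fallback room

print(route_chunk("SignalR hub reconnect strategy with backoff"))  # → signalr
print(route_chunk("EM exposure limit raised after risk review"))   # → risk
```

A real implementation would presumably weight embeddings and file paths too, but simple term counting already explains why mined Slack threads about reconnect logic land in the signalr room.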
Now wire up the MCP server so your AI tooling (Claude Code, Cursor) auto-queries the palace:
// .claude/mcp_settings.json — MCP integration
// Note: Known stdout bug (Issue #225) affects Claude Desktop.
// Claude Code works correctly as of April 2026.
{
  "mcpServers": {
    "mempalace": {
      "command": "python",
      "args": ["-m", "mempalace.mcp_server"],
      "env": {
        "PALACE_ROOT": "/path/to/your/palace"
      }
    }
  }
}

Once connected, Claude Code automatically gains access to 19 MCP tools: mempalace_search, mempalace_store, mempalace_kg_query, and more.
React Hook — useMemPalace for Trading Copilot
To use MemPalace from a React frontend, you’ll need a thin backend proxy that bridges stdio-based MCP to HTTP. The hook below assumes you’ve set up such a proxy (e.g., Express + @modelcontextprotocol/sdk).
// hooks/useMemPalace.ts
import { useCallback, useRef } from 'react';

// Your backend proxy that wraps the MCP server
const PALACE_API = '/api/mempalace';

export interface MemoryResult {
  closet: string;
  content: string;
  room: string;
  wing: string;
  score: number;
  timestamp: string;
}

export interface Triple {
  subject: string;
  predicate: string;
  object: string;
  valid_from: string;
  valid_to?: string;
}

export interface UseMemPalaceReturn {
  search: (query: string, wing?: string) => Promise<MemoryResult[]>;
  store: (content: string, wing: string, room: string) => Promise<void>;
  kgQuery: (entity: string, asOf?: string) => Promise<Triple[]>;
}

const MAX_CACHE_SIZE = 100;

export function useMemPalace(): UseMemPalaceReturn {
  const cache = useRef<Map<string, MemoryResult[]>>(new Map());

  const search = useCallback(
    async (query: string, wing = 'fx-trading'): Promise<MemoryResult[]> => {
      const key = `${wing}:${query}`;
      if (cache.current.has(key)) return cache.current.get(key)!;
      try {
        const res = await fetch(`${PALACE_API}/search`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ query, wing, top_k: 6 }),
        });
        if (!res.ok) throw new Error(`Palace search failed: ${res.status}`);
        const { results } = await res.json();
        // Evict oldest entries if cache exceeds max size
        if (cache.current.size >= MAX_CACHE_SIZE) {
          const firstKey = cache.current.keys().next().value;
          if (firstKey) cache.current.delete(firstKey);
        }
        cache.current.set(key, results);
        return results;
      } catch (error) {
        console.error('[useMemPalace] search error:', error);
        return [];
      }
    },
    []
  );

  const store = useCallback(
    async (content: string, wing: string, room: string): Promise<void> => {
      try {
        const res = await fetch(`${PALACE_API}/store`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ content, wing, room }),
        });
        if (!res.ok) throw new Error(`Palace store failed: ${res.status}`);
        // Invalidate cache for this wing on new store
        [...cache.current.keys()]
          .filter(k => k.startsWith(wing))
          .forEach(k => cache.current.delete(k));
      } catch (error) {
        console.error('[useMemPalace] store error:', error);
      }
    },
    []
  );

  const kgQuery = useCallback(
    async (entity: string, asOf?: string): Promise<Triple[]> => {
      try {
        const res = await fetch(`${PALACE_API}/kg/query`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ entity, as_of: asOf }),
        });
        if (!res.ok) throw new Error(`Palace KG query failed: ${res.status}`);
        const { triples } = await res.json();
        return triples;
      } catch (error) {
        console.error('[useMemPalace] kgQuery error:', error);
        return [];
      }
    },
    []
  );

  return { search, store, kgQuery };
}

FX Copilot Panel
// components/FxCopilot/FxCopilotPanel.tsx
import { useState } from 'react';
import { useMemPalace } from '../../hooks/useMemPalace';

interface CopilotPanelProps {
  activePair: string; // e.g. "EUR/USD"
  currentRate: number;
}

export function FxCopilotPanel({ activePair, currentRate }: CopilotPanelProps) {
  const [query, setQuery] = useState('');
  const [answer, setAnswer] = useState<string>('');
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState<string | null>(null);
  const { search, store } = useMemPalace();

  const handleQuery = async (userQuestion: string) => {
    setLoading(true);
    setError(null);
    try {
      // 1. Retrieve relevant memories from the palace
      const memories = await search(
        `${activePair} ${userQuestion}`,
        'fx-trading'
      );

      // 2. Build a memory-enriched system prompt
      const memContext = memories
        .slice(0, 4)
        .map(m => `[${m.wing}/${m.room}] ${m.content}`)
        .join('\n');

      const systemPrompt = `You are a senior FX trading assistant with deep context
about this team's systems and past decisions.
MEMORY PALACE CONTEXT (retrieved for this query):
${memContext}
Current state: ${activePair} @ ${currentRate}
Today: ${new Date().toISOString().split('T')[0]}
Answer concisely, referencing the memory context when relevant.`;

      // 3. Call your LLM backend
      const response = await fetch('/api/ai/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ system: systemPrompt, message: userQuestion }),
      });
      if (!response.ok) throw new Error('AI chat request failed');
      const { reply } = await response.json();

      // 4. Store this exchange in the palace for future sessions
      await store(
        `Q: ${userQuestion} A: ${reply} [${activePair} @ ${currentRate}]`,
        'fx-trading',
        'decisions'
      );
      setAnswer(reply);
    } catch (err) {
      setError(err instanceof Error ? err.message : 'Something went wrong');
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="fx-copilot-panel">
      <input
        value={query}
        onChange={e => setQuery(e.target.value)}
        onKeyDown={e => e.key === 'Enter' && handleQuery(query)}
        placeholder={`Ask about ${activePair}...`}
      />
      {loading && <span>Searching palace...</span>}
      {error && <span className="error">{error}</span>}
      {answer && <p>{answer}</p>}
    </div>
  );
}

Chapter 6: Temporal Knowledge Graph for Portfolio State
This chapter leverages the temporal knowledge graph for compliance-critical use cases. In ETF portfolio management, you need to answer: “What was the approved emerging markets exposure limit on March 15th?”
# scripts/etf_kg_seed.py — Seed portfolio facts (conceptual API)
# Verify import paths against the current mempalace package
from mempalace.knowledge_graph import KnowledgeGraph

kg = KnowledgeGraph()

# ── Team & Responsibility Assignments ──────────────────
kg.add_triple("Sarah Chen", "owns", "EM-exposure-limit", valid_from="2025-01-10")
kg.add_triple("Dev Team", "manages", "etf-dashboard", valid_from="2024-06-01")
kg.add_triple("Alex Rivera", "deployed", "v2.4.1", valid_from="2026-02-15")

# ── Risk Limit History (temporal!) ─────────────────────
# Original limit approved Q1 2025
kg.add_triple("EM-exposure-limit", "approved_at", "12%", valid_from="2025-01-10")
# Board revised downward March 1, 2026 due to volatility
kg.invalidate("EM-exposure-limit", "approved_at", "12%", ended="2026-03-01")
kg.add_triple("EM-exposure-limit", "approved_at", "8%", valid_from="2026-03-01")

# ── Compliance queries ──────────────────────────────────
# Current state
print(kg.query_entity("EM-exposure-limit"))
# → [EM-exposure-limit → approved_at → 8% (current, from 2026-03-01)]

# Historical query — what was true in January 2026?
print(kg.query_entity("EM-exposure-limit", as_of="2026-01-20"))
# → [EM-exposure-limit → approved_at → 12% (valid 2025-01-10 → 2026-03-01)]

# Full timeline (audit trail for compliance)
print(kg.timeline("EM-exposure-limit"))
# → 2025-01-10: approved_at 12%
# → 2026-03-01: approved_at updated to 8% (12% invalidated)

React hook for the compliance dashboard, building on useMemPalace:
// hooks/usePortfolioKG.ts — Temporal portfolio queries
import { useEffect, useState } from 'react';
import { useMemPalace, Triple } from './useMemPalace';

interface UsePortfolioKGOptions {
  entity: string;
  asOf?: string; // ISO date — enables historical queries
}

export function usePortfolioKG({ entity, asOf }: UsePortfolioKGOptions) {
  const { kgQuery } = useMemPalace();
  const [triples, setTriples] = useState<Triple[]>([]);
  const [loading, setLoading] = useState(true);
  const [error, setError] = useState<string | null>(null);

  useEffect(() => {
    setLoading(true);
    setError(null);
    kgQuery(entity, asOf)
      .then(setTriples)
      .catch(err => setError(err.message))
      .finally(() => setLoading(false));
  }, [entity, asOf, kgQuery]);

  /** Most recent value for a predicate. The triples returned by kgQuery are
      already time-scoped to asOf, so we must NOT filter on valid_to here —
      a historical fact legitimately has a closed validity window. */
  const getLatest = (predicate: string): string | undefined =>
    triples
      .filter(t => t.predicate === predicate)
      .sort((a, b) => b.valid_from.localeCompare(a.valid_from))[0]?.object;

  const getHistory = (predicate: string): Triple[] =>
    triples
      .filter(t => t.predicate === predicate)
      .sort((a, b) => a.valid_from.localeCompare(b.valid_from));

  return { triples, loading, error, getLatest, getHistory };
}

// ─── Usage in your ETF Compliance Widget ───────────────
function EMExposureWidget({ auditDate }: { auditDate?: string }) {
  const { getLatest, getHistory, loading, error } = usePortfolioKG({
    entity: 'EM-exposure-limit',
    asOf: auditDate,
  });

  if (loading) return <div>Loading palace...</div>;
  if (error) return <div className="error">Palace query failed: {error}</div>;

  const currentLimit = getLatest('approved_at');
  const history = getHistory('approved_at');

  return (
    <div className="compliance-widget">
      <h3>EM Exposure Limit {auditDate ? `(as of ${auditDate})` : '(current)'}</h3>
      <p className="limit-value">{currentLimit}</p>
      <ul className="history">
        {history.map((t, i) => (
          <li key={i}>
            {t.valid_from}: {t.object}{t.valid_to ? ` → ${t.valid_to}` : ' (active)'}
          </li>
        ))}
      </ul>
    </div>
  );
}

Chapter 7: Multi-Agent Pipeline with Shared Memory
This is the real power unlock for FinOps teams: MemPalace as the shared memory bus for a multi-agent expense analysis pipeline.
# agents/finops_pipeline.py — 3-agent FinOps with shared memory (conceptual)
# Note: Verify MemPalaceClient import path against current package
from mempalace.client import MemPalaceClient
from anthropic import Anthropic
import asyncio
import json
from datetime import date

palace = MemPalaceClient(wing="finops")
ai = Anthropic()

def today_iso() -> str:
    return date.today().isoformat()

def parse_json_response(text: str) -> list[dict]:
    """Safely parse JSON from LLM response, stripping a markdown fence if present."""
    try:
        cleaned = text.strip()
        if cleaned.startswith("```"):
            cleaned = cleaned.split("\n", 1)[1].rsplit("```", 1)[0]
        return json.loads(cleaned)
    except json.JSONDecodeError:
        return []

# ── Agent 1: Anomaly Detector ──────────────────────────
async def anomaly_detector(expenses: list[dict]) -> list[dict]:
    """Detects spend anomalies; writes findings to palace/risk room."""
    context = await palace.search(
        "expense anomaly patterns vendor overspend", room="risk"
    )
    msg = ai.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system=f"""You detect expense anomalies for a FinOps team.
PALACE CONTEXT (historical patterns):
{context}
Flag items exceeding 2σ from 90-day baseline. Return JSON array.""",
        messages=[{"role": "user", "content": str(expenses)}],
    )
    anomalies = parse_json_response(msg.content[0].text)
    await palace.store(
        content=f"Anomalies detected {today_iso()}: {json.dumps(anomalies)}",
        room="risk",
    )
    return anomalies

# ── Agent 2: Root Cause Analyst ───────────────────────
async def root_cause_analyst(anomalies: list[dict]) -> list[dict]:
    """Investigates root causes; reads Agent 1 output from palace."""
    context = await palace.search(
        "vendor contract renewal budget allocation decision", room="decisions"
    )
    msg = ai.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system=f"""You are a FinOps root cause analyst.
PALACE CONTEXT (past vendor decisions):
{context}
For each anomaly: identify likely cause, reference any historical
decisions that explain it. Return JSON with cause + evidence.""",
        messages=[{"role": "user", "content": str(anomalies)}],
    )
    analyses = parse_json_response(msg.content[0].text)
    await palace.store(
        content=f"Root cause analysis {today_iso()}: {json.dumps(analyses)}",
        room="decisions",
    )
    return analyses

# ── Agent 3: Report Generator ─────────────────────────
async def report_generator(analyses: list[dict]) -> str:
    """Generates CFO-ready report using full palace context."""
    team_ctx = await palace.search("CFO report format preferences", room="facts")
    anomaly_ctx = await palace.search("anomaly findings today", room="risk")
    msg = ai.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=2000,
        system=f"""Generate a CFO-ready FinOps expense report.
TEAM PREFERENCES: {team_ctx}
ANOMALY CONTEXT: {anomaly_ctx}
Include executive summary, anomaly table, root causes,
and recommended actions. Markdown format.""",
        messages=[{"role": "user", "content": str(analyses)}],
    )
    return msg.content[0].text

# ── Orchestrator ──────────────────────────────────────
async def run_finops_pipeline(monthly_expenses: list[dict]) -> str:
    anomalies = await anomaly_detector(monthly_expenses)
    analyses = await root_cause_analyst(anomalies)
    report = await report_generator(analyses)
    return report

# Entry point: asyncio.run(run_finops_pipeline(monthly_expenses))
# Every agent enriches the palace — next month's run is smarter.

Final Verdict: Ship It or Skip It?
✅ Ship It If…
- Building AI copilots for regulated FinTech environments where data sovereignty is non-negotiable
- You need temporal audit trails (SOX, MiFID II compliance)
- Multi-agent pipelines need shared persistent memory
- Budget is a concern — $0 vs $249/month
- You’re comfortable running early-stage open-source software and contributing fixes
⚠️ Skip It (For Now) If…
- You need a managed cloud SLA or enterprise support
- Your team can’t run a local Python MCP server
- You need battle-tested production support today — this project is one week old
- Windows deployment — there’s an active Unicode crash bug (GitHub Issue #47)
- You need MCP with Claude Desktop — stdout bug (Issue #225) corrupts the message stream
The benchmark controversy is real, but the community has handled it well — the authors corrected their README within 48 hours and published detailed, honest benchmark documentation (BENCHMARKS.md). The 96.6% recall in raw mode is independently reproducible, even though it’s ChromaDB doing the heavy lifting rather than the palace architecture itself.
As a FinTech engineer who has spent years battling the stateless AI amnesia problem in enterprise trading platforms: the local-first, compliance-friendly, zero-cost angle alone makes MemPalace worth evaluating for your AI tooling stack. The temporal knowledge graph is a genuinely useful feature for regulated financial software, even in its current early state.
Yes, it’s early. Yes, the README was overhyped on launch. Yes, some documented features don’t exist in the codebase yet (Issue #27). But the core architecture is sound, the benchmark methodology has been transparently corrected, and the active community (23,000+ stars, hundreds of PRs) suggests this project has legs.
The question isn’t whether MemPalace is production-ready today — it’s not. The question is whether the architectural approach (verbatim storage + spatial hierarchy + temporal KG) is worth investing in for your FinTech AI stack. My answer: yes, with eyes open.
Resources & Further Reading
- GitHub Repo — github.com/milla-jovovich/mempalace
- Benchmark Details — BENCHMARKS.md (read this before citing any numbers)
- Issue #27 — README vs. Codebase — github.com/milla-jovovich/mempalace/issues/27
- Benchmark Analysis (Medium) — MemPalace: What the Benchmarks Actually Mean — Ewan Mak
- LongMemEval Benchmark — github.com/xiaowu0162/LongMemEval