MemPalace in the Wild: Ancient Memory Meets Modern FinTech Engineering

By Roman Kozak · Published April 15, 2026 · 16 min read · Source: Fintech Tag

A critical review of the viral open-source AI memory system — and how frontend engineers building real-time trading dashboards, ETF platforms, and FinOps tooling can actually deploy it today.


I’ve spent over a decade building enterprise FinTech frontends. One thing is consistent across all of them: context is everything, and context is perishable.

Every AI copilot I’ve wired into a trading dashboard suffers the same problem. New session, new AI, no memory.

The risk limits you discussed yesterday? Gone.

The anomaly you flagged last week? Forgotten.

This is why an open-source project that claims to solve AI memory — and went viral overnight — deserves a serious FinTech-lens review.

Chapter 1: What Is MemPalace? The Viral Story

On April 5, 2026, a GitHub repository appeared under the handle milla-jovovich/mempalace. Within 48 hours it had accumulated over 23,000 stars and nearly 3,000 forks, making it the #1 trending repo on GitHub.

The co-author was Milla Jovovich — yes, that Milla Jovovich. The actress spent months working on an unnamed gaming project and ran headlong into the wall that every power AI user hits: context-window amnesia. New session, new AI, no memory.

She teamed up with Ben Sigman (CEO of Bitcoin Libre, a Bitcoin lending startup) to engineer the software. Jovovich designed the concept and architecture; Sigman handled the implementation, with the pair vibe-coding extensively using Claude Code. The result: a free, local, MIT-licensed AI memory system that achieves 96.6% recall on LongMemEval — higher than paid cloud alternatives.

The Core Insight — Ancient Greek orators memorized entire speeches by mentally placing ideas inside rooms of an imaginary building. Walk the building, find the idea. MemPalace applies this method of loci to AI memory: conversations become a navigable spatial hierarchy, not a flat search index.

MemPalace v3.0.0 key numbers at a glance

Chapter 2: Architecture Deep Dive: Wings, Rooms & Temporal Graphs

Memory Palace Hierarchy

This is what separates MemPalace from every other vector-dump approach. The architecture is a 6-level spatial hierarchy backed by two storage engines: ChromaDB for semantic vector search and SQLite for a temporal knowledge graph.

Memory Palace hierarchy: 6-level spatial structure (ChromaDB + SQLite)

4-Layer Retrieval Stack

On startup, only L0 (critical entity facts) and L1 (recent events) are loaded — just 170 tokens. Deeper layers fire only when needed:

Wake-up token budget comparison
# mempalace.yaml — FinTech project config (conceptual)
palace:
  name: "northern-trust-etf"
  wings:
    - name: "etf-dashboard"
      rooms: ["components", "api", "realtime", "auth", "deployment"]
    - name: "fx-trading"
      rooms: ["signalr", "pricing", "risk", "blotter"]
    - name: "team"
      rooms: ["decisions", "architecture", "reviews"]

retrieval:
  startup_layers: ["L0", "L1"]   # 170 tokens at wake-up
  semantic_top_k: 8
  rerank_model: "claude-haiku-4-5-20251001"  # optional $0.70/yr

storage:
  mode: "raw_verbatim"   # keeps every word, no lossy summarization
  chunk_size: 800        # chars, paragraph-boundary aware
  overlap: 100

knowledge_graph:
  temporal: true
  # Note: contradiction_detection exists as fact_checker.py
  # but is not yet wired into KG operations (as of April 2026)
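The `chunk_size`/`overlap` settings deserve a pause. A paragraph-boundary-aware chunker keeps related sentences together instead of cutting mid-thought; here is a minimal sketch of the idea under my own assumptions — this is not MemPalace's actual chunking code:

```python
# Illustrative paragraph-aware chunker — NOT MemPalace's implementation.
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> list[str]:
    """Greedily pack paragraphs into ~chunk_size-char chunks, carrying the
    last `overlap` chars into the next chunk so context survives the cut."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > chunk_size:
            chunks.append(current)
            current = current[-overlap:]  # overlap tail seeds the next chunk
        current = f"{current}\n\n{para}".strip() if current else para
    if current:
        chunks.append(current)
    return chunks
```

The key design choice is splitting on paragraph boundaries first and only then enforcing the size budget, which is what "paragraph-boundary aware" implies.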

The Temporal Knowledge Graph

The SQLite knowledge graph is what makes MemPalace genuinely interesting for financial software. Every RDF-style triple carries valid_from and valid_to timestamps — so you can ask "what was the approved risk limit on January 15th?" and get a factually accurate, time-scoped answer.

This directly maps to the audit trail requirements of regulated financial environments. However, note that the current implementation does flat triple lookups — multi-hop graph traversal is not yet supported (tracked in Issue #27).
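Because the graph lives in plain SQLite, a time-scoped lookup is just a range predicate over the validity columns. A minimal sketch with my own schema (not MemPalace's actual table layout):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE triples (
        subject TEXT, predicate TEXT, object TEXT,
        valid_from TEXT NOT NULL,
        valid_to   TEXT            -- NULL = still valid
    )""")
con.executemany(
    "INSERT INTO triples VALUES (?, ?, ?, ?, ?)",
    [
        ("risk-limit", "approved_at", "12%", "2025-01-10", "2026-03-01"),
        ("risk-limit", "approved_at", "8%",  "2026-03-01", None),
    ],
)

def as_of(subject: str, when: str) -> list[tuple]:
    """Facts about `subject` that were valid on date `when`."""
    return con.execute(
        """SELECT predicate, object FROM triples
           WHERE subject = ? AND valid_from <= ?
             AND (valid_to IS NULL OR valid_to > ?)""",
        (subject, when, when),
    ).fetchall()

print(as_of("risk-limit", "2026-01-15"))  # [('approved_at', '12%')]
print(as_of("risk-limit", "2026-04-01"))  # [('approved_at', '8%')]
```

This is exactly the query shape an auditor needs: the same fact table answers both "what is true now" and "what was true then".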

Chapter 3: Honest Benchmark Review: Hype vs. Reality

Let’s be direct. The community caught real problems within hours of launch, and the authors addressed them publicly in a README correction on April 7. Here’s what the numbers actually mean:

⚠️ Benchmark Caveats — Read These Before You Cite Any Numbers

🔹 The 96.6% raw score measures ChromaDB’s default embedding model on verbatim text, not MemPalace’s spatial structure specifically. The palace hierarchy (wings/rooms/halls) is not involved in this benchmark. This is the single most important fact about these numbers.

🔹 An independent reproduction on M2 Ultra (GitHub Issue #39) confirmed that enabling palace features actually degrades retrieval performance slightly. ChromaDB is doing the heavy lifting.

🔹 The 100% hybrid score was achieved by identifying 3 failing questions, engineering targeted fixes for those specific questions, and retesting on the same set. The held-out score is 98.4%. This is textbook overfitting.

🔹 AAAK compression at small scale increases token count (73 vs 66 tokens). AAAK mode scores only 84.2% on LongMemEval vs 96.6% raw.

🔹 The “+34% retrieval boost” from structure (60.9% → 94.8%) is real but is standard ChromaDB metadata filtering — not a novel architectural contribution.

🔹 Issue #27 documents multiple cases where README claims don’t match the codebase: halls aren’t used in retrieval ranking, no write gating, no input sanitization (prompt injection surface).

LongMemEval Recall@5 Comparison (chart)

Retrieval Mode Performance Breakdown

This chart reveals the key insight: the 96.6% headline score comes from ChromaDB’s raw verbatim mode — the palace structure contributes a +34% improvement over flat search (standard metadata filtering), but the actual 96.6% benchmark doesn’t use the palace at all.


Annual Cost Comparison (Daily Usage, 6-Month Horizon)


FinTech-Lens Comparative Analysis

MemPalace dominates on data sovereignty, cost, and temporal query support — the axes that matter most in regulated FinTech. Mem0 and Zep win decisively on production maturity, which is expected for funded, multi-year projects vs. a one-week-old repo.


Chapter 4: Why FinTech Engineers Should Care

Despite the benchmark caveats, MemPalace addresses real problems in FinTech AI tooling.

Dashboard AI Copilots

Trading dashboards need AI that remembers your risk parameters, preferred chart intervals, and which anomalies you’ve already reviewed — across days of sessions. Without persistent memory, every AI interaction starts from zero.

Temporal Audit Trails

Financial compliance requires knowing who decided what and when. MemPalace’s temporal knowledge graph models this natively — every fact has a validity window. “What was the approved EM exposure limit on March 15th?” becomes a single query.

Data Sovereignty

Regulated environments can’t send client data to external APIs. MemPalace runs 100% locally — no API calls, no data egress. This removes the data sovereignty objection.

Note: local deployment still requires your organization’s standard software security review, change management, and operational risk assessment — “local” doesn’t mean “no compliance process.”

Multi-Agent FinOps

Multi-agent pipelines analyzing expense reports, generating compliance reports, or monitoring real-time risk need shared, persistent context. MemPalace becomes the shared memory bus your agents read from and write to — making every run smarter than the last.

Chapter 5: Practical Integration: Wiring MemPalace to Your FX Dashboard

⚠️ Important Note: The code below is a conceptual integration pattern. MemPalace’s Python API surface is evolving rapidly (it’s one week old). Verify import paths and method signatures against the current mempalace/ package on GitHub before using in production. The MCP server is currently stdio-based; the REST wrapper shown here is a custom addition.

Project Setup

# 1. Install MemPalace
pip install mempalace

# 2. Initialize palace for your project directory
mempalace init ~/projects/fx-trading

# 3. Mine your existing React/TS codebase into the palace
mempalace mine ./src --mode project

# 4. Mine Slack exports (architecture decisions, PR discussions)
mempalace mine ~/exports/slack-fintech.zip --mode convos

# 5. Mine your Claude Code sessions
mempalace mine ~/.claude/projects/fx-trading/ --mode convos

# 6. Search test
mempalace search "SignalR reconnect strategy"

After mining, MemPalace ingests every component, hook, utility, and decision log — organized automatically into rooms based on keyword scoring.
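To make "keyword scoring" concrete, here is a hypothetical sketch of how chunks could be routed to rooms — the room names echo the config above, but the scoring logic and `ROOM_KEYWORDS` table are my assumptions, not MemPalace's actual miner:

```python
# Hypothetical keyword-based room routing — not MemPalace's actual scorer.
ROOM_KEYWORDS = {
    "realtime": {"signalr", "websocket", "reconnect", "stream"},
    "auth": {"oauth", "token", "login", "session"},
    "risk": {"limit", "exposure", "var", "drawdown"},
}

def assign_room(text: str, default: str = "components") -> str:
    """Route a chunk to the room whose keyword set it overlaps most."""
    words = set(text.lower().split())
    scores = {room: len(words & kws) for room, kws in ROOM_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default

print(assign_room("SignalR reconnect backoff strategy"))  # realtime
```

However the real scorer works, the takeaway is the same: room assignment is cheap lexical matching at mining time, not an LLM call per chunk.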

Now wire up the MCP server so your AI tooling (Claude Code, Cursor) auto-queries the palace:

// .claude/mcp_settings.json — MCP integration
// Note: Known stdout bug (Issue #225) affects Claude Desktop.
// Claude Code works correctly as of April 2026.
{
  "mcpServers": {
    "mempalace": {
      "command": "python",
      "args": ["-m", "mempalace.mcp_server"],
      "env": {
        "PALACE_ROOT": "/path/to/your/palace"
      }
    }
  }
}

Once connected, Claude Code automatically gains access to 19 MCP tools: mempalace_search, mempalace_store, mempalace_kg_query, and more.
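Under the hood, each tool invocation is a standard JSON-RPC 2.0 `tools/call` request sent over stdio. The tool name below comes from the list above; the exact argument schema is my assumption:

```python
import json

# Shape of an MCP tool-call request (argument names are assumed).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "mempalace_search",
        "arguments": {"query": "SignalR reconnect strategy", "top_k": 6},
    },
}
print(json.dumps(request))
```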

React Hook — useMemPalace for Trading Copilot

To use MemPalace from a React frontend, you’ll need a thin backend proxy that bridges stdio-based MCP to HTTP. The hook below assumes you’ve set up such a proxy (e.g., Express + @modelcontextprotocol/sdk).
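For reference, the core of such a bridge is small: spawn the MCP server as a child process and speak newline-delimited JSON-RPC over its stdin/stdout. A hedged Python sketch of the idea (the Express proxy mentioned above plays this role behind `/api/mempalace`; the class name and framing details here are my assumptions):

```python
import json
import subprocess

class StdioBridge:
    """Minimal stdio JSON-RPC bridge — the piece an HTTP proxy wraps.
    Assumes newline-delimited JSON framing; verify against the real server."""

    def __init__(self, cmd: list[str]):
        self.proc = subprocess.Popen(
            cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
        )
        self._next_id = 0

    def call(self, method: str, params: dict) -> dict:
        self._next_id += 1
        req = {"jsonrpc": "2.0", "id": self._next_id,
               "method": method, "params": params}
        self.proc.stdin.write(json.dumps(req) + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())

# An HTTP route handler would then do something like:
#   bridge = StdioBridge(["python", "-m", "mempalace.mcp_server"])
#   result = bridge.call("tools/call",
#                        {"name": "mempalace_search", "arguments": {...}})
```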

// hooks/useMemPalace.ts
import { useCallback, useRef } from 'react';

// Your backend proxy that wraps the MCP server
const PALACE_API = '/api/mempalace';

export interface MemoryResult {
  closet: string;
  content: string;
  room: string;
  wing: string;
  score: number;
  timestamp: string;
}

export interface Triple {
  subject: string;
  predicate: string;
  object: string;
  valid_from: string;
  valid_to?: string;
}

export interface UseMemPalaceReturn {
  search: (query: string, wing?: string) => Promise<MemoryResult[]>;
  store: (content: string, wing: string, room: string) => Promise<void>;
  kgQuery: (entity: string, asOf?: string) => Promise<Triple[]>;
}

const MAX_CACHE_SIZE = 100;

export function useMemPalace(): UseMemPalaceReturn {
  const cache = useRef<Map<string, MemoryResult[]>>(new Map());

  const search = useCallback(
    async (query: string, wing = 'fx-trading'): Promise<MemoryResult[]> => {
      const key = `${wing}:${query}`;
      if (cache.current.has(key)) return cache.current.get(key)!;

      try {
        const res = await fetch(`${PALACE_API}/search`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ query, wing, top_k: 6 }),
        });
        if (!res.ok) throw new Error(`Palace search failed: ${res.status}`);

        const { results } = await res.json();

        // Evict the oldest entry if the cache exceeds its max size
        if (cache.current.size >= MAX_CACHE_SIZE) {
          const firstKey = cache.current.keys().next().value;
          if (firstKey) cache.current.delete(firstKey);
        }
        cache.current.set(key, results);
        return results;
      } catch (error) {
        console.error('[useMemPalace] search error:', error);
        return [];
      }
    },
    []
  );

  const store = useCallback(
    async (content: string, wing: string, room: string): Promise<void> => {
      try {
        const res = await fetch(`${PALACE_API}/store`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ content, wing, room }),
        });
        if (!res.ok) throw new Error(`Palace store failed: ${res.status}`);

        // Invalidate cached searches for this wing on new store
        [...cache.current.keys()]
          .filter(k => k.startsWith(wing))
          .forEach(k => cache.current.delete(k));
      } catch (error) {
        console.error('[useMemPalace] store error:', error);
      }
    },
    []
  );

  const kgQuery = useCallback(
    async (entity: string, asOf?: string): Promise<Triple[]> => {
      try {
        const res = await fetch(`${PALACE_API}/kg/query`, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({ entity, as_of: asOf }),
        });
        if (!res.ok) throw new Error(`Palace KG query failed: ${res.status}`);

        const { triples } = await res.json();
        return triples;
      } catch (error) {
        console.error('[useMemPalace] kgQuery error:', error);
        return [];
      }
    },
    []
  );

  return { search, store, kgQuery };
}

FX Copilot Panel

// components/FxCopilot/FxCopilotPanel.tsx
import { useState } from 'react';
import { useMemPalace } from '../../hooks/useMemPalace';

interface CopilotPanelProps {
  activePair: string; // e.g. "EUR/USD"
  currentRate: number;
}

export function FxCopilotPanel({ activePair, currentRate }: CopilotPanelProps) {
  const [query, setQuery] = useState('');
  const [answer, setAnswer] = useState<string>('');
  const [loading, setLoading] = useState(false);
  const [error, setError] = useState<string | null>(null);
  const { search, store } = useMemPalace();

  const handleQuery = async (userQuestion: string) => {
    setLoading(true);
    setError(null);

    try {
      // 1. Retrieve relevant memories from the palace
      const memories = await search(`${activePair} ${userQuestion}`, 'fx-trading');

      // 2. Build a memory-enriched system prompt
      const memContext = memories
        .slice(0, 4)
        .map(m => `[${m.wing}/${m.room}] ${m.content}`)
        .join('\n');

      const systemPrompt = `You are a senior FX trading assistant with deep context
about this team's systems and past decisions.

MEMORY PALACE CONTEXT (retrieved for this query):
${memContext}

Current state: ${activePair} @ ${currentRate}
Today: ${new Date().toISOString().split('T')[0]}

Answer concisely, referencing the memory context when relevant.`;

      // 3. Call your LLM backend
      const response = await fetch('/api/ai/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ system: systemPrompt, message: userQuestion }),
      });
      if (!response.ok) throw new Error('AI chat request failed');

      const { reply } = await response.json();

      // 4. Store this exchange in the palace for future sessions
      await store(
        `Q: ${userQuestion} A: ${reply} [${activePair} @ ${currentRate}]`,
        'fx-trading',
        'decisions'
      );

      setAnswer(reply);
    } catch (err) {
      setError(err instanceof Error ? err.message : 'Something went wrong');
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="fx-copilot-panel">
      <input
        value={query}
        onChange={e => setQuery(e.target.value)}
        onKeyDown={e => e.key === 'Enter' && handleQuery(query)}
        placeholder={`Ask about ${activePair}...`}
      />
      {loading && <span>Searching palace...</span>}
      {error && <span className="error">{error}</span>}
      {answer && <p>{answer}</p>}
    </div>
  );
}

Chapter 6: Temporal Knowledge Graph for Portfolio State

Now let's leverage the temporal knowledge graph for compliance-critical use cases. In ETF portfolio management, you need to answer: “What was the approved emerging markets exposure limit on March 15th?”

# scripts/etf_kg_seed.py — Seed portfolio facts (conceptual API)
# Verify import paths against the current mempalace package
from mempalace.knowledge_graph import KnowledgeGraph

kg = KnowledgeGraph()

# ── Team & Responsibility Assignments ──────────────────
kg.add_triple("Sarah Chen", "owns", "EM-exposure-limit", valid_from="2025-01-10")
kg.add_triple("Dev Team", "manages", "etf-dashboard", valid_from="2024-06-01")
kg.add_triple("Alex Rivera", "deployed", "v2.4.1", valid_from="2026-02-15")

# ── Risk Limit History (temporal!) ─────────────────────
# Original limit approved Q1 2025
kg.add_triple("EM-exposure-limit", "approved_at", "12%", valid_from="2025-01-10")

# Board revised downward March 1, 2026 due to volatility
kg.invalidate("EM-exposure-limit", "approved_at", "12%", ended="2026-03-01")
kg.add_triple("EM-exposure-limit", "approved_at", "8%", valid_from="2026-03-01")

# ── Compliance queries ──────────────────────────────────
# Current state
print(kg.query_entity("EM-exposure-limit"))
# → [EM-exposure-limit → approved_at → 8% (current, from 2026-03-01)]

# Historical query — what was true in January 2026?
print(kg.query_entity("EM-exposure-limit", as_of="2026-01-20"))
# → [EM-exposure-limit → approved_at → 12% (valid 2025-01-10 → 2026-03-01)]

# Full timeline (audit trail for compliance)
print(kg.timeline("EM-exposure-limit"))
# → 2025-01-10: approved_at 12%
# → 2026-03-01: approved_at updated to 8% (12% invalidated)

React hook for the compliance dashboard, building on useMemPalace:

// hooks/usePortfolioKG.ts — Temporal portfolio queries
import { useEffect, useState } from 'react';
import { useMemPalace, Triple } from './useMemPalace';

interface UsePortfolioKGOptions {
  entity: string;
  asOf?: string; // ISO date — enables historical queries
}

export function usePortfolioKG({ entity, asOf }: UsePortfolioKGOptions) {
  const { kgQuery } = useMemPalace();
  const [triples, setTriples] = useState<Triple[]>([]);
  const [loading, setLoading] = useState(true);
  const [error, setError] = useState<string | null>(null);

  useEffect(() => {
    setLoading(true);
    setError(null);
    kgQuery(entity, asOf)
      .then(setTriples)
      .catch(err => setError(err.message))
      .finally(() => setLoading(false));
  }, [entity, asOf, kgQuery]);

  /** Latest value for a predicate by valid_from.
   *  (No !valid_to filter — historical as-of queries return triples
   *  whose validity window has since closed.) */
  const getLatest = (predicate: string): string | undefined =>
    triples
      .filter(t => t.predicate === predicate)
      .sort((a, b) => b.valid_from.localeCompare(a.valid_from))[0]?.object;

  /** Full validity history for a predicate, oldest first */
  const getHistory = (predicate: string): Triple[] =>
    triples
      .filter(t => t.predicate === predicate)
      .sort((a, b) => a.valid_from.localeCompare(b.valid_from));

  return { triples, loading, error, getLatest, getHistory };
}

// ─── Usage in your ETF Compliance Widget ───────────────
function EMExposureWidget({ auditDate }: { auditDate?: string }) {
  const { getLatest, getHistory, loading, error } = usePortfolioKG({
    entity: 'EM-exposure-limit',
    asOf: auditDate,
  });

  if (loading) return <div>Loading palace...</div>;
  if (error) return <div className="error">Palace query failed: {error}</div>;

  const currentLimit = getLatest('approved_at');
  const history = getHistory('approved_at');

  return (
    <div className="compliance-widget">
      <h3>EM Exposure Limit {auditDate ? `(as of ${auditDate})` : '(current)'}</h3>
      <p className="limit-value">{currentLimit}</p>
      <ul className="history">
        {history.map((t, i) => (
          <li key={i}>
            {t.valid_from}: {t.object}
            {t.valid_to ? ` → ${t.valid_to}` : ' (active)'}
          </li>
        ))}
      </ul>
    </div>
  );
}

Chapter 7: Multi-Agent Pipeline with Shared Memory

This is the real power unlock for FinOps teams: MemPalace as the shared memory bus for a multi-agent expense analysis pipeline.

# agents/finops_pipeline.py — 3-agent FinOps with shared memory (conceptual)
# Note: Verify MemPalaceClient import path against current package
from mempalace.client import MemPalaceClient
from anthropic import Anthropic
import asyncio
import json
from datetime import date

palace = MemPalaceClient(wing="finops")
ai = Anthropic()


def today_iso() -> str:
    return date.today().isoformat()


def parse_json_response(text: str) -> list[dict]:
    """Safely parse JSON from an LLM response, tolerating code fences."""
    try:
        cleaned = text.strip()
        if cleaned.startswith("```"):
            cleaned = cleaned.split("\n", 1)[1].rsplit("```", 1)[0]
        return json.loads(cleaned)
    except json.JSONDecodeError:
        return []


# ── Agent 1: Anomaly Detector ──────────────────────────
async def anomaly_detector(expenses: list[dict]) -> list[dict]:
    """Detects spend anomalies; writes findings to the palace risk room."""
    context = await palace.search(
        "expense anomaly patterns vendor overspend", room="risk"
    )

    msg = ai.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system=f"""You detect expense anomalies for a FinOps team.

PALACE CONTEXT (historical patterns):
{context}

Flag items exceeding 2σ from the 90-day baseline. Return a JSON array.""",
        messages=[{"role": "user", "content": str(expenses)}],
    )

    anomalies = parse_json_response(msg.content[0].text)
    await palace.store(
        content=f"Anomalies detected {today_iso()}: {json.dumps(anomalies)}",
        room="risk",
    )
    return anomalies


# ── Agent 2: Root Cause Analyst ───────────────────────
async def root_cause_analyst(anomalies: list[dict]) -> list[dict]:
    """Investigates root causes; reads Agent 1's output from the palace."""
    context = await palace.search(
        "vendor contract renewal budget allocation decision", room="decisions"
    )

    msg = ai.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system=f"""You are a FinOps root cause analyst.

PALACE CONTEXT (past vendor decisions):
{context}

For each anomaly: identify the likely cause and reference any historical
decisions that explain it. Return JSON with cause + evidence.""",
        messages=[{"role": "user", "content": str(anomalies)}],
    )

    analyses = parse_json_response(msg.content[0].text)
    await palace.store(
        content=f"Root cause analysis {today_iso()}: {json.dumps(analyses)}",
        room="decisions",
    )
    return analyses


# ── Agent 3: Report Generator ─────────────────────────
async def report_generator(analyses: list[dict]) -> str:
    """Generates a CFO-ready report using full palace context."""
    team_ctx = await palace.search("CFO report format preferences", room="facts")
    anomaly_ctx = await palace.search("anomaly findings today", room="risk")

    msg = ai.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=2000,
        system=f"""Generate a CFO-ready FinOps expense report.

TEAM PREFERENCES: {team_ctx}
ANOMALY CONTEXT: {anomaly_ctx}

Include an executive summary, anomaly table, root causes,
and recommended actions. Markdown format.""",
        messages=[{"role": "user", "content": str(analyses)}],
    )
    return msg.content[0].text


# ── Orchestrator ──────────────────────────────────────
async def run_finops_pipeline(monthly_expenses: list[dict]) -> str:
    anomalies = await anomaly_detector(monthly_expenses)
    analyses = await root_cause_analyst(anomalies)
    return await report_generator(analyses)

# Invoke with: asyncio.run(run_finops_pipeline(expenses))
# Every agent enriches the palace — next month's run is smarter.

Final Verdict: Ship It or Skip It?

✅ Ship It If…

- You operate in a regulated environment where data must stay local — zero egress is the headline feature.
- You need temporal "what was true on date X" queries for audit trails.
- You're adding persistent memory to internal AI tooling (Claude Code, Cursor, dashboard copilots) and can tolerate a fast-moving API.

⚠️ Skip It (For Now) If…

- You need production maturity today — Mem0 and Zep are ahead on hardening and support.
- You rely on multi-hop graph traversal or input sanitization, neither of which exists yet (Issue #27).
- You can't absorb breaking changes from a repo that is one week old.

The benchmark controversy is real, but it has been handled well — the authors corrected their README within 48 hours and published detailed, honest benchmark documentation. The 96.6% recall in raw mode is independently reproducible, even though it's ChromaDB doing the heavy lifting rather than the palace architecture itself.

As a FinTech engineer who has spent years battling the stateless AI amnesia problem in enterprise trading platforms: the local-first, compliance-friendly, zero-cost angle alone makes MemPalace worth evaluating for your AI tooling stack. The temporal knowledge graph is a genuinely useful feature for regulated financial software, even in its current early state.

Yes, it’s early. Yes, the README was overhyped on launch. Yes, some documented features don’t exist in the codebase yet (Issue #27). But the core architecture is sound, the benchmark methodology has been transparently corrected, and the active community (23,000+ stars, hundreds of PRs) suggests this project has legs.

The question isn’t whether MemPalace is production-ready today — it’s not. The question is whether the architectural approach (verbatim storage + spatial hierarchy + temporal KG) is worth investing in for your FinTech AI stack. My answer: yes, with eyes open.

Resources & Further Reading

