TOKI: A Bitemporal Operator Algebra for Contradiction Resolution in LLM-Agent Persistent Memory

📅 2026-06-04

🤖 AI Summary
This work addresses the challenge of keeping persistent memory consistent in large language model (LLM) agents, where every belief update is a versioned write that may contradict a stored claim and the existing resolution heuristics do not declare the isolation level they assume. The authors treat contradiction resolution as write-time concurrency control and introduce TOKI, a bitemporal operator algebra that types four production heuristics (last-writer-wins, evidence-weighted merge, await-confirmation, per-rule policy) as one operator family over a dual-row schema, each operator carrying an isolation precondition and a provenance annotation that preserves the losing fact in an audit row. Four soundness theorems close the contract across isolation, schema, and provenance, lift the guarantees to operator pipelines, and extend the fold operators to n-ary conflict sets; a companion tightness result shows that, within the relational schedule model, keyed logging of the adjudicating judge is necessary for replay consistency. In a verdict matrix over eight systems, TOKI is the only one that excludes all three write-time anomalies (replay inconsistency, belief-drift skew, audit erasure) while keeping a language-model judge on the write path. On a single natural-workload slice, the audit-row defence moves LoCoMo by 0.86, and ablating the typed memory layer removes 0.49 accuracy on 1,444 answerable LoCoMo questions; the cross-system comparison is underpowered and claims no superiority.
📝 Abstract
Persistent memory for an LLM agent is a write-heavy substrate: every belief update is a versioned write, and a new claim may contradict a stored one. Production systems use four resolution heuristics (last-writer-wins, evidence-weighted merge, await-confirmation, per-rule policy), yet none declares the isolation level it assumes or the write-time anomalies it admits. We show that contradiction resolution is write-time concurrency control and make the missing contract explicit. TOKI types the four heuristics as one family of bitemporal operators over a dual-row schema, each with an isolation precondition and a provenance annotation that preserves the losing fact in an audit row. Four soundness theorems close the contract across isolation, schema, and provenance, lift the guarantees to operator pipelines, and extend the fold operators to n-ary conflict sets. A tightness companion proves that, within the relational schedule model, keyed logging of the adjudicating judge is necessary for replay consistency, which every audited baseline omits. A verdict matrix over eight systems localizes the gap: every baseline that keeps a language-model judge on the write path admits at least one of three write-time anomalies (replay inconsistency, belief-drift skew, audit erasure); a content-addressed engine-layer comparator avoids them only by removing the judge, and TOKI alone excludes all three while keeping it. On its one natural-workload slice the audit-row defence moves LoCoMo by 0.86, and ablating the typed memory layer removes 0.49 accuracy on 1,444 answerable LoCoMo questions; the cross-system comparison stays underpowered and claims no superiority. The contribution is the contract: a write-time correctness specification, proved sound across isolation, schema, and provenance, pinning the guarantee every production heuristic assumes but no deployed system makes explicit.
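To make the abstract's vocabulary concrete, here is a minimal Python sketch of three ideas it names: a bitemporal fact row, an audit row that preserves the losing fact, and a keyed log of judge verdicts that makes replay deterministic. It is an illustration under stated assumptions, not TOKI's schema or operators. The names `FactRow`, `AuditRow`, `Memory`, `write`, and `replay` are hypothetical, the judge is a stand-in callable rather than a language model, and the single `write` operator is a simplification that does not correspond to any of the four heuristics.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# All names below are hypothetical; the paper's actual schema is not shown in the abstract.

@dataclass(frozen=True)
class FactRow:
    key: str
    value: str
    valid_from: int  # valid time: when the fact is claimed to hold
    tx_time: int     # transaction time: when the row was written

@dataclass(frozen=True)
class AuditRow:
    loser: FactRow   # the fact that lost adjudication, kept instead of erased
    winner: FactRow
    verdict_key: Tuple[str, str, str]

VerdictKey = Tuple[str, str, str]
Judge = Callable[[str, str], bool]  # True means the new claim replaces the stored one

class Memory:
    def __init__(self, judge: Judge, judge_log: Optional[Dict[VerdictKey, bool]] = None):
        self.judge = judge
        self.judge_log: Dict[VerdictKey, bool] = dict(judge_log or {})
        self.current: Dict[str, FactRow] = {}
        self.audit: List[AuditRow] = []
        self.clock = 0  # logical transaction clock, so replays assign the same tx_time

    def _adjudicate(self, old: FactRow, new: FactRow) -> Tuple[VerdictKey, bool]:
        # Keyed logging: consult the judge at most once per (key, old, new) triple.
        k = (old.key, old.value, new.value)
        if k not in self.judge_log:
            self.judge_log[k] = self.judge(old.value, new.value)
        return k, self.judge_log[k]

    def write(self, key: str, value: str, valid_from: int) -> FactRow:
        self.clock += 1
        new = FactRow(key, value, valid_from, self.clock)
        old = self.current.get(key)
        if old is None or old.value == value:
            self.current[key] = new  # no contradiction to resolve
            return new
        k, new_wins = self._adjudicate(old, new)
        winner, loser = (new, old) if new_wins else (old, new)
        self.audit.append(AuditRow(loser=loser, winner=winner, verdict_key=k))
        self.current[key] = winner
        return winner

def replay(writes: List[Tuple[str, str, int]], judge_log: Dict[VerdictKey, bool]) -> Memory:
    """Re-run a write history using only logged verdicts; the judge is never called."""
    def forbidden(old: str, new: str) -> bool:
        raise RuntimeError("verdict missing from the keyed log")
    mem = Memory(forbidden, judge_log)
    for key, value, valid_from in writes:
        mem.write(key, value, valid_from)
    return mem

if __name__ == "__main__":
    history = [("city", "Paris", 1), ("city", "Berlin", 2)]
    live = Memory(judge=lambda old, new: True)  # stand-in for a model call
    for w in history:
        live.write(*w)
    again = replay(history, live.judge_log)
    print(again.current == live.current, again.audit == live.audit)
```

In this sketch, replay reproduces the live state only because each verdict was stored under its key. Without the log, the replay would have to call a possibly nondeterministic judge again. This is an informal reading of the replay-consistency point in the abstract and does not restate the paper's theorem.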
Problem

Research questions and friction points this paper is trying to address.

contradiction resolution
persistent memory
write-time anomalies
isolation level
LLM agent
Innovation

Methods, ideas, or system contributions that make the work stand out.

bitemporal operator algebra
contradiction resolution
write-time concurrency control
provenance annotation
isolation level