7. Memory Consolidation Architecture
Status: Accepted Date: 2026-02-16Context
Phase 12 adds LLM-driven memory consolidation to crocbot. When the agent stores memories, duplicates and near-duplicates accumulate over time — the same fact restated across multiple conversations, partial fragments that could be merged, solutions recorded separately from the problems they solve. Without consolidation, vector search quality degrades as the memory store grows. Agent Zero’s Python implementation (797 lines, FAISS+chromadb) provides the reference design: after each memory save, query for similar existing memories, ask the LLM whether to merge/replace/keep/update/skip, and execute the result. The key question is how to adapt this pattern to crocbot’s TypeScript architecture with SQLite+sqlite-vec, the Phase 11 utility model router, and the Phase 9 secrets masking pipeline.Decision
1. Consolidation-on-save (not batch-only)
Consolidation triggers after each memory indexing operation, following Agent Zero’s pattern. Each new memory is compared against existing similar memories immediately after insertion. This prevents memory bloat from accumulating between periodic batch runs and matches the existing memory-flush pattern inagent-runner-memory.ts, where runMemoryFlushIfNeeded() already fires at pre-compaction time. Batch consolidation may be added later as a secondary mechanism but is not the primary path.
2. Five-action model
Adopt Agent Zero’s 5-action classification directly:- MERGE: Combine two memories into a single richer entry. Both originals are soft-deleted; a new merged entry is inserted. Maps to INSERT + UPDATE (set deleted flag) in SQLite.
- REPLACE: New memory supersedes old. Old entry is soft-deleted. Maps to UPDATE on the old row.
- KEEP_SEPARATE: Memories are related but distinct. No mutation. Both entries remain.
- UPDATE: Enrich existing memory with new information without creating a new entry. Maps to UPDATE on the existing row (content + re-embed).
- SKIP: No action. Universal fallback for any LLM response that fails to parse, times out, or returns an unrecognized action.
3. Dual-threshold similarity strategy
Two similarity thresholds gate which actions are permitted:- 0.7 (discovery threshold): Memories with cosine similarity >= 0.7 are candidates for consolidation. Below this, no LLM call is made.
- 0.9 (destructive threshold): REPLACE and MERGE (destructive actions) require cosine similarity >= 0.9. The LLM may recommend MERGE at 0.75 similarity, but the engine downgrades to KEEP_SEPARATE.
4. Utility model routing
All consolidation LLM calls route through the Phase 11 utility model role viacreateModelRouter(). A new TaskType consolidation in task-classifier.ts maps to the utility role. Consolidation is a mechanical classification task — it does not require reasoning-tier models.
Rate-limit-aware behavior: if the utility model budget is exhausted (per ADR 0005 rate limiting), consolidation is silently skipped for that memory save. The memory is stored unconsolidated and will be picked up by the next successful consolidation cycle. This ensures consolidation never blocks the primary memory-save path.
5. Schema extension strategy
Extend the existingchunks table with 3 new columns via ALTER TABLE ADD COLUMN (non-destructive in SQLite — no table rebuild, no data migration):
area(TEXT DEFAULT ‘main’): Categorization area (see Decision 8)importance(REAL DEFAULT 0.5): LLM-assigned importance score, 0.0-1.0consolidated_from(TEXT, nullable): JSON array of chunk IDs that were merged into this entry
consolidation_log for audit trail: stores every consolidation decision (action taken, source/target chunk IDs, similarity score, LLM reasoning, timestamp). Uses the existing ensureColumn() helper in memory-schema.ts for safe idempotent migration.
6. Secrets masking integration
Consolidation prompts include memory content that may contain credentials, API keys, or tokens. All consolidation prompts and LLM responses pass through the Phase 9 secrets masking pipeline (ADR 0004). Masking applies before sending content to the utility model. Unmasking applies only if the consolidated result is written back to the memory store, preserving the original secret values in stored memories.7. Auto-memorize hook placement
NewrunAutoMemorizeIfNeeded() function in agent-runner-memory.ts, following the exact pattern of the existing runMemoryFlushIfNeeded(). Both functions fire at pre-compaction time in the agent runner loop. The auto-memorize hook runs after memory flush completes (flush produces the memories that consolidation then processes). The function is async, has a 60-second timeout, and catches all errors to prevent consolidation failures from disrupting the agent loop.
8. Four-area categorization
Adopt Agent Zero’s 4 memory areas as metadata on thechunks table area column:
- main: General knowledge, facts, preferences, context
- fragments: Partial information, incomplete thoughts, raw observations
- solutions: Verified working solutions to specific problems
- instruments: Tools, commands, workflows, procedures
area column, allowing queries like “search solutions only” without separate vector indices. The LLM assigns area during consolidation; new memories default to main.
Consequences
Enables
- Automatic deduplication on every memory save — no manual cleanup needed
- Structured categorization of memories into searchable areas
- Solution recall: proven fixes are tagged and retrievable by area
- Full audit trail of every consolidation decision via
consolidation_log
Prevents
- Memory bloat from duplicate and near-duplicate entries accumulating over time
- Expensive reasoning model usage on mechanical consolidation tasks
Trade-offs
- Latency on save: 2-3 additional utility model LLM calls per memory item (mitigated by async execution, 60s timeout, silent skip if utility model unavailable)
- LLM may incorrectly merge distinct memories (mitigated by 0.9 threshold for destructive operations, SKIP as universal fallback, consolidation_log for post-hoc audit and reversal)
- Schema migration adds nullable columns to existing table (safe in SQLite but irreversible without table rebuild)
- Consolidation quality depends on utility model capability — lower-tier models may produce worse merge decisions (mitigated by conservative thresholds and SKIP fallback)
Dependencies
- Phase 11 (ADR 0006): Utility model routing via
createModelRouter() - Phase 10 (ADR 0005): Per-provider rate limiting for utility model budget checks
- Phase 9 (ADR 0004): Secrets masking pipeline for memory content in LLM prompts
