Architecture
System Overview
Crocbot is a personal AI assistant with a Gateway control plane and Telegram integration. It follows a lean, single-user deployment model optimized for VPS/Docker hosting.Dependency Graph
Components
Gateway
- Purpose: Central control plane for sessions, Telegram, tools, and events
- Tech: Node.js, WebSocket, Express
- Location:
src/gateway/ - Stability: Session reset aborts active runs, config merge by ID (preserves array ordering), session key normalization (case-insensitive), WebSocket max payload 5 MB, bounded agent run sequence map, expired hook auth state pruning
CLI
- Purpose: Command-line interface for gateway management and agent invocation
- Tech: Node.js, commander
- Location:
src/cli/,src/commands/
Telegram Channel
- Purpose: Full bot integration via grammY with groups, DMs, media, and inline model selection
- Tech: grammY, @grammyjs/runner
- Location:
src/telegram/,src/channels/ - Key modules:
bot-handlers.ts(callbacks),model-buttons.ts(inline keyboards),network-errors.ts(Grammy timeout recovery),monitor.ts(scoped rejection handler)
Agent Runtime
- Purpose: Pi embedded runtime with tool streaming and block streaming
- Tech: TypeScript, RPC mode
- Location:
src/agents/ - Key modules:
session-transcript-repair.ts(JSONL repair, tool call sanitization),session-file-repair.ts(crash-resilient file recovery) - Stability: Compaction deadlock prevention via
withTimeout(30s default), token accounting fix after compaction (totalTokensupdate), exec override preservation across compaction, tool call ID sanitization for transcript integrity
Media Pipeline
- Purpose: Image/audio/video processing, transcription, size caps
- Tech: Node.js streams, temp file lifecycle
- Location:
src/media/
Security Layer
- Purpose: SSRF protection, path traversal validation, exec allowlisting, input validation, auth hardening
- Tech: DNS pinning, IP range blocking, AbortSignal timeouts, timing-safe comparison
- Location:
src/infra/net/(ssrf.ts, fetch-guard.ts),src/infra/exec-approvals.ts,src/security/,src/gateway/ - Key modules:
ssrf.ts(private IP/hostname blocking, IPv6-mapped bypass prevention, redirect validation),fetch-guard.ts(guarded fetch wrapper),exec-approvals.ts(shell expansion blocking, heredoc handling, allowlist enforcement),security-headers.ts(CSP, X-Frame-Options, nosniff, path traversal filtering),auth-rate-limit.ts(sliding-window per-IP auth rate limiting with lockout),secret-equal.ts(timing-safe token comparison via crypto.timingSafeEqual),path-output.ts(output path containment),http-body.ts(bounded HTTP body reading),base64.ts(oversized base64 rejection) - Security domains: Network/SSRF, filesystem containment, input sanitization, authentication, execution hardening, data leak prevention, ACP tool safety
Secrets Masking (src/infra/secrets/)
- Purpose: Prevent credential leakage across all output boundaries
- Tech: Custom Aho-Corasick masker, SecretsRegistry singleton, value-based + pattern-based defense-in-depth
- Key modules:
registry.ts(singleton, auto-discovery from env/config),masker.ts(Aho-Corasick for 10+ patterns, sequential fallback),stream-masker.ts(cross-chunk boundary detection),logging-transport.ts(tslog masking transport),llm-masking.ts(context wrapper),tool-result-masking.ts(agent tool output),error-masking.ts(error messages) - Boundaries: (1) Logging, (2) Config snapshots, (3) LLM context, (4) Streaming output, (5) Tool results, (6) Telegram send, (7) Error formatting
Runtime Infrastructure (src/infra/)
- Purpose: Cross-cutting runtime utilities for concurrency, timeouts, and memory safety
- Key modules:
async-mutex.ts(Promise-chain mutex replacing proper-lockfile for session locking),with-timeout.ts(genericwithTimeout<T>wrapper with configurable deadline and cleanup) - Memory bounding: Diagnostic session state capped, directory cache bounded with LRU eviction, shell output buffers truncated at configurable limit, abort controller maps bounded, agent run sequence tracking bounded
- Heartbeat hardening: Wake handler race prevention,
runOnceerror recovery (scheduler survives thrown errors), heartbeat exempt from empty-event skip
Logging & Observability
- Purpose: Structured logging, metrics, error alerting
- Tech: tslog (with secrets masking transport), OpenTelemetry-compatible metrics
- Location:
src/logging/,src/metrics/,src/alerting/
Plugin System
- Purpose: Extensible plugin runtime with SDK
- Tech: TypeScript plugin loader
- Location:
src/plugins/,src/plugin-sdk/
Cron Scheduler
- Purpose: Scheduled jobs and wakeups
- Tech: Node.js timers, JSONL persistence
- Location:
src/cron/
Memory
- Purpose: Conversation memory, context management, and AI-powered consolidation
- Tech: SQLite + sqlite-vec (vector similarity), file-based storage, utility model for consolidation/extraction
- Location:
src/memory/ - Key modules:
consolidation.ts(5-action consolidation engine),auto-memorize.ts(post-conversation extraction pipeline),consolidation-schema.ts(schema migration),consolidation-actions.ts(types, DI interfaces) - Consolidation engine: Processes new memory chunks through a pipeline: vector similarity search -> candidate validation -> LLM analysis (utility model) -> atomic DB action (MERGE, REPLACE, KEEP_SEPARATE, UPDATE, SKIP). Safety gates enforce minimum similarity (0.9) for destructive REPLACE. All decisions logged to
consolidation_logaudit table with reasoning, source IDs, and timestamps. - 4-area schema: Memories categorized into
main(general),fragments(facts/preferences),solutions(problem/solution pairs),instruments(tools/techniques). Area metadata stored on each chunk; recall queries filter by area. - Auto-memorize hooks: Fire-and-forget extraction at session end. Three extraction types (solutions, fragments, instruments) run independently via
Promise.allSettled. Budget-aware: each type checks rate limiter before LLM call, skips gracefully when exhausted. Extracted items stored with area metadata, triggering consolidation for dedup. - Composition: AutoMemorize (transcript extraction) -> storeExtractedChunk (categorized storage) -> ConsolidationEngine (dedup pipeline). All LLM calls use
taskType: "consolidation"to route through the utility model role.
Rate Limiting (src/infra/)
- Purpose: Per-provider rate limiting, API key rotation, and transient error retry
- Tech: Sliding window log algorithm (RPM/TPM), health-aware round-robin key pool, exponential backoff with jitter
- Key modules:
provider-rate-limiter.ts(sliding window RPM/TPM enforcement),key-pool.ts(health-aware round-robin with rate limiter integration),llm-retry.ts(transient error classification and Retry-After parsing),rate-limit-middleware.ts(pre-flight/post-flight wrapper for LLM call sites) - Composition: Four-layer pipeline — ProviderRateLimiter (sliding window) -> KeyPool (key selection) -> retryAsync + createLlmRetryOptions (transient retry) -> withRateLimitCheck (call-site middleware). Zero overhead when no limits configured (pass-through mode).
Model Roles (src/agents/)
- Purpose: Route LLM calls to specialized models by task type for cost optimization
- Tech: Pattern-based task classification, 2-role architecture (reasoning + utility)
- Key modules:
model-router.ts(ModelRouter interface, createModelRouter factory),task-classifier.ts(fixed task-type-to-role mapping),model-roles.ts(config parsing, role resolution, fallback logic) - Composition: TaskClassifier (call-site classification) -> ModelRouter (role resolution) -> resolveModel (provider/model selection). Utility tasks (compaction, memory-flush, heartbeat, llm-task) route to cheap model; reasoning tasks use primary model. Missing config gracefully degrades to primary model.
MCP Client
- Purpose: In-process client connecting to external MCP tool servers
- Tech: @modelcontextprotocol/sdk, stdio/SSE/HTTP transports
- Location:
src/mcp/(client.ts, client-transport.ts, transport-*.ts) - Key modules:
client.ts(lifecycle manager with reconnect),tool-bridge.ts(MCP-to-agent tool conversion),transport-ssrf.ts(SSRF-guarded fetch for remote transports)
MCP Server
- Purpose: Exposes crocbot as MCP infrastructure for external AI systems
- Tech: @modelcontextprotocol/sdk, SSE + streamable HTTP transports
- Location:
src/mcp/(server.ts, server-auth.ts, server-tools.ts, server-mount.ts) - Key modules:
server-auth.ts(Bearer token with timing-safe comparison),server-tools.ts(send_message, finish_chat, query_memory, list_capabilities),server-mount.ts(HTTP route mounting)
Reasoning Model Support (src/agents/, src/shared/text/)
- Purpose: Native reasoning stream parsing for o1/o3, DeepSeek-R1, and Claude extended thinking
- Tech: Provider-specific stream adapters, tag-based fallback, SQLite trace storage
- Key modules:
reasoning-stream-adapter.ts(per-providerreasoning_deltadetection),chat-generation-result.ts(chunk accumulator separating reasoning from response),reasoning-trace-storage.ts(queryable trace table by session/turn/model),reasoning-tags.ts(XML tag parsing with strict/preserve modes) - Capabilities: Native
reasoning_deltastreaming for providers that support it, tag-based fallback (<think>,<thinking>,<thought>) for others.ChatGenerationResultaccumulator handles cross-chunk boundaries and provides delta computation. Reasoning token budget tracking per session. CLI display modes (on/stream/off), Telegram blocking of reasoning content, WebSocket broadcast of thinking events.
Project Workspaces (src/agents/, src/config/)
- Purpose: Isolated memory, prompts, knowledge base, and sessions per project
- Tech: Project-scoped directory layout under state dir, per-project sqlite-vec indexes
- Key modules:
project-scope.ts(10 exported functions:resolveProjectDir,listProjects,getProjectConfig,setProjectConfig,getActiveProject,setActiveProject,isDefaultProject,resolveProjectMemoryDir,resolveProjectSessionsDir,stripProjectFromSessionKey),types.projects.ts(ProjectConfig, ProjectsConfig types) - Storage layout:
{stateDir}/agents/{agentId}/projects/{projectName}/with subdirectories formemory/,sessions/, and config files. Default project uses agent-level paths (no “default” subdirectory). - Switching: CLI (
crocbot --project <name>,/projectsubcommands), Telegram (/project <name>), RPC (projects.list,projects.current,projects.switch,projects.create,projects.delete).
Knowledge Import Pipeline (src/knowledge/)
- Purpose: Ingest external documents and URLs into project-scoped vector knowledge base
- Tech: Parser registry (strategy pattern), heading-aware chunking, content-hash dedup, sqlite-vec storage
- Key modules:
pipeline.ts(6-stage orchestrator: fetch/parse/chunk/embed/dedup/store),parsers/registry.ts(priority-ordered parser dispatch),chunker.ts(heading-aware splitting with overlap),dedup.ts(hash-first then similarity dedup),storage.ts(sqlite-vec knowledge_chunks/knowledge_vectors schema),state.ts(incremental re-import state machine),incremental.ts(new/unchanged/changed classification) - Parsers:
text-parser.ts(universal fallback),markdown-parser.ts(frontmatter extraction),pdf-parser.ts(pdfjs-dist, lazy-loaded),url-parser.ts(cheerio + node-html-markdown, SSRF-guarded fetch) - CLI:
crocbot knowledge import <source>(URL/file/text,--project,--category,--dry-run,--force,--batch),crocbot knowledge list,crocbot knowledge remove <source> - Composition: ParserRegistry (format detection) -> chunker (heading-aware splitting) -> embedChunksInBatches -> deduplicateChunks (hash + similarity) -> KnowledgeStorage (sqlite-vec). Incremental re-import via content-hash state machine skips unchanged sources.
Tech Stack Rationale
| Technology | Purpose | Why Chosen |
|---|---|---|
| Node.js 22+ | Runtime | Modern ESM support, stable LTS |
| TypeScript (ES2023) | Language | Type safety, strict mode, NodeNext modules |
| tsdown (rolldown) | Bundler | ~5s builds, replaces tsc emit |
| pnpm | Package manager | Fast, disk-efficient |
| grammY | Telegram SDK | Modern, well-maintained, middleware support |
| Vitest | Testing | Fast, ESM-native, good DX |
| @modelcontextprotocol/sdk | MCP Client + Server | Official TypeScript SDK for Model Context Protocol |
| oxlint + oxfmt | Lint + Format | Fast Rust-based toolchain, 134 type-aware rules |
Data Flow
- User sends message via Telegram (or external AI calls MCP server endpoint)
- grammY bot receives message, routes to Gateway
- Gateway creates/resumes session, resolves active project (default or project-scoped)
- Agent builds context: system prompt + project-scoped memory recall + conversation history
- SecretsRegistry masks any credentials in LLM context before provider call
- Model router classifies the task type and resolves the appropriate model (reasoning or utility)
- Rate limiter checks RPM/TPM capacity; KeyPool selects best API key via round-robin
- Agent processes message with LLM provider (transient errors retried with backoff), invoking MCP client tools as needed
- Reasoning adapter separates
reasoning_deltafromtext_deltastreams; traces stored to reasoning_traces table - StreamMasker masks secrets in streaming response chunks (cross-boundary detection)
- Tool results masked before persistence and display
- Response streamed back through Gateway to Telegram (masked) and persisted to project-scoped session transcript (masked)
Key Architectural Decisions
See Architecture Decision Records for detailed history:- ADR-0001: Telegram-only Architecture
- ADR-0002: Multi-stage Docker Build
- ADR-0003: Native MCP Integration
- ADR-0004: Secrets Masking Pipeline
- ADR-0005: Per-Provider Rate Limiting
- ADR-0006: 4-Model-Role Architecture
- ADR-0007: Memory Consolidation Architecture
- ADR-0008: Reasoning Model Support
- ADR-0009: Project-Scoped Workspaces
- ADR-0010: Knowledge Import Pipeline
