Architecture
System Overview
Crocbot is a personal AI assistant with a Gateway control plane and Telegram integration. It follows a lean, single-user deployment model optimized for VPS/Docker hosting.
Components
Gateway
- Purpose: Central control plane for sessions, Telegram, tools, and events
- Tech: Node.js, WebSocket, Express
- Location: src/gateway/
CLI
- Purpose: Command-line interface for gateway management and agent invocation
- Tech: Node.js, commander
- Location: src/cli/, src/commands/
Telegram Channel
- Purpose: Full bot integration via grammY with groups, DMs, media, and inline model selection
- Tech: grammY, @grammyjs/runner
- Location: src/telegram/, src/channels/
- Key modules: bot-handlers.ts (callbacks), model-buttons.ts (inline keyboards), network-errors.ts (grammY timeout recovery), monitor.ts (scoped rejection handler)
Agent Runtime
- Purpose: Pi embedded runtime with tool streaming and block streaming
- Tech: TypeScript, RPC mode
- Location: src/agents/
- Key modules: session-transcript-repair.ts (JSONL repair, tool call sanitization), session-file-repair.ts (crash-resilient file recovery)
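As a rough illustration of what JSONL repair can involve, the sketch below keeps every line that parses as JSON and counts the lines it had to drop, such as a record truncated by a crash mid-write. The function name `repairJsonl` and the `RepairResult` shape are hypothetical and are not taken from session-transcript-repair.ts.

```typescript
// Hypothetical helper, not the actual session-transcript-repair.ts API.
export interface RepairResult {
  entries: unknown[];   // records that parsed cleanly
  droppedLines: number; // malformed lines that were skipped
}

export function repairJsonl(raw: string): RepairResult {
  const entries: unknown[] = [];
  let droppedLines = 0;

  for (const line of raw.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // blank lines are not records

    try {
      entries.push(JSON.parse(trimmed));
    } catch {
      // A line that is not valid JSON (for example, a partial write) is dropped.
      droppedLines += 1;
    }
  }

  return { entries, droppedLines };
}
```

A real repair pass would likely also validate record shape (for example, pairing tool calls with their results), which this sketch does not attempt.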
Media Pipeline
- Purpose: Image/audio/video processing, transcription, size caps
- Tech: Node.js streams, temp file lifecycle
- Location: src/media/
Security Layer
- Purpose: SSRF protection, path traversal validation, exec allowlisting
- Tech: DNS pinning, IP range blocking, AbortSignal timeouts
- Location: src/infra/net/ (ssrf.ts, fetch-guard.ts), src/infra/exec-approvals.ts
- Key modules: ssrf.ts (private IP/hostname blocking, redirect validation), fetch-guard.ts (guarded fetch wrapper), exec-approvals.ts (shell token blocking, allowlist enforcement)
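The private IP and hostname blocking described above can be sketched as a literal check against well-known IPv4 ranges. This is a minimal illustration under stated assumptions: the names `isPrivateIPv4` and `isBlockedHost` and the exact range list are hypothetical, IPv6 is not covered, and a complete guard would also check the resolved address (DNS pinning) and every redirect hop.

```typescript
// Hypothetical helpers; the real ssrf.ts may be structured differently.
// Each entry is [network base address, prefix length in bits].
const PRIVATE_V4_RANGES: ReadonlyArray<readonly [string, number]> = [
  ["0.0.0.0", 8],      // "this network"
  ["10.0.0.0", 8],     // RFC 1918 private
  ["127.0.0.0", 8],    // loopback
  ["169.254.0.0", 16], // link-local (includes cloud metadata addresses)
  ["172.16.0.0", 12],  // RFC 1918 private
  ["192.168.0.0", 16], // RFC 1918 private
];

// Parses dotted-quad IPv4 into an unsigned integer, or null if malformed.
function parseIPv4(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

export function isPrivateIPv4(ip: string): boolean {
  const addr = parseIPv4(ip);
  if (addr === null) return false;
  return PRIVATE_V4_RANGES.some(([base, prefixBits]) => {
    const baseAddr = parseIPv4(base);
    if (baseAddr === null) return false;
    // Two addresses share a network if they fall in the same block of 2^(32 - prefix) addresses.
    const blockSize = 2 ** (32 - prefixBits);
    return Math.floor(addr / blockSize) === Math.floor(baseAddr / blockSize);
  });
}

export function isBlockedHost(host: string): boolean {
  const normalized = host.toLowerCase().replace(/\.$/, "");
  if (normalized === "localhost" || normalized.endsWith(".localhost")) return true;
  return isPrivateIPv4(normalized);
}
```

Checking only the literal hostname is not sufficient on its own, because a public name can resolve to a private address; that is the gap DNS pinning is meant to close.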
Secrets Masking (src/infra/secrets/)
- Purpose: Prevent credential leakage across all output boundaries
- Tech: Custom Aho-Corasick masker, SecretsRegistry singleton, value-based + pattern-based defense-in-depth
- Key modules: registry.ts (singleton, auto-discovery from env/config), masker.ts (Aho-Corasick for 10+ patterns, sequential fallback), stream-masker.ts (cross-chunk boundary detection), logging-transport.ts (tslog masking transport), llm-masking.ts (context wrapper), tool-result-masking.ts (agent tool output), error-masking.ts (error messages)
- Boundaries: (1) Logging, (2) Config snapshots, (3) LLM context, (4) Streaming output, (5) Tool results, (6) Telegram send, (7) Error formatting
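Cross-chunk boundary detection is the least obvious part of the list above, so here is a minimal sketch of one way to do it: hold back the last (longest secret length minus 1) characters of each chunk so that a secret split across two chunks is still seen whole. The class name `StreamMaskerSketch` and the `[REDACTED]` placeholder are assumptions, and the sketch uses plain sequential string replacement rather than Aho-Corasick.

```typescript
// Hypothetical sketch; not the actual stream-masker.ts implementation.
export class StreamMaskerSketch {
  private buffer = "";
  private readonly secrets: string[];
  private readonly holdback: number;
  private readonly mask: string;

  constructor(secrets: string[], mask: string = "[REDACTED]") {
    // Longest first, so a secret that contains a shorter one is masked as a whole.
    this.secrets = secrets.filter((s) => s.length > 0).sort((a, b) => b.length - a.length);
    const longest = this.secrets.reduce((max, s) => Math.max(max, s.length), 0);
    // A partial secret at the end of a chunk is at most (longest - 1) characters.
    this.holdback = Math.max(0, longest - 1);
    this.mask = mask;
  }

  private maskAll(text: string): string {
    let out = text;
    for (const secret of this.secrets) {
      out = out.split(secret).join(this.mask);
    }
    return out;
  }

  // Returns the portion of the stream that is safe to emit now.
  push(chunk: string): string {
    const masked = this.maskAll(this.buffer + chunk);
    const emitUpTo = Math.max(0, masked.length - this.holdback);
    this.buffer = masked.slice(emitUpTo);
    return masked.slice(0, emitUpTo);
  }

  // Call once at end of stream to release the held-back tail.
  flush(): string {
    const rest = this.maskAll(this.buffer);
    this.buffer = "";
    return rest;
  }
}
```

The trade-off is latency: output lags the input by up to the holdback length until `flush()` is called.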
Logging & Observability
- Purpose: Structured logging, metrics, error alerting
- Tech: tslog (with secrets masking transport), OpenTelemetry-compatible metrics
- Location: src/logging/, src/metrics/, src/alerting/
Plugin System
- Purpose: Extensible plugin runtime with SDK
- Tech: TypeScript plugin loader
- Location: src/plugins/, src/plugin-sdk/
Cron Scheduler
- Purpose: Scheduled jobs and wakeups
- Tech: Node.js timers, JSONL persistence
- Location: src/cron/
Memory
- Purpose: Conversation memory and context management
- Tech: File-based storage
- Location: src/memory/
Rate Limiting (src/infra/)
- Purpose: Per-provider rate limiting, API key rotation, and transient error retry
- Tech: Sliding window log algorithm (RPM/TPM), health-aware round-robin key pool, exponential backoff with jitter
- Key modules: provider-rate-limiter.ts (sliding window RPM/TPM enforcement), key-pool.ts (health-aware round-robin with rate limiter integration), llm-retry.ts (transient error classification and Retry-After parsing), rate-limit-middleware.ts (pre-flight/post-flight wrapper for LLM call sites)
- Composition: Four-layer pipeline: ProviderRateLimiter (sliding window) -> KeyPool (key selection) -> retryAsync + createLlmRetryOptions (transient retry) -> withRateLimitCheck (call-site middleware). Zero overhead when no limits are configured (pass-through mode).
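Two of the techniques named above, the sliding window log and exponential backoff with jitter, can be sketched in a few lines. The names `SlidingWindowLog` and `backoffDelayMs`, the default delays, and the choice of "full jitter" are assumptions for illustration, not the behavior of provider-rate-limiter.ts or llm-retry.ts.

```typescript
// Hypothetical sketch of a sliding window log limiter (e.g. requests per minute).
export class SlidingWindowLog {
  private timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Records the request and returns true if it fits in the current window.
  tryAcquire(now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Evict log entries that have aged out of the window.
    for (;;) {
      const oldest = this.timestamps[0];
      if (oldest === undefined || oldest > cutoff) break;
      this.timestamps.shift();
    }
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(now);
    return true;
  }
}

// Exponential backoff with "full jitter": a random delay in [0, min(maxMs, baseMs * 2^attempt)).
// `random` is injectable so the function can be tested deterministically.
export function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  maxMs: number = 30_000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

A TPM limit could reuse the same log by storing a token count alongside each timestamp and summing the counts instead of counting entries.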
MCP Client
- Purpose: In-process client connecting to external MCP tool servers
- Tech: @modelcontextprotocol/sdk, stdio/SSE/HTTP transports
- Location: src/mcp/ (client.ts, client-transport.ts, transport-*.ts)
- Key modules: client.ts (lifecycle manager with reconnect), tool-bridge.ts (MCP-to-agent tool conversion), transport-ssrf.ts (SSRF-guarded fetch for remote transports)
MCP Server
- Purpose: Exposes crocbot as MCP infrastructure for external AI systems
- Tech: @modelcontextprotocol/sdk, SSE + streamable HTTP transports
- Location: src/mcp/ (server.ts, server-auth.ts, server-tools.ts, server-mount.ts)
- Key modules: server-auth.ts (Bearer token with timing-safe comparison), server-tools.ts (send_message, finish_chat, query_memory, list_capabilities), server-mount.ts (HTTP route mounting)
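A timing-safe Bearer token check, as mentioned for server-auth.ts, might look like the sketch below. It uses Node's `crypto.timingSafeEqual`, which throws when the two buffers differ in length, so both values are hashed to a fixed length first. The function name `isAuthorized` and the hashing step are assumptions; the actual module may compare tokens differently.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper; not the actual server-auth.ts API.
export function isAuthorized(
  authorizationHeader: string | undefined,
  expectedToken: string,
): boolean {
  // Refuse to authenticate against an unset token.
  if (expectedToken === "" || authorizationHeader === undefined) return false;

  const match = /^Bearer (.+)$/.exec(authorizationHeader);
  if (match === null) return false;
  const presented = match[1] ?? "";

  // SHA-256 digests are always 32 bytes, which satisfies timingSafeEqual's
  // equal-length requirement and avoids leaking the token length.
  const presentedDigest = createHash("sha256").update(presented).digest();
  const expectedDigest = createHash("sha256").update(expectedToken).digest();
  return timingSafeEqual(presentedDigest, expectedDigest);
}
```

The point of the constant-time comparison is that response time does not reveal how many leading characters of a guessed token were correct.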
Tech Stack Rationale
| Technology | Purpose | Why Chosen |
|---|---|---|
| Node.js 22+ | Runtime | Modern ESM support, stable LTS |
| TypeScript (ES2023) | Language | Type safety, strict mode, NodeNext modules |
| tsdown (rolldown) | Bundler | ~5s builds, replaces tsc emit |
| pnpm | Package manager | Fast, disk-efficient |
| grammY | Telegram SDK | Modern, well-maintained, middleware support |
| Vitest | Testing | Fast, ESM-native, good DX |
| @modelcontextprotocol/sdk | MCP Client + Server | Official TypeScript SDK for Model Context Protocol |
| oxlint + oxfmt | Lint + Format | Fast Rust-based toolchain, 134 type-aware rules |
Data Flow
- User sends message via Telegram (or external AI calls MCP server endpoint)
- grammY bot receives message, routes to Gateway
- Gateway creates/resumes session, invokes agent
- Agent builds context: system prompt + memory recall + conversation history
- SecretsRegistry masks any credentials in LLM context before provider call
- Rate limiter checks RPM/TPM capacity; KeyPool selects best API key via round-robin
- Agent processes message with LLM provider (transient errors retried with backoff), invoking MCP client tools as needed
- StreamMasker masks secrets in streaming response chunks (cross-boundary detection)
- Tool results masked before persistence and display
- Response streamed back through Gateway to Telegram (masked) and persisted to session transcript (masked)
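The ordering in the steps above matters: masking happens before the provider call and again before delivery. One way to picture that is as a list of async stages run in sequence. Everything in this sketch (`Turn`, `Stage`, `runPipeline`, the stage names, and the sample secret `hunter2`) is hypothetical and only mirrors the order of the steps; it does not reflect how the Gateway is actually wired.

```typescript
// Hypothetical model of one message turn flowing through ordered stages.
export interface Turn {
  text: string;
  trace: string[]; // names of the stages that have run, in order
}

export type Stage = (turn: Turn) => Promise<Turn>;

export async function runPipeline(stages: Stage[], initial: Turn): Promise<Turn> {
  let turn = initial;
  for (const stage of stages) {
    turn = await stage(turn); // each stage sees the previous stage's output
  }
  return turn;
}

// Builds a stage that records its name and optionally transforms the text.
function step(name: string, transform: (text: string) => string = (t) => t): Stage {
  return async (turn) => ({ text: transform(turn.text), trace: [...turn.trace, name] });
}

// Placeholder masking: replaces one known sample value.
const maskSample = (text: string): string => text.split("hunter2").join("[REDACTED]");

export const exampleStages: Stage[] = [
  step("build-context"),
  step("mask-llm-context", maskSample),
  step("rate-limit-check"),
  step("call-provider"),
  step("mask-output", maskSample),
  step("deliver-and-persist"),
];
```

For example, `runPipeline(exampleStages, { text: "pw hunter2", trace: [] })` resolves to a turn whose text no longer contains the sample value and whose trace lists the six stage names in order.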
Key Architectural Decisions
See Architecture Decision Records for detailed history:
- ADR-0001: Telegram-only Architecture
- ADR-0002: Multi-stage Docker Build
- ADR-0003: Native MCP Integration
- ADR-0004: Secrets Masking Pipeline
- ADR-0005: Per-Provider Rate Limiting
