Alerting
crocbot includes a lightweight alerting system for proactive error detection and notification. The system classifies errors by severity, deduplicates repeated errors, rate limits notifications to prevent alert storms, and delivers alerts through configurable channels.
Overview
The alerting system operates in three stages:
- Classification - Errors are automatically classified as critical, warning, or info based on content
- Aggregation - Duplicate errors are deduplicated within a configurable window, and notifications are rate limited per severity
- Notification - Alerts are dispatched to configured channels (webhook, Telegram)
Configuration
Add alerting configuration to your config.yaml under the gateway.alerting key.
Severity Levels
Errors are classified into three severity levels:
| Severity | Description | Examples |
|---|---|---|
| critical | Requires immediate attention | Fatal errors, crashes, auth failures, database errors |
| warning | Degraded operation | Timeouts, rate limits, retries, network issues |
| info | Informational | General errors not matching other categories |
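
As a rough illustration, the three levels could be assigned by plain keyword matching. The keyword lists below are the ones documented under Automatic Classification; everything else (the `classify` name, case-insensitive substring matching, and checking critical keywords before warning keywords) is an assumption made for this sketch, not crocbot's actual implementation.

```python
# Keyword lists taken from the documentation; the matching rules are assumed.
CRITICAL_KEYWORDS = (
    "fatal", "crash", "unhandled", "uncaught", "panic", "oom",
    "out of memory", "connection refused", "auth failed",
    "authentication failed", "token invalid", "token expired",
    "database connection", "database error",
)
WARNING_KEYWORDS = (
    "timeout", "timed out", "retry", "retrying", "rate limit", "slow",
    "degraded", "failed to", "could not", "unable to", "network error",
    "connection reset",
)


def classify(message: str) -> str:
    """Return 'critical', 'warning', or 'info' for an error message."""
    text = message.lower()  # assumption: matching ignores case
    # Assumption: critical wins when both lists match, e.g.
    # "failed to refresh: token expired".
    if any(keyword in text for keyword in CRITICAL_KEYWORDS):
        return "critical"
    if any(keyword in text for keyword in WARNING_KEYWORDS):
        return "warning"
    return "info"
```

Note that naive substring matching would also flag unrelated words that happen to contain a short keyword such as oom, so a real classifier may match more carefully.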
Automatic Classification
The system automatically classifies errors based on keywords:
- Critical keywords: fatal, crash, unhandled, uncaught, panic, oom, out of memory, connection refused, auth failed, authentication failed, token invalid, token expired, database connection, database error
- Warning keywords: timeout, timed out, retry, retrying, rate limit, slow, degraded, failed to, could not, unable to, network error, connection reset
Deduplication
To prevent alert fatigue, identical errors within the deduplication window are aggregated:
- First occurrence triggers an alert with count: 1
- Subsequent identical errors increment the count but do not trigger new alerts
- After the deduplication window expires, a new alert can be triggered
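
The behaviour in the list above can be sketched as a small in-memory tracker. This is a minimal illustration under stated assumptions: the `Deduplicator` class, the use of the message text as the identity key, and the 60-second window in the usage example are all hypothetical, since the document does not state how crocbot keys errors or what the default window is.

```python
import time


class Deduplicator:
    """Tracks identical errors inside a fixed window (hypothetical helper)."""

    def __init__(self, window_seconds: float, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock  # injectable so the window can be tested
        self._seen = {}     # key -> [first_seen_timestamp, count]

    def record(self, key: str) -> tuple[bool, int]:
        """Return (should_alert, count) for this occurrence of `key`."""
        now = self.clock()
        entry = self._seen.get(key)
        if entry is None or now - entry[0] >= self.window:
            # First occurrence, or the previous window has expired.
            self._seen[key] = [now, 1]
            return True, 1
        entry[1] += 1  # duplicate inside the window: count only
        return False, entry[1]
```

With `dedup = Deduplicator(window_seconds=60)`, the first `dedup.record("database error")` call signals an alert, and repeats within the next 60 seconds only raise the count.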
Rate Limiting
Even with deduplication, error bursts can occur. Rate limiting provides a second layer of protection:
| Severity | Default Limit | Window |
|---|---|---|
| Critical | 5 alerts | 5 minutes |
| Warning | 10 alerts | 5 minutes |
| Info | Unlimited | N/A |
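
One way to picture these defaults is a per-severity sliding window. The sketch below uses the limits from the table, but the sliding-window strategy itself, the `RateLimiter` class, and its `allow` method are assumptions for illustration; the document does not say whether crocbot uses sliding or fixed windows.

```python
import time
from collections import deque

# (max alerts, window in seconds) per severity, from the table above.
# None means the severity is not rate limited.
DEFAULT_LIMITS = {"critical": (5, 300), "warning": (10, 300), "info": None}


class RateLimiter:
    """Sliding-window limiter keyed by severity (hypothetical helper)."""

    def __init__(self, limits=DEFAULT_LIMITS, clock=time.monotonic):
        self.limits = limits
        self.clock = clock
        self._sent = {severity: deque() for severity in limits}

    def allow(self, severity: str) -> bool:
        """Return True if an alert of this severity may be sent now."""
        limit = self.limits[severity]
        if limit is None:
            return True
        max_alerts, window = limit
        now = self.clock()
        sent = self._sent[severity]
        # Drop timestamps that have aged out of the window.
        while sent and now - sent[0] >= window:
            sent.popleft()
        if len(sent) >= max_alerts:
            return False
        sent.append(now)
        return True
```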
Notification Channels
Webhook
The webhook notifier sends a POST request with a JSON payload. It accepts the following options:
| Option | Type | Default | Description |
|---|---|---|---|
| url | string | (required) | Webhook endpoint URL |
| headers | object | {} | Custom headers (e.g., auth) |
| timeoutMs | number | 5000 | Request timeout in milliseconds |
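
To show how the three options might fit together, here is a minimal sketch of a sender built on Python's standard `urllib.request`. The option names `url`, `headers`, and `timeoutMs` come from the table; the function names and the payload fields (`severity`, `message`, `count`) are hypothetical, because the actual payload schema is not given here.

```python
import json
import urllib.request


def build_webhook_request(config: dict, alert: dict) -> urllib.request.Request:
    """Build a JSON POST request from webhook options (illustrative only)."""
    # Custom headers are merged over a JSON content type.
    headers = {"Content-Type": "application/json", **config.get("headers", {})}
    body = json.dumps(alert).encode("utf-8")
    return urllib.request.Request(
        config["url"], data=body, headers=headers, method="POST"
    )


def send_alert(config: dict, alert: dict) -> int:
    """Send the alert and return the HTTP status code."""
    timeout_seconds = config.get("timeoutMs", 5000) / 1000  # ms -> seconds
    request = build_webhook_request(config, alert)
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.status
```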
Telegram Self-Notification
Send alerts to yourself via Telegram using your existing bot:
| Option | Type | Default | Description |
|---|---|---|---|
| chatId | string | (required) | Telegram chat ID to send alerts to |
| accountId | string | "default" | Telegram account ID |
| minSeverity | string | "critical" | Minimum severity to trigger notification |
To find your chat ID, message @userinfobot on Telegram.
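
The minSeverity option implies an ordering of the three levels. A minimal sketch of that threshold check follows, assuming info < warning < critical; the `should_notify` function and `SEVERITY_ORDER` mapping are hypothetical names used only for illustration.

```python
# Assumed ordering: info is the lowest severity, critical the highest.
SEVERITY_ORDER = {"info": 0, "warning": 1, "critical": 2}


def should_notify(severity: str, min_severity: str = "critical") -> bool:
    """Return True if `severity` meets or exceeds the configured minimum."""
    return SEVERITY_ORDER[severity] >= SEVERITY_ORDER[min_severity]
```

Under this reading, the default of "critical" suppresses warning and info alerts, while setting minSeverity to "warning" lets both warning and critical alerts through.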
Testing
Webhook Test Endpoint
crocbot provides a test endpoint at /alerts/webhook that echoes received webhooks. Configure your local gateway as the webhook target for testing.
Triggering Test Alerts
You can trigger a test alert programmatically.
Metrics Integration
The alerting system integrates with the metrics endpoint:
- Errors increment the crocbot_errors_total counter with a severity label
- Use Prometheus/Grafana to visualize error rates by severity
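
For readers unfamiliar with labelled counters, the sketch below shows roughly what such a metric could look like when rendered in the Prometheus text exposition format. Only the metric name and the severity label come from this document; the `record_error` and `render_metrics` helpers are hypothetical and do not reflect how crocbot exposes its metrics.

```python
from collections import Counter

# One running total per severity label value.
errors_total = Counter()


def record_error(severity: str) -> None:
    """Increment the counter for the given severity."""
    errors_total[severity] += 1


def render_metrics() -> str:
    """Render the counter in Prometheus text exposition format."""
    lines = ["# TYPE crocbot_errors_total counter"]
    for severity in sorted(errors_total):
        count = errors_total[severity]
        lines.append(f'crocbot_errors_total{{severity="{severity}"}} {count}')
    return "\n".join(lines) + "\n"
```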
Disabling Alerting
To disable alerting completely:
Best Practices
- Start with critical only - Begin with minSeverity: critical for Telegram to avoid notification fatigue
- Use webhooks for integration - Connect to PagerDuty, OpsGenie, or Slack via webhooks
- Monitor rate limit hits - If you frequently hit rate limits, investigate the root cause
- Review deduplication window - Adjust based on your error patterns
- Test before production - Use the /alerts/webhook endpoint to verify configuration
