Health Check Interpretation
Guide to understanding and troubleshooting health check responses from the crocbot gateway.
Health Endpoints Overview
| Endpoint | Purpose | Use Case |
|---|---|---|
| /health | Gateway liveness | Platform probes (Docker, Fly.io, k8s) |
| /metrics | Prometheus metrics | Monitoring dashboards |
| crocbot health | CLI health check | Manual diagnostics |
| crocbot status | Full status report | Comprehensive debugging |
Health Endpoint Response
Request
Response
The /health endpoint is a lightweight liveness probe that returns a minimal response:
| Field | Type | Description |
|---|---|---|
| status | string | "healthy" if the gateway is running |
A "healthy" status confirms that the gateway process is alive and accepting HTTP connections. If the endpoint does not respond, the gateway is down.
For detailed diagnostics (memory, uptime, component health), use the CLI (crocbot health or crocbot status).
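For scripted checks, the liveness rule described above (a 200 response whose status field is "healthy") can be sketched in Python. This is a minimal sketch under stated assumptions: the response body is assumed to be JSON, the URL and port are placeholders, and the function names are illustrative rather than part of crocbot.

```python
import json
import urllib.request

# Assumption: placeholder address; replace with your gateway's host and port.
HEALTH_URL = "http://localhost:8080/health"


def interpret_health(status_code: int, body: str) -> bool:
    """Return True only for a 200 response whose JSON body has status == "healthy"."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Assumption: the endpoint returns JSON; anything else counts as unhealthy.
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def fetch_health(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Query the endpoint once; connection errors and timeouts propagate to the caller."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return interpret_health(resp.status, resp.read().decode("utf-8"))
```

Keeping the interpretation separate from the network call makes the rule easy to test without a running gateway.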
Interpreting Health Status
Status: Healthy (200 response)
No Response / Connection Refused
Meaning: Gateway is down or unresponsive.
Actions:
- Check if process is running
- Check for crash in logs
- Restart gateway
- See Startup Shutdown
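When a probe fails from a script, it helps to know whether the connection was refused (nothing is listening) or the request timed out (the process may be hung). The sketch below shows one way to classify exceptions raised by Python's urllib; the function name and the returned labels are illustrative assumptions, not crocbot output.

```python
import socket
import urllib.error


def classify_probe_error(exc: BaseException) -> str:
    """Map an exception from urllib.request.urlopen to a coarse failure label."""
    # HTTPError means the gateway answered, but with a non-2xx status code.
    if isinstance(exc, urllib.error.HTTPError):
        return "http_error"
    # URLError wraps the underlying OS-level error in its .reason attribute.
    reason = exc.reason if isinstance(exc, urllib.error.URLError) else exc
    if isinstance(reason, ConnectionRefusedError):
        return "connection_refused"
    if isinstance(reason, (socket.timeout, TimeoutError)):
        return "timeout"
    return "unknown"
```

A "connection_refused" result points toward the process or port binding, while "timeout" points toward an overloaded or hung gateway.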
Memory Monitoring
The /health endpoint does not return memory metrics. Use the CLI or the /metrics endpoint for memory monitoring.
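One way to read a memory figure from /metrics is to parse the Prometheus text exposition format. The sketch below assumes the gateway exposes process_resident_memory_bytes, the conventional name used by Prometheus client libraries. This guide does not confirm which metric names crocbot exports, so treat the name as a placeholder; read_metric is a hypothetical helper.

```python
from typing import Optional


def read_metric(metrics_text: str, name: str) -> Optional[float]:
    """Return the value of the first unlabelled sample called `name`, or None."""
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # An unlabelled sample line has the form "<name> <value> [timestamp]".
        metric, _, rest = line.partition(" ")
        if metric == name and rest:
            return float(rest.split()[0])
    return None


# Assumption: metric name is illustrative; check your own /metrics output.
MEMORY_METRIC = "process_resident_memory_bytes"
```

Calling read_metric(body, MEMORY_METRIC) on the text returned by /metrics yields the resident memory in bytes if that metric is present, and None otherwise.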
Check for Restart Loop
CLI Health Commands
Basic Health Check
JSON Output
With Timeout
Status Command (More Detail)
Platform Health Probes
Docker Healthcheck
In docker-compose.yml or Dockerfile:
Fly.io Health Checks
In fly.toml:
Kubernetes Probes
Troubleshooting Health Issues
Health Endpoint Not Responding
Connection Refused
Cause: Gateway not running or not bound to expected port/interface.
Timeout
Cause: Gateway overloaded or hung.
High Memory Suspected
Metrics Endpoint
For detailed operational metrics, use the /metrics endpoint.
Key Metrics to Monitor
Health Check Script
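A minimal sketch of a retrying health check in Python, assuming a caller-supplied probe function that returns True when the gateway reports healthy. The name wait_until_healthy, the retry count, and the delay are illustrative choices, not crocbot defaults.

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 3,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `attempts` times; return True on the first success."""
    for attempt in range(attempts):
        try:
            if probe():
                return True
        except Exception:
            # A refused connection or timeout counts as a failed attempt.
            pass
        # Pause between attempts, but not after the final one.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

In a cron job or CI step, the probe would typically wrap an HTTP request to /health, and the script would exit non-zero when wait_until_healthy returns False so that the caller can alert or restart.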
Alerting on Health Issues
Configure alerting to notify on health degradation:
- Gateway crashes
- Authentication failures
- Persistent connection failures
Related Documentation
- Health Checks (CLI) - CLI health commands
- Metrics - Prometheus metrics endpoint
- Alerting - Alert configuration
- Startup Shutdown - Start/stop procedures
- Incident Response - General troubleshooting
