Cerbos provides comprehensive observability features, including Prometheus metrics, OpenTelemetry support, and health check endpoints for production monitoring.
Health Checks
Cerbos exposes health check endpoints for both HTTP and gRPC protocols to verify service availability.
HTTP Health Endpoint
The HTTP health check endpoint is available at /_cerbos/health. It responds with:
- 200 OK: Service is healthy and serving requests
- Non-200: Service is unavailable or experiencing issues
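The status-code rule above can be sketched as a small probe. This is a minimal illustration, not part of Cerbos: the function names are hypothetical, and the base URL assumes a local instance on the default HTTP port 3592 without TLS.

```python
import urllib.request

HEALTH_PATH = "/_cerbos/health"


def health_url(base_url: str) -> str:
    """Build the health endpoint URL from a base URL (hypothetical helper)."""
    return base_url.rstrip("/") + HEALTH_PATH


def is_healthy(status_code: int) -> bool:
    """Only 200 OK counts as healthy; any other status is treated as unhealthy."""
    return status_code == 200


def check_health(base_url: str = "http://localhost:3592", timeout: float = 2.0) -> bool:
    """Probe the endpoint once. Connection errors, timeouts and HTTP error
    statuses (raised by urllib as HTTPError, a subclass of OSError) all map
    to 'unhealthy'."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return is_healthy(resp.status)
    except OSError:
        return False


if __name__ == "__main__":
    print("healthy" if check_health() else "unhealthy")
```

Treating every failure mode as "unhealthy" keeps the probe safe to call from a supervisor loop that only needs a yes/no answer.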
gRPC Health Check
Cerbos implements the standard gRPC Health Checking Protocol, so any client that speaks that protocol can query the gRPC health check service.
Using the Healthcheck Command
Cerbos includes a built-in healthcheck command for Docker and Kubernetes. It accepts the following flags:
- --config: Path to Cerbos configuration file
- --kind: Health check type (grpc or http)
- --host-port: Target host and port
- --timeout: Health check timeout (default: 2s)
- --insecure: Skip certificate verification
- --no-tls: Disable TLS
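As a rough sketch of how the flags listed above fit together, the helper below assembles a command line and runs it. The helper names are hypothetical, the binary is assumed to be on PATH as `cerbos` with the subcommand spelled `healthcheck`, and a zero exit code is assumed to mean "healthy"; verify these against your Cerbos version.

```python
import subprocess
from typing import List, Optional


def build_healthcheck_cmd(
    kind: str = "grpc",
    host_port: Optional[str] = None,
    timeout: str = "2s",
    config: Optional[str] = None,
    insecure: bool = False,
    no_tls: bool = False,
    binary: str = "cerbos",  # assumption: binary name / location
) -> List[str]:
    """Assemble the argument list using only the flags documented above."""
    if kind not in ("grpc", "http"):
        raise ValueError("kind must be 'grpc' or 'http'")
    cmd = [binary, "healthcheck", f"--kind={kind}", f"--timeout={timeout}"]
    if host_port:
        cmd.append(f"--host-port={host_port}")
    if config:
        cmd.append(f"--config={config}")
    if insecure:
        cmd.append("--insecure")
    if no_tls:
        cmd.append("--no-tls")
    return cmd


def run_healthcheck(**kwargs) -> bool:
    """Run the command; assumes exit code 0 signals a healthy service."""
    return subprocess.run(build_healthcheck_cmd(**kwargs), check=False).returncode == 0
```

For example, `build_healthcheck_cmd(kind="http", host_port="localhost:3592", no_tls=True)` yields an argument list that could also be pasted into a container healthcheck definition.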
Docker Healthcheck
Add to your Dockerfile:
Kubernetes Probes
Prometheus Metrics
Cerbos exposes Prometheus-compatible metrics at /_cerbos/metrics on the HTTP port (default: 3592).
Enabling Metrics
Metrics are enabled by default and can be disabled in the Cerbos configuration.
Scraping Metrics
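To see what a scrape returns without running a full Prometheus server, the sketch below fetches /_cerbos/metrics and parses the plain-text samples. It is a deliberately simplified parser (it does not unescape label values), the function names are hypothetical, and the base URL assumes a local instance on the default port without TLS. For production use, a maintained parser such as the one in the `prometheus_client` package is a better fit.

```python
import re
import urllib.request
from typing import Dict, List, Tuple

Sample = Tuple[str, Dict[str, str], float]

# metric_name{label="value",...} value [timestamp]
_LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?\s*$')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> List[Sample]:
    """Parse sample lines, skipping blank lines and # HELP / # TYPE comments."""
    samples: List[Sample] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = _LINE.match(line)
        if match is None:
            continue  # ignore anything this simplified grammar does not cover
        name, labels, value = match.groups()
        samples.append((name, dict(_LABEL.findall(labels or "")), float(value)))
    return samples


def fetch_metrics(base_url: str = "http://localhost:3592", timeout: float = 5.0) -> List[Sample]:
    """Scrape the metrics endpoint once and return the parsed samples."""
    url = base_url.rstrip("/") + "/_cerbos/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_metrics(resp.read().decode("utf-8"))
```

Each sample comes back as a `(name, labels, value)` tuple, which is enough to spot-check a single gauge or counter from a script.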
Key Metrics
Engine Performance
| Metric | Type | Description |
|---|---|---|
| cerbos_dev_engine_check_latency | Histogram | Time to evaluate a policy decision (ms) |
| cerbos_dev_engine_check_batch_size | Histogram | Distribution of batch sizes in check requests |
| cerbos_dev_engine_plan_latency | Histogram | Time to generate a query plan (ms) |
Policy Compilation
| Metric | Type | Description |
|---|---|---|
| cerbos_dev_compiler_compile_duration | Histogram | Policy compilation time (ms) |
Storage Operations
| Metric | Type | Description |
|---|---|---|
| cerbos_dev_store_poll_count | Counter | Number of times the remote store was polled |
| cerbos_dev_store_sync_error_count | Counter | Errors during store synchronization |
| cerbos_dev_store_last_successful_refresh | Gauge | Timestamp of last successful refresh |
| cerbos_dev_store_bundle_op_latency | Histogram | Bundle operation latency (ms) |
| cerbos_dev_store_bundle_fetch_errors_count | Counter | Bundle download errors |
| cerbos_dev_store_bundle_updates_count | Counter | Bundle updates from remote source |
Cache Performance
| Metric | Type | Description |
|---|---|---|
| cerbos_dev_cache_access_count | Counter | Cache access attempts (with result label) |
| cerbos_dev_cache_live_objects | Gauge | Number of objects currently in cache |
| cerbos_dev_cache_max_size | Gauge | Maximum cache capacity |
Policy Index
| Metric | Type | Description |
|---|---|---|
| cerbos_dev_index_entry_count | Gauge | Number of entries in policy index |
| cerbos_dev_index_crud_count | Counter | Create/update/delete operations |
Audit Logging
| Metric | Type | Description |
|---|---|---|
| cerbos_dev_audit_error_count | Counter | Audit log write errors |
| cerbos_dev_audit_oversized_entry_count | Counter | Entries exceeding maximum size |
Cerbos Hub
| Metric | Type | Description |
|---|---|---|
| cerbos_dev_hub_connected | Gauge | Connection status (1=connected, 0=disconnected) |
Runtime Metrics
Cerbos automatically exports Go runtime metrics including:
- Memory allocation and GC statistics
- Goroutine counts
- CPU usage
Prometheus Configuration
OpenTelemetry Integration
OTLP Metrics
Configure OTLP metrics export using environment variables:
Distributed Tracing
Enable distributed tracing to track request flows. The following samplers are available:
| Sampler | Description | Use Case |
|---|---|---|
| always_on | Record every trace | Development, debugging |
| always_off | No traces recorded | Tracing disabled |
| traceidratio | Sample based on trace ID | Production with controlled overhead |
| parentbased_always_on | Record if parent sampled | Distributed systems |
| parentbased_traceidratio | Ratio-based with parent context | Fine-grained control |
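To make the difference between these samplers concrete, here is an illustrative model of the decision each one makes. It is not the code Cerbos or any OpenTelemetry SDK runs: the function names are hypothetical, and the ratio check (comparing the low 64 bits of the trace ID against a bound) is one common way to get a deterministic, ID-based decision.

```python
from typing import Optional

TRACE_ID_MASK = (1 << 64) - 1  # use the low 64 bits of the 128-bit trace ID


def ratio_sampled(trace_id: int, ratio: float) -> bool:
    """Deterministic decision: the same trace ID always gets the same answer,
    so every service that sees the trace agrees on whether to record it."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    bound = round(ratio * (TRACE_ID_MASK + 1))
    return (trace_id & TRACE_ID_MASK) < bound


def should_sample(
    sampler: str,
    trace_id: int,
    ratio: float = 1.0,
    parent_sampled: Optional[bool] = None,  # None means "no parent span"
) -> bool:
    """Model the five sampler names from the table above."""
    if sampler == "always_on":
        return True
    if sampler == "always_off":
        return False
    if sampler == "traceidratio":
        return ratio_sampled(trace_id, ratio)
    if sampler.startswith("parentbased_"):
        # With a parent, follow the parent's decision; otherwise fall back
        # to the wrapped root sampler.
        if parent_sampled is not None:
            return parent_sampled
        return should_sample(sampler[len("parentbased_"):], trace_id, ratio)
    raise ValueError(f"unknown sampler: {sampler}")
```

The parent-based variants matter in multi-service deployments: they keep a trace either fully recorded or fully dropped across services instead of sampling each hop independently.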
Logging Configuration
Cerbos uses structured logging with configurable log levels.
Log Levels
Set via configuration or environment variable:
- DEBUG or V1, V2, etc.: Verbose debugging
- INFO: Standard operational information
- WARN: Warning messages
- ERROR: Error conditions
Log Format
Cerbos automatically detects terminal output:
- TTY detected: Colored console output
- Non-TTY: JSON structured logs (ECS format)
Temporary Debug Logging
Send the SIGUSR1 signal to the Cerbos process to temporarily enable debug logging. Debug logging stays enabled for a limited time (configurable via CERBOS_TEMP_LOG_LEVEL_DURATION).
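Sending the signal from a script might look like the sketch below. The helper name is hypothetical, the PID must be looked up separately (for example from your process manager), and this only works on Unix-like systems where SIGUSR1 exists.

```python
import os
import signal


def enable_temp_debug_logging(pid: int) -> None:
    """Send SIGUSR1 to the process with the given PID.

    PIDs <= 0 are rejected because os.kill would then target a whole
    process group (or every process the caller may signal) rather than
    a single Cerbos process.
    """
    if pid <= 0:
        raise ValueError("pid must be a positive process ID")
    os.kill(pid, signal.SIGUSR1)
```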
Request Payload Logging
For debugging, enable request/response payload logging:
Audit Logging
Audit logs capture access decisions and policy evaluations. See the Audit configuration documentation for details.
Audit Metrics Integration
Monitor audit log health:
Monitoring Best Practices
Critical Alerts
- Service Health: Alert on failed health checks
- High Latency: cerbos_dev_engine_check_latency > 100ms (p95)
- Store Sync Failures: cerbos_dev_store_sync_error_count increasing
- Audit Errors: cerbos_dev_audit_error_count > 0
- Hub Disconnection: cerbos_dev_hub_connected = 0
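The metric-based rules above can be prototyped offline before they are written as alerting rules. The sketch below estimates p95 from cumulative histogram buckets by linear interpolation (the same general idea as PromQL's histogram_quantile) and then applies the thresholds from the list. The function names and the bucket boundaries in the usage note are hypothetical; the actual bucket layout depends on your Cerbos version.

```python
import math
from typing import List, Tuple


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate a quantile from (upper_bound, cumulative_count) pairs.

    The last bucket is expected to be the +Inf bucket holding the total count.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls in the overflow bucket: report the highest finite bound.
                return buckets[-2][0] if len(buckets) > 1 else math.nan
            if count == prev_count:
                return bound
            # Interpolate linearly inside the bucket that contains the rank.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return math.nan


def critical_alerts(
    p95_latency_ms: float,
    sync_errors_prev: float,
    sync_errors_now: float,
    audit_errors: float,
    hub_connected: float,
) -> List[str]:
    """Return the names of the rules from the list above that are firing."""
    alerts = []
    if p95_latency_ms > 100:
        alerts.append("High Latency")
    if sync_errors_now > sync_errors_prev:
        alerts.append("Store Sync Failures")
    if audit_errors > 0:
        alerts.append("Audit Errors")
    if hub_connected == 0:
        alerts.append("Hub Disconnection")
    return alerts
```

With made-up buckets such as `[(10, 50), (50, 90), (100, 98), (math.inf, 100)]`, the estimated p95 lands between 50 ms and 100 ms, so the latency rule stays quiet. Comparing two successive counter readings is a simple stand-in for "increasing"; a real alerting rule would normally use a rate over a time window.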
Performance Monitoring
Dashboard Recommendations
- Overview: Service health, request rate, error rate, latency
- Performance: Latency percentiles, batch sizes, cache metrics
- Storage: Sync status, bundle updates, policy count
- Resources: Memory, CPU, goroutines, GC metrics
Admin API Metrics
When the Admin API is enabled, additional endpoints are available:
Observability Stack Examples
Prometheus + Grafana
OpenTelemetry Collector
Jaeger Tracing