A comprehensive Prometheus metrics exporter for Gunicorn WSGI servers, supporting multiple worker types and advanced monitoring by hooking into Gunicorn's internals to capture metrics at the webserver layer. It features Redis-based storage (implemented against the Prometheus multiprocess spec), YAML configuration support, and signal capture as a metric. This Gunicorn worker plugin exports Prometheus metrics to monitor worker performance, including memory usage, CPU usage, request durations, and error tracking, aiming to replace Gunicorn's built-in instrumentation (https://docs.gunicorn.org/en/stable/instrumentation.html) with richer information. It also aims to replace framework-level request tracking, such as counting requests per endpoint, for any framework (e.g., Flask, Django, and others) that conforms to the WSGI specification.
One fundamental limitation of the WSGI protocol is that Python frameworks consume errors and exceptions internally (see https://peps.python.org/pep-0333/#error-handling). Most frameworks (Flask, Django, Pyramid, etc.) handle exceptions within their own middleware and error-handling systems, making it difficult to capture comprehensive error metrics at the WSGI level.
This creates a challenge for monitoring tools like this one: we can only capture errors that bubble up to the WSGI layer, while many framework-specific errors are handled internally and never reach the WSGI interface.
Note: This is a fundamental design choice of the WSGI protocol.
I am implementing a two-tier error tracking system:
- WSGI-Level Errors: Captured at the worker level for errors that reach the WSGI interface
- Framework Integration: Designed to work with framework-specific error handlers when available
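To illustrate the WSGI-level tier, here is a minimal sketch of how errors that reach the WSGI boundary can be counted (this is an illustration only, not the project's actual implementation; the counter name mirrors gunicorn_worker_failed_requests):

```python
from prometheus_client import Counter

# Illustrative counter mirroring the exporter's WSGI-level error metric.
FAILED_REQUESTS = Counter(
    "gunicorn_worker_failed_requests",
    "WSGI-level failed requests",
    ["worker_id", "method", "endpoint", "error_type"],
)

def error_tracking_middleware(app, worker_id="0"):
    """Wrap a WSGI app and count exceptions that reach the WSGI layer."""
    def wrapped(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception as exc:
            # Only errors the framework did NOT handle internally land here;
            # framework-handled errors (e.g., Django 404s) never raise this far.
            FAILED_REQUESTS.labels(
                worker_id=worker_id,
                method=environ.get("REQUEST_METHOD", ""),
                endpoint=environ.get("PATH_INFO", ""),
                error_type=type(exc).__name__,
            ).inc()
            raise
    return wrapped
```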
Current Error Metrics:
- gunicorn_worker_failed_requests - WSGI-level failed requests
- gunicorn_worker_error_handling - Errors handled by the worker
Current Limitations: Due to WSGI's design, we can only capture errors that bubble up to the WSGI layer. Framework-specific errors (like Django's 404s, Flask's route errors, etc.) are handled internally and never reach our monitoring system.
Future Enhancement: I'm exploring ways to integrate with framework-specific error handlers to capture more comprehensive error metrics. Also see Issue #67 for request/response payload size tracking per endpoint; it's a nice problem that LLMs haven't cracked, so please give it a try if you can!
I've extended the Prometheus Python client to support Redis-based storage as an alternative to traditional multiprocess files. This architectural innovation is made possible by the brilliant protocol-based design of the Prometheus specification, which allows for clean storage backend replacement through the StorageDictProtocol interface.
The Prometheus multiprocess specification's protocol-based design enables us to seamlessly replace the default file-based storage (MmapedDict) with our Redis implementation (RedisStorageDict) without breaking compatibility. This is a testament to the excellent engineering behind the Prometheus ecosystem.
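To illustrate the idea (a simplified sketch, not the project's actual RedisStorageDict), any object exposing the same dict-like surface as MmapedDict can back the metrics. Here is what a Redis-backed version might look like using redis-py:

```python
import redis

class RedisStorageDict:
    """Illustrative stand-in for the file-backed metric store.

    Stores metric samples in a Redis hash instead of an mmap'd file.
    Method names follow the dict-like surface the exporter swaps in;
    the real implementation lives in this project, not prometheus_client.
    """

    def __init__(self, key_prefix="gunicorn", host="127.0.0.1", port=6379, db=0):
        self._redis = redis.Redis(host=host, port=port, db=db)
        self._key = f"{key_prefix}:metrics"

    def write_value(self, key, value):
        # One Redis hash field per metric sample key; no file I/O involved.
        self._redis.hset(self._key, key, value)

    def read_value(self, key):
        raw = self._redis.hget(self._key, key)
        return float(raw) if raw is not None else 0.0

    def read_all_values(self):
        # Yield (key, value) pairs for the collector to aggregate across workers.
        for key, raw in self._redis.hgetall(self._key).items():
            yield key.decode(), float(raw)

    def close(self):
        self._redis.close()
```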
| Aspect | File-Based Storage | Redis Storage |
|---|---|---|
| Storage Location | Local files | Redis server |
| Scalability | Single server | Multiple servers |
| File I/O | High overhead | No file I/O |
| Shared Metrics | No | Yes |
| Storage Separation | Coupled | Separated |
| Protocol Compliance | MmapedDict | RedisStorageDict |
- Microservices Architecture: Multiple services sharing metrics
- Container Orchestration: Kubernetes pods with shared Redis
- High Availability: Metrics survive server restarts
- Cost Optimization: Separate storage and compute resources
- Sidecar Deployment: Deploy as sidecar container in the same pod for isolated monitoring
- Worker Metrics: Memory, CPU, request durations, error tracking
- Master Process Intelligence: Signal tracking, restart analytics
- Multiprocess Support: Full Prometheus multiprocess compatibility
- Redis Storage: Store metrics directly in Redis (no files created)
- YAML Configuration: Structured, readable configuration management with environment variable override
- Protocol-Based Design: Leverages Prometheus specification's brilliant protocol architecture
- Zero Configuration: Works out-of-the-box with minimal setup
- Production Ready: Retry logic, error handling, health monitoring
Basic installation (sync and thread workers only):
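```bash
pip install gunicorn-prometheus-exporter
```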
With async worker support:
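```bash
pip install gunicorn-prometheus-exporter[eventlet]  # or [gevent]
```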
With Redis storage:
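Assuming the Redis extra is published as redis (check the package metadata for the exact name):

```bash
pip install gunicorn-prometheus-exporter[redis]
```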
Complete installation (all features):
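Assuming a catch-all extra named all exists; otherwise combine the individual extras above:

```bash
pip install gunicorn-prometheus-exporter[all]
```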
The published container lives at princekrroshan01/gunicorn-prometheus-exporter. See the Docker Hub listing for tags and architecture support: https://hub.docker.com/r/princekrroshan01/gunicorn-prometheus-exporter
The container exposes metrics on 0.0.0.0:9091 by default. Override behaviour via environment variables such as PROMETHEUS_METRICS_PORT, PROMETHEUS_BIND_ADDRESS, and PROMETHEUS_MULTIPROC_DIR.
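For example (the latest tag is assumed here; see the Docker Hub listing for published tags):

```bash
docker run -d \
  -p 9091:9091 \
  -e PROMETHEUS_METRICS_PORT=9091 \
  -e PROMETHEUS_BIND_ADDRESS=0.0.0.0 \
  princekrroshan01/gunicorn-prometheus-exporter:latest
```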
For the sidecar pattern, reuse the manifest under Deployment Options → Sidecar Deployment and reference the same image/tag.
Create a YAML configuration file (gunicorn-prometheus-exporter.yml):
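A sketch of what the file might contain; these key names are inferred from the environment variables listed under Configuration below and are assumptions, so consult the YAML configuration guide for the real schema:

```yaml
# gunicorn-prometheus-exporter.yml (illustrative; see the docs for the real schema)
exporter:
  prometheus:
    metrics_port: 9091
    bind_address: "0.0.0.0"
  gunicorn:
    workers: 2
  redis:
    enabled: false
    host: "127.0.0.1"
    port: 6379
    db: 0
    key_prefix: "gunicorn"
```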
Create a Gunicorn config file (gunicorn.conf.py):
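A minimal sketch, assuming the worker class is exposed at the package root (the exact import path may differ):

```python
# gunicorn.conf.py (minimal sketch; module path assumed)
bind = "0.0.0.0:8000"
workers = 2

# Use the exporter's sync worker (see the worker types table below).
worker_class = "gunicorn_prometheus_exporter.PrometheusWorker"
```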
Worker Count Guidelines:
- Sync workers: 2 × CPU cores + 1 (classic formula for I/O-bound apps; see the snippet after this list)
- Async workers: 1-4 workers (each handles many concurrent connections)
- CPU-bound workloads: Use closer to CPU core count
- Memory considerations: Each worker consumes ~50-100MB RAM
- Monitor and adjust: Start with the formula, then tune based on your app's behavior
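The classic formula from above, expressed in gunicorn.conf.py:

```python
import multiprocessing

# 2 x CPU cores + 1: the classic starting point for I/O-bound apps.
workers = multiprocessing.cpu_count() * 2 + 1
```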
The exporter supports all major Gunicorn worker types:
| Worker Class | Concurrency Model | Best For | Installation |
|---|---|---|---|
| PrometheusWorker | Pre-fork (sync) | Simple, reliable, 1 request per worker | pip install gunicorn-prometheus-exporter |
| PrometheusThreadWorker | Threads | I/O-bound apps, better concurrency | pip install gunicorn-prometheus-exporter |
| PrometheusEventletWorker | Greenlets | Async I/O with eventlet | pip install gunicorn-prometheus-exporter[eventlet] |
| PrometheusGeventWorker | Greenlets | Async I/O with gevent | pip install gunicorn-prometheus-exporter[gevent] |
Metrics are automatically exposed on the configured bind address and port (default: 0.0.0.0:9091):
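For example (assuming the conventional /metrics path):

```bash
curl http://localhost:9091/metrics
```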
Complete documentation is available at: https://princekrroshan01.github.io/gunicorn-prometheus-exporter
The documentation includes:
- Installation and configuration guides
- YAML configuration guide with examples
- Complete metrics reference
- Framework-specific examples (Django, FastAPI, Flask, Pyramid)
- API reference and troubleshooting
- Contributing guidelines
The Gunicorn Prometheus Exporter provides comprehensive metrics for monitoring both worker processes and the master process. All metrics include appropriate labels for detailed analysis.
- gunicorn_worker_requests_total - Total number of requests handled by each worker
  - Labels: worker_id
  - Type: Counter
- gunicorn_worker_request_duration_seconds - Request duration histogram
  - Labels: worker_id
  - Type: Histogram
  - Buckets: 0.1, 0.5, 1.0, 2.5, 5.0, 10.0, 30.0, 60.0, +Inf
- gunicorn_worker_request_size_bytes - Request size histogram
  - Labels: worker_id
  - Type: Histogram
  - Buckets: 1KB, 4KB, 16KB, 64KB, 256KB, 1MB, 4MB, +Inf
- gunicorn_worker_response_size_bytes - Response size histogram
  - Labels: worker_id
  - Type: Histogram
  - Buckets: 1KB, 4KB, 16KB, 64KB, 256KB, 1MB, 4MB, +Inf
- gunicorn_worker_failed_requests - Total number of failed requests
  - Labels: worker_id, method, endpoint, error_type
  - Type: Counter
- gunicorn_worker_error_handling - Total number of errors handled
  - Labels: worker_id, method, endpoint, error_type
  - Type: Counter
- gunicorn_worker_memory_bytes - Memory usage per worker
  - Labels: worker_id
  - Type: Gauge
- gunicorn_worker_cpu_percent - CPU usage per worker
  - Labels: worker_id
  - Type: Gauge
- gunicorn_worker_uptime_seconds - Worker uptime
  - Labels: worker_id
  - Type: Gauge
- gunicorn_worker_state - Current state of the worker
  - Labels: worker_id, state, timestamp
  - Type: Gauge
  - Values: 1=running, 0=stopped
- gunicorn_worker_restart_total - Total worker restarts by reason
  - Labels: worker_id, reason
  - Type: Counter
- gunicorn_worker_restart_count_total - Worker restarts by type and reason
  - Labels: worker_id, restart_type, reason
  - Type: Counter
- gunicorn_master_worker_restart_total - Total worker restarts by reason
  - Labels: reason
  - Type: Counter
  - Common reasons: hup, usr1, usr2, ttin, ttou, chld, int
- gunicorn_master_worker_restart_count_total - Worker restarts by worker and reason
  - Labels: worker_id, reason, restart_type
  - Type: Counter
- worker_id: Unique identifier for each worker process
- method: HTTP method (GET, POST, PUT, DELETE, etc.)
- endpoint: Request endpoint/path
- error_type: Type of error (exception class name)
- state: Worker state (running, stopped, etc.)
- timestamp: Unix timestamp of state change
- reason: Reason for restart (signal name or error type)
- restart_type: Type of restart (signal, error, manual, etc.)
For master metrics, reason is the signal that triggered the restart:
- hup: HUP signal (reload configuration)
- usr1: USR1 signal (reopen log files)
- usr2: USR2 signal (upgrade on the fly)
- ttin: TTIN signal (increase worker count)
- ttou: TTOU signal (decrease worker count)
- chld: CHLD signal (child process status change)
- int: INT signal (interrupt/Ctrl+C)
See the example/ directory for complete working examples with all worker types:
- gunicorn_simple.conf.py: Basic sync worker setup
- gunicorn_thread_worker.conf.py: Threaded workers for I/O-bound apps
- gunicorn_redis_integration.conf.py: Redis storage setup (no files)
- gunicorn_eventlet_async.conf.py: Eventlet workers with async app
- gunicorn_gevent_async.conf.py: Gevent workers with async app
- app.py: Simple Flask app for sync/thread workers
- async_app.py: Async-compatible Flask app for async workers
Run any example with:
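Assuming the Flask application object in example/app.py is named app:

```bash
gunicorn -c example/gunicorn_simple.conf.py app:app
```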
All worker types have been thoroughly tested and are production-ready:
| Worker Type | Status | Metrics | Signals Tested | Performance |
|---|---|---|---|---|
| Sync Worker | Working | All metrics | HUP, USR1, CHLD | Balanced |
| Thread Worker | Working | All metrics | HUP, USR1, CHLD | Balanced |
| Eventlet Worker | Working | All metrics | HUP, USR1, CHLD | Balanced |
| Gevent Worker | Working | All metrics | HUP, USR1, CHLD | Balanced |
All async workers require their respective dependencies:
- Eventlet: pip install eventlet
- Gevent: pip install gevent
Create a YAML configuration file for structured, readable configuration (the Quick Start above shows an example layout):
Load YAML configuration in your Gunicorn config:
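The project ships its own loading mechanism; as a generic, hypothetical sketch using PyYAML (key names match the YAML sketch in the Quick Start, which are themselves assumptions):

```python
# gunicorn.conf.py (generic sketch using PyYAML; the project's own loader may differ)
import os
import yaml

with open("gunicorn-prometheus-exporter.yml") as fh:
    cfg = yaml.safe_load(fh)

# Map YAML values onto the environment variables the exporter reads.
os.environ.setdefault("PROMETHEUS_METRICS_PORT", str(cfg["exporter"]["prometheus"]["metrics_port"]))
os.environ.setdefault("PROMETHEUS_BIND_ADDRESS", cfg["exporter"]["prometheus"]["bind_address"])

workers = cfg["exporter"]["gunicorn"]["workers"]
worker_class = "gunicorn_prometheus_exporter.PrometheusWorker"  # module path assumed
```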
Environment variables can override YAML configuration values:
| Variable | Default | Description |
|---|---|---|
| PROMETHEUS_METRICS_PORT | 9091 | Port for metrics endpoint |
| PROMETHEUS_BIND_ADDRESS | 0.0.0.0 | Bind address for metrics |
| GUNICORN_WORKERS | 1 | Number of workers |
| PROMETHEUS_MULTIPROC_DIR | Auto-generated | Multiprocess directory |
| REDIS_ENABLED | false | Enable Redis storage (no files created) |
| REDIS_HOST | 127.0.0.1 | Redis server hostname |
| REDIS_PORT | 6379 | Redis server port |
| REDIS_DB | 0 | Redis database number |
| REDIS_PASSWORD | (none) | Redis password (optional) |
| REDIS_KEY_PREFIX | gunicorn | Prefix for Redis keys |
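For example, overriding the metrics port for a single run (config and app names as in the Quick Start):

```bash
PROMETHEUS_METRICS_PORT=9200 gunicorn -c gunicorn.conf.py app:app
```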
- Local Development: See Deployment Guide
- Docker: See Docker Deployment
- Kubernetes: See Kubernetes Deployment
The exporter supports two main Kubernetes deployment patterns:
Deploy the exporter as a sidecar container within the same Kubernetes pod for isolated monitoring:
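A trimmed sketch of the pattern (the application image is a placeholder; see k8s/sidecar-deployment.yaml for the complete manifest):

```yaml
# Trimmed sketch; see k8s/sidecar-deployment.yaml for the complete manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-with-exporter
spec:
  replicas: 1
  selector:
    matchLabels: {app: demo}
  template:
    metadata:
      labels: {app: demo}
    spec:
      containers:
        - name: app            # your Gunicorn application (placeholder image)
          image: your-registry/your-app:latest
        - name: exporter       # metrics sidecar in the same pod
          image: princekrroshan01/gunicorn-prometheus-exporter:latest
          ports:
            - containerPort: 9091
          env:
            - name: REDIS_ENABLED
              value: "true"
```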
Benefits:
- Isolation: Metrics collection separate from application logic
- Resource Management: Independent resource limits
- Security: Reduced attack surface
- Maintenance: Update monitoring independently
Deploy the exporter as a DaemonSet for cluster-wide infrastructure monitoring:
Benefits:
- Cluster Coverage: One pod per node for complete cluster monitoring
- Infrastructure Monitoring: Node-level application insights
- Automatic Scaling: Scales automatically with cluster size
- Host Network Access: Direct access to node-level services
| Aspect | Sidecar | DaemonSet |
|---|---|---|
| Use Case | Application-specific monitoring | Cluster-wide infrastructure monitoring |
| Scaling | Manual replica scaling | Automatic (one per node) |
| Network | ClusterIP services | Host network access |
| Coverage | Specific applications | All applications on all nodes |
| Resource | Shared across pods | Dedicated per node |
| Best For | Production applications | Infrastructure monitoring, development environments |
| Manifest Location | k8s/sidecar-deployment.yaml | k8s/sidecar-daemonset.yaml |
Find complete Kubernetes manifests in the k8s/ directory:
- Sidecar Deployment: k8s/sidecar-deployment.yaml
- DaemonSet Deployment: k8s/sidecar-daemonset.yaml
- Services: k8s/daemonset-service.yaml, k8s/daemonset-metrics-service.yaml
- Network Policies: k8s/daemonset-netpol.yaml
- Complete Setup: See k8s/README.md for full deployment guide
We're actively testing and will add support for:
- Helm Charts - Kubernetes package management
- Terraform - Infrastructure as Code
- Ansible - Configuration management
- AWS ECS/Fargate - Container orchestration
- Google Cloud Run - Serverless containers
- Azure Container Instances - Managed containers
See the Deployment Guide for complete deployment options and configurations.
This project follows the Test Pyramid with comprehensive testing at all levels:
Test Coverage:
- ✅ Unit tests (tests/) - pytest-based function testing
- ✅ Integration tests (integration/) - Component integration
- ✅ E2E tests (e2e/) - Docker + Kubernetes deployment
- ✅ Redis integration and storage
- ✅ Multi-worker Gunicorn setup
- ✅ All metric types (counters, gauges, histograms)
- ✅ Request processing and metrics capture
- ✅ Signal handling and graceful shutdown
- ✅ CI/CD automation
See e2e/README.md for detailed E2E test documentation.
Contributions are welcome! Please see our contributing guide for details.
Current Issues: Check our GitHub Issues for known issues and feature requests.
This project is licensed under the MIT License - see the LICENSE file for details.
Production recommendation: All Docker/Kubernetes examples ship with REDIS_ENABLED=true. Redis-backed storage is the supported default for any multi-worker or multi-pod deployment. Only disable Redis when running a single Gunicorn worker for local demos.
See Docker README and Kubernetes Guide for deployment details.