Monitoring Python Backends with Prometheus and Grafana

In the world of backend development, especially when working with Python applications, effective monitoring is crucial for maintaining performance, reliability, and user satisfaction. Two powerful tools that have become industry standards for monitoring are Prometheus and Grafana. This article explores how to implement these tools to monitor your Python backend services effectively.

Why Monitoring Matters for Python Backends

Before diving into the technical details, let’s understand why monitoring is essential for Python backend services:

  • Performance Optimization: Identify bottlenecks and optimize resource usage
  • Proactive Issue Detection: Catch problems before they affect users
  • Capacity Planning: Make data-driven decisions about scaling
  • Business Insights: Understand usage patterns and user behavior
  • SLA Compliance: Ensure your services meet agreed-upon service levels

Python backends, whether built with Flask, Django, FastAPI, or other frameworks, benefit significantly from proper monitoring as they scale and handle increasing loads.

Understanding Prometheus and Grafana

Prometheus: The Data Collector

Prometheus is an open-source monitoring and alerting toolkit that excels at collecting and storing time-series data. Key features include:

  • Pull-based metrics collection model
  • Flexible query language (PromQL)
  • No reliance on distributed storage
  • Built-in alerting capabilities
  • Service discovery support

Grafana: The Visualization Layer

Grafana is an open-source analytics and visualization platform that pairs perfectly with Prometheus. It provides:

  • Rich, interactive dashboards
  • Support for multiple data sources
  • Alerting and notification features
  • User management and team collaboration
  • Template variables for dynamic dashboards

Together, these tools create a powerful monitoring stack that can provide deep insights into your Python backend’s health and performance.

Setting Up Prometheus with Python

Installing the Python Client Library

To expose metrics from your Python application, you’ll need the Prometheus client library:

pip install prometheus-client
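
Once installed, the client can serve a /metrics endpoint from a background thread. Here’s a minimal sketch of a standalone exporter (the port and metric name are illustrative):

from prometheus_client import Counter, start_http_server
import time

jobs_processed = Counter('jobs_processed_total', 'Jobs processed by this worker')

if __name__ == '__main__':
    start_http_server(8000)  # Prometheus can now scrape http://localhost:8000/metrics
    while True:              # stand-in for real work
        jobs_processed.inc()
        time.sleep(1)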

Implementing Basic Metrics

Here’s how to implement the four basic types of Prometheus metrics in a Python application:

from prometheus_client import Counter, Gauge, Histogram, Summary

# Counter: Tracks how many times something has happened
api_requests_total = Counter('api_requests_total', 'Total count of API requests', ['endpoint', 'method', 'http_status'])

# Gauge: Represents a value that can go up and down
active_requests = Gauge('active_requests', 'Number of active requests')

# Histogram: Samples observations and counts them in configurable buckets
request_duration = Histogram('request_duration_seconds', 'Request duration in seconds', 
                            ['endpoint'], buckets=[0.1, 0.5, 1, 2, 5, 10])

# Summary: Similar to a histogram, tracking the count and sum of observations
# (note: the official Python client's Summary does not implement quantiles)
request_latency = Summary('request_latency_seconds', 'Request latency in seconds',
                          ['endpoint'])

Integrating with Flask

Here’s an example of how to integrate Prometheus metrics with a Flask application:

from flask import Flask, request
from prometheus_client import Counter, Gauge, Histogram, make_wsgi_app
from werkzeug.middleware.dispatcher import DispatcherMiddleware
import time

app = Flask(__name__)

# Add prometheus wsgi middleware to route /metrics requests
app.wsgi_app = DispatcherMiddleware(app.wsgi_app, {
    '/metrics': make_wsgi_app()
})

# Define metrics
request_count = Counter(
    'flask_request_count', 'App Request Count',
    ['method', 'endpoint', 'http_status']
)
request_latency = Histogram('flask_request_latency_seconds', 'Request latency',
                           ['method', 'endpoint'])
active_requests = Gauge('flask_active_requests', 'Active requests')

@app.before_request
def before_request():
    request.start_time = time.time()
    active_requests.inc()

@app.after_request
def after_request(response):
    request_latency.labels(
        method=request.method,
        endpoint=request.path
    ).observe(time.time() - request.start_time)

    request_count.labels(
        method=request.method,
        endpoint=request.path,
        http_status=response.status_code
    ).inc()

    active_requests.dec()

    return response

@app.route('/')
def hello_world():
    return 'Hello, World!'

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
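
One production caveat: under a preforking server such as Gunicorn, each worker process keeps its own metric registry, so the /metrics route above would report only a single worker. The client library’s multiprocess mode aggregates across workers. A sketch, assuming the PROMETHEUS_MULTIPROC_DIR environment variable points at an empty, writable directory before the workers start:

from prometheus_client import CollectorRegistry, make_wsgi_app, multiprocess

registry = CollectorRegistry()
# Collects the metric files written by every worker into one scrape response
multiprocess.MultiProcessCollector(registry)

app.wsgi_app = DispatcherMiddleware(app.wsgi_app, {
    '/metrics': make_wsgi_app(registry)
})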

Integrating with Django

For Django applications, you can use the django-prometheus package:

pip install django-prometheus

Then update your Django settings:

# settings.py
INSTALLED_APPS = [
    # ...
    'django_prometheus',
    # ...
]

MIDDLEWARE = [
    'django_prometheus.middleware.PrometheusBeforeMiddleware',
    # ... your other middleware ...
    'django_prometheus.middleware.PrometheusAfterMiddleware',
]

# Add URLs to expose /metrics
# urls.py
from django.urls import include, path

urlpatterns = [
    # ...
    path('', include('django_prometheus.urls')),
]
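
django-prometheus can also instrument the ORM and cache layers by swapping in its wrapper backends. A sketch for PostgreSQL and Redis (the database name is illustrative, and the Redis backend assumes django-redis is installed):

# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django_prometheus.db.backends.postgresql',
        'NAME': 'mydb',
        # ... credentials as usual ...
    }
}

CACHES = {
    'default': {
        'BACKEND': 'django_prometheus.cache.backends.redis.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379/1',
    }
}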

Integrating with FastAPI

Here’s how to set up Prometheus metrics with FastAPI, using the prometheus-fastapi-instrumentator package:

pip install prometheus-fastapi-instrumentator

from fastapi import FastAPI, Request
from prometheus_client import Counter
from prometheus_fastapi_instrumentator import Instrumentator
import time

app = FastAPI()

# Add prometheus instrumentation (exposes /metrics automatically)
Instrumentator().instrument(app).expose(app)

# Additional custom metrics
api_requests = Counter('api_requests_total', 'Total API requests', ['path', 'method'])

@app.middleware("http")
async def add_process_time_header(request: Request, call_next):
    start_time = time.time()
    response = await call_next(request)
    process_time = time.time() - start_time
    response.headers["X-Process-Time"] = str(process_time)

    # Update custom metrics
    api_requests.labels(path=request.url.path, method=request.method).inc()

    return response

@app.get("/")
async def root():
    return {"message": "Hello World"}

Configuring Prometheus Server

Basic Prometheus Configuration

Create a prometheus.yml file for your Prometheus server:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'python_app'
    static_configs:
      - targets: ['python-app:5000']  # Assuming your app exposes metrics on port 5000

  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

Running Prometheus with Docker

You can easily run Prometheus using Docker:

# docker-compose.yml
version: '3'
services:
  prometheus:
    image: prom/prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - '9090:9090'

  python-app:
    build: .
    ports:
      - '5000:5000'
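
The python-app service assumes a Dockerfile in the project root. A minimal sketch for the Flask example above (file names are illustrative):

# Dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]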

Setting Up Grafana

Installing Grafana

Add Grafana to your Docker Compose setup:

# Extended docker-compose.yml
services:
  # ... prometheus and python-app ...

  grafana:
    image: grafana/grafana
    ports:
      - '3000:3000'
    volumes:
      - grafana-storage:/var/lib/grafana
    depends_on:
      - prometheus

volumes:
  grafana-storage:

Configuring Prometheus as a Data Source

Once Grafana is running, follow these steps to add Prometheus as a data source:

  1. Access Grafana at http://localhost:3000 (default login: admin/admin)
  2. Navigate to Configuration > Data Sources
  3. Click “Add data source” and select Prometheus
  4. Set the URL to http://prometheus:9090
  5. Click “Save & Test”

Creating Effective Dashboards

Essential Metrics for Python Backends

Here are key metrics you should monitor for any Python backend:

1. Application Performance

  • Request rate (requests per second)
  • Error rate (errors per second)
  • Request duration (latency)
  • Endpoint-specific metrics

2. Resource Utilization

  • CPU usage
  • Memory usage
  • Garbage collection metrics
  • Thread/worker counts

3. External Dependencies

  • Database query time
  • External API call latency
  • Cache hit/miss ratio
  • Queue size and processing time
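
For the dependency metrics in particular, thin wrappers keep instrumentation out of business logic. A sketch with hypothetical helpers (timed_query and the cache object are illustrative):

from prometheus_client import Counter, Histogram

db_query_seconds = Histogram('db_query_seconds', 'Database query time', ['query_name'])
cache_requests = Counter('cache_requests_total', 'Cache lookups', ['result'])

def timed_query(name, run_query):
    # run_query: a zero-argument callable that executes the query
    with db_query_seconds.labels(query_name=name).time():
        return run_query()

def cache_get(cache, key):
    value = cache.get(key)  # assumes a dict-like cache client
    cache_requests.labels(result='miss' if value is None else 'hit').inc()
    return value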

Sample PromQL Queries

Here are some useful PromQL queries for your dashboards:

# Request rate per endpoint
sum(rate(api_requests_total[1m])) by (endpoint)

# 95th percentile latency
histogram_quantile(0.95, sum(rate(request_duration_seconds_bucket[5m])) by (le, endpoint))

# Error rate
rate(api_requests_total{http_status=~"5.."}[5m])

# Active requests
active_requests

# Process CPU and memory usage (exposed by the Python client's process collector on Linux)
rate(process_cpu_seconds_total[1m])
process_resident_memory_bytes

Sample Dashboard JSON

Here’s a basic dashboard configuration you can import into Grafana:

{
  "annotations": {
    "list": []
  },
  "editable": true,
  "gnetId": null,
  "graphTooltip": 0,
  "id": 1,
  "links": [],
  "panels": [
    {
      "datasource": "Prometheus",
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": true,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              }
            ]
          },
          "unit": "short"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
        "y": 0
      },
      "id": 2,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.0.6",
      "targets": [
        {
          "expr": "sum(rate(api_requests_total[1m])) by (endpoint)",
          "interval": "",
          "legendFormat": "{{endpoint}}",
          "refId": "A"
        }
      ],
      "title": "Request Rate by Endpoint",
      "type": "timeseries"
    },
    {
      "datasource": "Prometheus",
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": true,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              }
            ]
          },
          "unit": "s"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
        "y": 0
      },
      "id": 4,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.0.6",
      "targets": [
        {
          "expr": "histogram_quantile(0.95, sum(rate(request_duration_seconds_bucket[5m])) by (le, endpoint))",
          "interval": "",
          "legendFormat": "{{endpoint}}",
          "refId": "A"
        }
      ],
      "title": "95th Percentile Response Time",
      "type": "timeseries"
    },
    {
      "datasource": "Prometheus",
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": true,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
        "y": 8
      },
      "id": 6,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.0.6",
      "targets": [
        {
          "expr": "sum(active_requests)",
          "interval": "",
          "legendFormat": "Active Requests",
          "refId": "A"
        }
      ],
      "title": "Active Requests",
      "type": "timeseries"
    },
    {
      "datasource": "Prometheus",
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "palette-classic"
          },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 10,
            "gradientMode": "none",
            "hideFrom": {
              "legend": false,
              "tooltip": false,
              "viz": false
            },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": {
              "type": "linear"
            },
            "showPoints": "never",
            "spanNulls": true,
            "stacking": {
              "group": "A",
              "mode": "none"
            },
            "thresholdsStyle": {
              "mode": "off"
            }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              }
            ]
          },
          "unit": "short"
        },
        "overrides": []
      },
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
        "y": 8
      },
      "id": 8,
      "options": {
        "legend": {
          "calcs": [],
          "displayMode": "list",
          "placement": "bottom"
        },
        "tooltip": {
          "mode": "single"
        }
      },
      "pluginVersion": "8.0.6",
      "targets": [
        {
          "expr": "sum(rate(api_requests_total{http_status=~\"5..\"}[5m])) by (endpoint)",
          "interval": "",
          "legendFormat": "{{endpoint}}",
          "refId": "A"
        }
      ],
      "title": "Error Rate by Endpoint",
      "type": "timeseries"
    }
  ],
  "refresh": "5s",
  "schemaVersion": 30,
  "style": "dark",
  "tags": [],
  "templating": {
    "list": []
  },
  "time": {
    "from": "now-15m",
    "to": "now"
  },
  "timepicker": {},
  "timezone": "",
  "title": "Python Backend Monitoring",
  "uid": "python-backend",
  "version": 1
}

Setting Up Alerts

Alert Rules in Prometheus

Add alerting rules to your Prometheus configuration:

# prometheus.yml
rule_files:
  - "alert_rules.yml"

# alert_rules.yml
groups:
- name: python_backend_alerts
  rules:
  - alert: HighErrorRate
    expr: sum(rate(api_requests_total{http_status=~"5.."}[5m])) / sum(rate(api_requests_total[5m])) > 0.05
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "High error rate detected"
      description: "Error rate is above 5% (current value: {{ $value }})"

  - alert: SlowResponseTime
    expr: histogram_quantile(0.95, sum(rate(request_duration_seconds_bucket[5m])) by (le)) > 2
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Slow response time detected"
      description: "95th percentile response time is above 2 seconds (current value: {{ $value }}s)"

Alerting in Grafana

To set up alerts in Grafana, follow these steps:

  1. Edit a panel in your dashboard
  2. Go to the “Alert” tab
  3. Click “Create Alert”
  4. Configure conditions, e.g., “avg() of query(A,5m,now) is above 0.5”
  5. Set evaluation interval (e.g., “Evaluate every 1m”)
  6. Configure notifications (e.g., email, Slack)
  7. Save the dashboard

Best Practices for Production

Performance Considerations

When implementing monitoring in production Python backends, keep these considerations in mind:

  • Metric Cardinality: Avoid high-cardinality labels (like user IDs) that can overload Prometheus (see the sketch after this list)
  • Collection Frequency: Balance between granularity and performance impact
  • Memory Usage: Monitor the memory usage of your instrumentation to ensure it’s not excessive
  • Prometheus Storage: Plan for adequate storage and retention based on your metrics volume
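
Cardinality problems are easy to create by accident. A sketch of the pattern to avoid and a bounded alternative:

from prometheus_client import Counter

# Avoid: one time series per user — the series count grows without bound
# logins = Counter('logins_total', 'Login attempts', ['user_id'])

# Prefer: labels with a small, fixed set of values
logins = Counter('logins_total', 'Login attempts', ['auth_method'])
logins.labels(auth_method='password').inc()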

Security Best Practices

Secure your monitoring stack with these practices:

  • Authentication: Secure Prometheus and Grafana with proper authentication
  • Network Security: Use network segmentation to restrict access to monitoring endpoints
  • TLS: Enable TLS for all monitoring traffic
  • Sensitive Data: Avoid exposing sensitive data in metrics or labels

Scalability

As your Python backend grows, consider these scalability approaches:

  • Federation: Use Prometheus federation for large-scale deployments
  • Push Gateway: Use the Prometheus Pushgateway for batch jobs or ephemeral services (see the sketch below)
  • Remote Storage: Implement remote storage solutions for long-term metrics retention
  • Hierarchical Monitoring: Implement a hierarchical monitoring architecture for large systems
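
For the batch-job case above, the Python client can push directly to a Pushgateway. A minimal sketch, assuming a gateway reachable at localhost:9091 (the job name is illustrative):

from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()
last_success = Gauge('batch_job_last_success_unixtime',
                     'Unix time of the last successful batch run',
                     registry=registry)
last_success.set_to_current_time()

push_to_gateway('localhost:9091', job='nightly_batch', registry=registry)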

Conclusion

Setting up effective monitoring for Python backends with Prometheus and Grafana provides invaluable insights into your application’s performance and health. By following the steps outlined in this article, you can create a robust monitoring system that helps you maintain high-quality service, quickly troubleshoot issues, and plan for future growth.

Remember that monitoring is not a set-and-forget task but an ongoing process of refinement. As your Python backend evolves, your monitoring needs will change too. Regularly review your metrics, dashboards, and alerts to ensure they continue to provide valuable insights into your application’s behavior.

By investing time in proper monitoring from the beginning, you’ll save countless hours of debugging and firefighting in the future, allowing you to focus on building new features and improving your Python backend services.
