Prometheus, Grafana & OpenTelemetry — A Complete Guide for Backend Engineers (2025)

Monitoring • Metrics • Data Collection • Kubernetes • Architecture

Modern cloud-native systems must be observable, not just monitored. Observability is powered by:

✅ Metrics
✅ Logs
✅ Traces

This guide explains how Prometheus, Grafana, and OpenTelemetry work — especially in Kubernetes — and how exactly data is collected.

✅ 1. What is Prometheus?

Prometheus is an open-source monitoring system designed for pull-based metric collection.

It provides:

✅ Time-Series Database (TSDB)
✅ Scraping engine
✅ Query engine (PromQL)
✅ Alert rule evaluation (notifications are routed by Alertmanager, a separate component)

✅ 1.1 What Prometheus Collects

Prometheus collects:

Infrastructure metrics:

CPU
RAM
Disk IO
Network

Kubernetes metrics:

Pod state
Deployment health
Autoscaler (HPA) status
Node health

Application metrics:

HTTP latency
Request count
Error rate
DB query time
Custom counters/gauges
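
To make the last item concrete, here is a minimal sketch of the two simplest metric types, using only the Python standard library. The `Counter` and `Gauge` classes below are hypothetical stand-ins written for this guide, not any library's API; real client libraries ship their own, more complete versions.

```python
import threading


class Counter:
    """A value that only goes up, e.g. total requests served."""

    def __init__(self) -> None:
        self._value = 0.0
        self._lock = threading.Lock()  # handlers may run on several threads

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("a counter can only increase")
        with self._lock:
            self._value += amount

    @property
    def value(self) -> float:
        with self._lock:
            return self._value


class Gauge:
    """A value that can go up and down, e.g. requests currently in flight."""

    def __init__(self) -> None:
        self._value = 0.0
        self._lock = threading.Lock()

    def set(self, value: float) -> None:
        with self._lock:
            self._value = value

    def inc(self, amount: float = 1.0) -> None:
        with self._lock:
            self._value += amount

    def dec(self, amount: float = 1.0) -> None:
        with self._lock:
            self._value -= amount

    @property
    def value(self) -> float:
        with self._lock:
            return self._value
```

A request counter would call `inc()` once per request, while an in-flight gauge would call `inc()` when a request starts and `dec()` when it finishes.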

✅ 1.2 Prometheus Architecture


                                      ┌──────────────────────────┐
                                      │        Grafana           │
                                      │   Dashboards & Alerts    │
                                      └───────────▲──────────────┘
                                                  │
                                        PromQL Queries (read)
                                                  │
                   ┌──────────────────────────────┴─────────────────────────────┐
                   │                        Prometheus                          │
                   │   ┌────────────────┬───────────────────┬─────────────────┐ │
                   │   │  Scraping      │   TSDB (Storage)  │ Alerting        │ │
                   │   │ Engine (pull)  │   Time-Series DB  │ Alertmanager    │ │
                   │   └───────┬────────┴───────────────────┴─────────────────┘ │
                   └───────────┼────────────────────────────────────────────────┘
                               │
    ┌──────────────────────────┼───────────────────────────────────────────┐
    │                          │                                           │
    ▼                          ▼                                           ▼
┌───────────┐         ┌───────────────────┐                      ┌──────────────────┐
│ App / API │         │ Node Exporter     │                      │ kube-state-metrics│
│  /metrics │         │ (CPU, RAM, Disk)  │                      │ (K8s objects)    │
└───────────┘         └───────────────────┘                      └──────────────────┘
      ▲
      │
      └────── Application exposes metrics (Prometheus format)

✅ How exactly does data collection happen? (MOST IMPORTANT SECTION)

Prometheus is pull-based, meaning:

✅ Prometheus DOES NOT read logs or intercept requests.

✅ Prometheus DOES NOT collect metrics “when you hit an API endpoint”.

Instead:

✅ Your application exposes metrics at a special URL (like `/metrics` or `/actuator/prometheus`)

✅ Prometheus periodically calls (scrapes) that endpoint

✅ The endpoint returns current counters, histograms, gauges

✅ Prometheus stores them in its time-series database
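
The application side of this contract can be sketched with the Python standard library alone. This is an illustration under assumptions, not how any particular framework does it: the metric name `hello_requests_total`, the `handle` helper, and port 8080 are all made up for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple

# Hypothetical in-memory state; a real app would use a metrics library.
hello_requests_total = 0


def render_metrics() -> str:
    """Render the current value in the Prometheus text exposition format."""
    return (
        "# TYPE hello_requests_total counter\n"
        f"hello_requests_total {hello_requests_total}\n"
    )


def handle(path: str) -> Tuple[int, str, str]:
    """Return (status, content type, body) for a GET on the given path."""
    global hello_requests_total
    if path == "/hello":
        hello_requests_total += 1  # a business request only updates memory
        return 200, "text/plain; charset=utf-8", "hello\n"
    if path == "/metrics":
        # A scrape just reports whatever the values are right now.
        return 200, "text/plain; version=0.0.4; charset=utf-8", render_metrics()
    return 404, "text/plain; charset=utf-8", "not found\n"


class AppHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, content_type, body = handle(self.path)
        data = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8080) -> None:
    """Serve /hello and /metrics until interrupted (port 8080 is an assumption)."""
    HTTPServer(("0.0.0.0", port), AppHandler).serve_forever()
```

Calling `serve()` and then requesting `/hello` a few times followed by `/metrics` shows the key point: nothing is sent anywhere when `/hello` is hit, and the counter only becomes visible when something asks for `/metrics`.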

✅ Example: Step-by-Step Flow

You hit your API:


GET /hello

Inside your Spring Boot / Go / Python app:

✅ A counter metric increases internally:


http_server_requests_seconds_count{status="200",method="GET",uri="/hello"} += 1
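
As a rough sketch of what happens behind that increment, the snippet below times a handler and updates a count and a running sum per label set. It is conceptually similar to what an instrumentation library does for you, but `RequestTimer` and `timed` are hypothetical names for this guide, and the sketch is not thread-safe.

```python
import time
from collections import defaultdict
from typing import Callable, DefaultDict, Tuple

LabelKey = Tuple[str, str, str]  # (method, uri, status)


class RequestTimer:
    """Tracks request count and total duration for each label set."""

    def __init__(self) -> None:
        self.count: DefaultDict[LabelKey, int] = defaultdict(int)
        self.sum_seconds: DefaultDict[LabelKey, float] = defaultdict(float)

    def observe(self, method: str, uri: str, status: str, seconds: float) -> None:
        key = (method, uri, status)
        self.count[key] += 1              # would be exposed as the *_count series
        self.sum_seconds[key] += seconds  # would be exposed as the *_sum series


def timed(timer: RequestTimer, method: str, uri: str,
          handler: Callable[[], str]) -> str:
    """Run a handler that returns an HTTP status string and record its duration."""
    start = time.perf_counter()
    status = handler()
    timer.observe(method, uri, status, time.perf_counter() - start)
    return status
```

For example, `timed(timer, "GET", "/hello", lambda: "200")` bumps the count for `("GET", "/hello", "200")` by one and adds the elapsed time to its sum, all in process memory.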

But this does not go to Prometheus yet.

🚀 The data reaches Prometheus ONLY when Prometheus scrapes your metrics endpoint (`/metrics`, or `/actuator/prometheus` in Spring Boot).

Scraping flow:


Prometheus → GET http://yourapp:8080/actuator/prometheus

Your app responds with:


http_server_requests_seconds_count{status="200",method="GET",uri="/hello"} 42
http_server_requests_seconds_sum{status="200",method="GET",uri="/hello"} 3.9

Prometheus then:

✅ Parses the result
✅ Stores each value as a timestamped sample in its TSDB
✅ Serves the stored samples to Grafana later, via PromQL queries
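
The parse-and-store step can be sketched in a few lines of Python. This is a deliberately simplified illustration: the regular expressions ignore escaped quotes and optional timestamps, and `ToyTSDB` is a hypothetical in-memory stand-in, nothing like the real on-disk storage engine.

```python
import re
from typing import Dict, List, Tuple

SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')

# A series is identified by its metric name plus its sorted label pairs.
SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def parse_exposition(text: str) -> Dict[SeriesKey, float]:
    """Parse simple 'name{labels} value' lines from a scrape response."""
    samples: Dict[SeriesKey, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # unsupported syntax in this sketch
        name, raw_labels, value = match.groups()
        labels = tuple(sorted(LABEL_RE.findall(raw_labels or "")))
        samples[(name, labels)] = float(value)
    return samples


class ToyTSDB:
    """Keeps a list of (timestamp, value) samples for every series."""

    def __init__(self) -> None:
        self.series: Dict[SeriesKey, List[Tuple[float, float]]] = {}

    def append(self, samples: Dict[SeriesKey, float], timestamp: float) -> None:
        for key, value in samples.items():
            self.series.setdefault(key, []).append((timestamp, value))
```

Each scrape appends one new sample per series, stamped with the scrape time. That is why a counter that reads 42 on two consecutive scrapes still produces two data points.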

✅ Visual Flow


User Request
   │
   ▼
Application Endpoint (/hello)
   │
   ├── app updates counters/gauges/histograms in memory
   ▼
/metrics endpoint (generated by Micrometer / Prometheus SDK)
   │
   ▼
Prometheus (pulls data every scrape_interval, e.g. 15s)
   │
   ▼
Time-Series Storage (TSDB)
   │
   ▼
Grafana (visualizes using PromQL)
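
Dashboards at the end of this flow rarely plot raw counters; they plot rates derived from them. The arithmetic can be sketched as below. This is a simplified illustration with hypothetical helper names: PromQL's real `rate()` works over a window of many samples and also extrapolates, which this sketch does not attempt.

```python
from typing import Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def increase(older: Sample, newer: Sample) -> float:
    """Growth of a counter between two scrapes, tolerating one restart."""
    if newer[1] < older[1]:
        # The counter went backwards, so the app restarted and began again at 0.
        return newer[1]
    return newer[1] - older[1]


def per_second_rate(older: Sample, newer: Sample) -> float:
    """Average per-second growth of a counter between two scrapes."""
    elapsed = newer[0] - older[0]
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    return increase(older, newer) / elapsed


def average_latency(count_old: Sample, count_new: Sample,
                    sum_old: Sample, sum_new: Sample) -> float:
    """Mean request duration in the interval: growth of _sum over growth of _count."""
    requests = increase(count_old, count_new)
    if requests == 0:
        return 0.0
    return increase(sum_old, sum_new) / requests
```

If `_count` goes from 42 to 72 between two scrapes 15 seconds apart, the request rate is 2 per second; if `_sum` goes from 3.9 to 6.9 over the same interval, the average latency is about 0.1 seconds. The PromQL counterpart is `rate(http_server_requests_seconds_sum[5m]) / rate(http_server_requests_seconds_count[5m])`.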

✅ 2. What is Grafana?

Grafana is a:

✅ Visualization tool
✅ Dashboarding system
✅ Alerting platform
✅ Query interface for Prometheus, Loki, Tempo

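Grafana does not store metrics itself.
It queries its configured data sources, such as Prometheus (metrics), Loki (logs), and Tempo (traces), whenever a dashboard is rendered.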
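
When a panel backed by Prometheus refreshes, the request that goes out is an ordinary HTTP call to Prometheus's query API. The sketch below only builds such a URL for the `/api/v1/query_range` endpoint. The base URL `http://prometheus:9090` and the function name are assumptions for illustration, and Grafana's actual request may carry additional parameters.

```python
from urllib.parse import urlencode


def build_range_query_url(base_url: str, promql: str,
                          start: int, end: int, step_seconds: int) -> str:
    """Build a Prometheus range-query URL for the given PromQL expression."""
    params = urlencode({
        "query": promql,       # the PromQL expression to evaluate
        "start": start,        # range start, unix seconds
        "end": end,            # range end, unix seconds
        "step": step_seconds,  # resolution between returned points
    })
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


url = build_range_query_url(
    "http://prometheus:9090",  # hypothetical address of the Prometheus server
    "rate(http_server_requests_seconds_count[5m])",
    start=1700000000,
    end=1700003600,
    step_seconds=15,
)
```

The response is JSON containing one list of timestamped values per matching series, which the panel then draws.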

✅ 3. What is OpenTelemetry (OTel)?

OpenTelemetry is a vendor-neutral standard (APIs, SDKs, the OTLP protocol, and the Collector) for generating and exporting:

✅ Metrics
✅ Logs
✅ Traces

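Prometheus = metrics collection, storage, and querying
Grafana = visualization
OpenTelemetry = instrumentation and collection pipeline for all three signals (it does not store or visualize data)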

✅ 4. How It All Works in Kubernetes

Using Kubernetes service discovery (`kubernetes_sd_configs`), Prometheus auto-discovers:

✅ pods
✅ services
✅ endpoints
✅ nodes (kubelet)

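Each application exposes `/metrics`
Prometheus scrapes them and evaluates alert rules
Grafana visualizes the stored metrics
Alertmanager deduplicates, groups, and routes the alerts that Prometheus fires
The OTel Collector receives metrics, logs, and traces and forwards them to downstream backends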
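
To give a feel for how discovered pods become scrape targets, here is a small sketch of the widely used `prometheus.io/*` annotation convention. Note the assumptions: these annotations are not built into Prometheus (they only work when the scrape configuration's relabeling rules honor them), the pod dictionaries are a simplified made-up shape rather than the Kubernetes API, and the default port 8080 is chosen only for this example.

```python
from typing import Dict, List


def scrape_targets(pods: List[Dict]) -> List[str]:
    """Turn opted-in pods into scrape URLs, following the annotation convention."""
    targets = []
    for pod in pods:
        annotations = pod.get("annotations", {})
        if annotations.get("prometheus.io/scrape") != "true":
            continue  # pod has not opted in to scraping
        port = annotations.get("prometheus.io/port", "8080")  # assumed default
        path = annotations.get("prometheus.io/path", "/metrics")
        targets.append(f"http://{pod['ip']}:{port}{path}")
    return targets


# Hypothetical pods, as service discovery might report them.
pods = [
    {"ip": "10.0.0.5", "annotations": {
        "prometheus.io/scrape": "true",
        "prometheus.io/port": "8080",
        "prometheus.io/path": "/actuator/prometheus",
    }},
    {"ip": "10.0.0.6", "annotations": {"prometheus.io/scrape": "true"}},
    {"ip": "10.0.0.7", "annotations": {}},
]
targets = scrape_targets(pods)
```

Because the target list is rebuilt from the cluster's current state, new pods start being scraped and deleted pods drop out without anyone editing a static list of addresses.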

✅ Summary


Prometheus collects metrics USING SCRAPING (pull model)
Applications expose metrics at /metrics
Prometheus stores metrics in TSDB
Grafana visualizes the metrics
OpenTelemetry provides vendor-neutral instrumentation and collection for metrics, logs & traces
In Kubernetes, exporters expose cluster metrics and service discovery finds the scrape targets

The Backend Engineer’s Journal