Prometheus, Grafana & OpenTelemetry — A Complete Guide for Backend Engineers (2025)

Monitoring • Metrics • Data Collection • Kubernetes • Architecture

Modern cloud-native systems must be observable, not just monitored. Observability is powered by:

Metrics
Logs
Traces

This guide explains how Prometheus, Grafana, and OpenTelemetry work, especially in Kubernetes, and exactly how data is collected.


✅ 1. What is Prometheus?

Prometheus is an open-source monitoring system designed for pull-based metric collection.

It provides:

✅ Time-Series Database (TSDB)
✅ Scraping engine
✅ Query engine (PromQL)
✅ Alerting (via Alertmanager)


✅ 1.1 What Prometheus Collects

Prometheus collects:

Infrastructure metrics:

  • CPU

  • RAM

  • Disk IO

  • Network

Kubernetes metrics:

  • Pod state

  • Deployment health

  • Autoscaling

  • Node health

Application metrics:

  • HTTP latency

  • Request count

  • Error rate

  • DB query time

  • Custom counters/gauges


✅ 1.2 Prometheus Architecture and Data Flow

                    ┌──────────────────────────┐
                    │         Grafana          │
                    │   Dashboards & Alerts    │
                    └───────────▲──────────────┘
                                │ PromQL Queries (read)
                                │
 ┌──────────────────────────────┴─────────────────────────────┐
 │                         Prometheus                         │
 │  ┌────────────────┬───────────────────┬─────────────────┐  │
 │  │ Scraping       │ TSDB (Storage)    │ Alerting        │  │
 │  │ Engine (pull)  │ Time-Series DB    │ Alertmanager    │  │
 │  └───────┬────────┴───────────────────┴─────────────────┘  │
 └──────────┼─────────────────────────────────────────────────┘
            │
      ┌─────┴──────────────┬────────────────────────┐
      ▼                    ▼                        ▼
┌───────────┐    ┌───────────────────┐    ┌────────────────────┐
│ App / API │    │ Node Exporter     │    │ kube-state-metrics │
│ /metrics  │    │ (CPU, RAM, Disk)  │    │ (K8s objects)      │
└───────────┘    └───────────────────┘    └────────────────────┘
      ▲
      │
      └────── Application exposes metrics (Prometheus format)

How exactly does data collection happen? (MOST IMPORTANT SECTION)

Prometheus is pull-based, meaning:

✅ Prometheus DOES NOT read logs or intercept requests.

✅ Prometheus DOES NOT collect metrics “when you hit an API endpoint”.

Instead:

✅ Your application exposes metrics at a special URL (like /metrics or /actuator/prometheus)

✅ Prometheus periodically calls (scrapes) that endpoint

✅ The endpoint returns current counters, histograms, gauges

✅ Prometheus stores them in its time-series database


Example: Step-by-Step Flow

You hit your API:

GET /hello

Inside your Spring Boot / Go / Python app:

✅ A counter metric increases internally:

http_server_requests_seconds_count{status="200",method="GET",uri="/hello"} += 1

But this does not go to Prometheus yet.

🚀 Data is exported to Prometheus ONLY when Prometheus scrapes your /metrics endpoint.

Scraping flow:

Prometheus → GET http://yourapp:8080/actuator/prometheus

Your app responds with:

http_server_requests_seconds_count{method="GET",uri="/hello"} 42
http_server_requests_seconds_sum{method="GET",uri="/hello"} 3.9
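
To make the application side concrete, here is a minimal sketch in Python using only the standard library. It is not how Micrometer or the official Prometheus client libraries are implemented; the Counter class, the hello_requests_total metric name, and port 8080 are assumptions chosen for illustration. It shows the two halves described above: handling /hello only increments a number in memory, and /metrics renders the current values in the Prometheus text format when something asks for them.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Counter:
    """A monotonically increasing counter, kept in memory per label set."""

    def __init__(self, name):
        self.name = name
        self._values = {}  # sorted tuple of (label, value) pairs -> count
        self._lock = threading.Lock()

    def inc(self, **labels):
        key = tuple(sorted(labels.items()))
        with self._lock:
            self._values[key] = self._values.get(key, 0) + 1

    def render(self):
        """Return the counter in the Prometheus text exposition format."""
        lines = [f"# TYPE {self.name} counter"]
        with self._lock:
            for key, value in sorted(self._values.items()):
                label_text = ",".join(f'{k}="{v}"' for k, v in key)
                lines.append(f"{self.name}{{{label_text}}} {value}")
        return "\n".join(lines) + "\n"


# Hypothetical metric name, used only for this illustration.
REQUESTS = Counter("hello_requests_total")


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            # Nothing is pushed anywhere: values are rendered on demand.
            body = REQUESTS.render().encode("utf-8")
            content_type = "text/plain; version=0.0.4; charset=utf-8"
        elif self.path == "/hello":
            # Serving a request only updates the in-memory counter.
            REQUESTS.inc(method="GET", uri="/hello", status="200")
            body = b"hello\n"
            content_type = "text/plain; charset=utf-8"
        else:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the demo app until interrupted (port is an assumption)."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Calling serve() starts the app. Hitting /hello three times and then fetching /metrics returns a line such as hello_requests_total{method="GET",status="200",uri="/hello"} 3. In a real service you would use a maintained client library rather than hand-rolling this.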

Prometheus then:

✅ Parses the response
✅ Stores the numeric values in the TSDB, timestamped with the scrape time

Grafana reads them later by querying Prometheus.
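
The parse-and-store step can be sketched roughly as follows. This is a simplified model, not Prometheus's actual parser or storage engine: it ignores optional sample timestamps and escaped characters in label values, and the plain dict standing in for the TSDB is an assumption made for illustration. The function names parse_exposition and store_scrape are hypothetical.

```python
import re

# Matches sample lines such as: name{label="value",other="x"} 42
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Yield (metric_name, labels, value) for each sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # a real parser would report this as an error
        name, label_text, value = match.groups()
        labels = dict(LABEL_RE.findall(label_text or ""))
        yield name, labels, float(value)


def store_scrape(tsdb, text, scraped_at):
    """Append one (timestamp, value) point per series found in a scrape."""
    for name, labels, value in parse_exposition(text):
        # A series is identified by its metric name plus its full label set.
        series_key = (name, tuple(sorted(labels.items())))
        tsdb.setdefault(series_key, []).append((scraped_at, value))
```

Each scrape adds one point to every series it contains, which is why a counter that has not changed between two scrapes still produces two stored samples with the same value.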


✅ Visual Flow

User Request
     │
     ▼
Application Endpoint (/hello)
     │
     ├── app updates counters/gauges/histograms in memory
     ▼
/metrics endpoint (generated by Micrometer / Prometheus SDK)
     │
     ▼
Prometheus (pulls data every X seconds)
     │
     ▼
Time-Series Storage (TSDB)
     │
     ▼
Grafana (visualizes using PromQL)


✅ 2. What is Grafana?

Grafana is a:

✅ Visualization tool
✅ Dashboarding system
✅ Alerting platform
✅ Query interface for Prometheus, Loki, Tempo

Grafana does not store metrics itself.
It queries them from data sources such as Prometheus.
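
As a rough illustration of what "reading from Prometheus" means, the sketch below sends a PromQL instant query to the Prometheus HTTP API (/api/v1/query) and unpacks the JSON response. It is not Grafana's code; the helper names and the base URL http://prometheus:9090 mentioned afterwards are assumptions.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def extract_vector(payload):
    """Turn an instant-vector response into a list of (labels, value) pairs."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample's "value" is [unix_timestamp, "value as a string"].
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def run_query(base_url, promql, timeout=5.0):
    """Send the query over HTTP and return the parsed samples."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as response:
        return extract_vector(json.load(response))
```

For example, run_query("http://prometheus:9090", "rate(http_server_requests_seconds_count[5m])") would return one (labels, value) pair per matching series, assuming a Prometheus server is reachable at that hypothetical address.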


✅ 3. What is OpenTelemetry (OTel)?

OpenTelemetry is a vendor-neutral standard (APIs, SDKs, and the Collector) for generating and exporting:

✅ Metrics
✅ Logs
✅ Traces

Prometheus = metrics only
Grafana = visualization
OpenTelemetry = full telemetry pipeline


✅ 4. How It All Works in Kubernetes

Prometheus auto-discovers:

✅ pods
✅ services
✅ endpoints
✅ nodes (kubelets)

Each application exposes /metrics.
Prometheus scrapes those endpoints.
Grafana visualizes the stored metrics.
Alertmanager routes the alerts that Prometheus fires.
The OTel Collector sends metrics, logs, and traces downstream.
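
To give a feel for the discovery step, here is a toy model in Python. In a real cluster this is done by Prometheus itself through kubernetes_sd_configs and relabeling rules, not by application code. The pod dictionaries are a hypothetical shape, the prometheus.io/* annotations are a widely used convention rather than something built into Prometheus, and the default port of 8080 is an assumption.

```python
def discover_targets(pods):
    """Return scrape URLs for running pods that opted in via annotations."""
    targets = []
    for pod in pods:
        annotations = pod.get("annotations", {})
        if annotations.get("prometheus.io/scrape") != "true":
            continue  # pod did not opt in to scraping
        if pod.get("phase") != "Running" or not pod.get("ip"):
            continue  # no reachable address yet
        port = annotations.get("prometheus.io/port", "8080")  # assumed default
        path = annotations.get("prometheus.io/path", "/metrics")
        targets.append(f"http://{pod['ip']}:{port}{path}")
    return targets
```

The point of the sketch is that targets are derived from cluster state, so pods that come and go are picked up or dropped without editing a static list of addresses.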


✅ Summary

✅ Prometheus collects metrics by scraping (pull model)
✅ Applications expose metrics at /metrics
✅ Prometheus stores metrics in its TSDB
✅ Grafana visualizes the metrics
✅ OpenTelemetry adds logs & traces
✅ Kubernetes exposes metrics via exporters and service discovery

