Prometheus, Grafana & OpenTelemetry — A Complete Guide for Backend Engineers (2025)
Monitoring • Metrics • Data Collection • Kubernetes • Architecture
Modern cloud-native systems must be observable, not just monitored. Observability is powered by:
✅ Metrics
✅ Logs
✅ Traces
This guide explains how Prometheus, Grafana, and OpenTelemetry work — especially in Kubernetes — and how exactly data is collected.
✅ 1. What is Prometheus?
Prometheus is an open-source monitoring system designed for pull-based metric collection.
It provides:
✅ Time-Series Database (TSDB)
✅ Scraping engine
✅ Query engine (PromQL)
✅ Alerting (via Alertmanager)
✅ 1.1 What Prometheus Collects
Prometheus collects:
Infrastructure metrics:
- CPU
- RAM
- Disk IO
- Network
Kubernetes metrics:
- Pod state
- Deployment health
- Autoscaling
- Node health
Application metrics:
- HTTP latency
- Request count
- Error rate
- DB query time
- Custom counters/gauges
✅ 1.2 Prometheus Architecture and Data Flow
✅ How exactly does data collection happen? (MOST IMPORTANT SECTION)
Prometheus is pull-based, meaning:
✅ Prometheus DOES NOT read logs or intercept requests.
✅ Prometheus DOES NOT collect metrics “when you hit an API endpoint”.
Instead:
✅ Your application exposes metrics at a special URL (like /metrics or /actuator/prometheus)
✅ Prometheus periodically calls (scrapes) that endpoint
✅ The endpoint returns current counters, histograms, gauges
✅ Prometheus stores them in its time-series database
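The four points above can be sketched from the application's side. This is a minimal, standard-library-only illustration, not how a production app would be instrumented (a real service would normally use an official client library or Micrometer). The `Counter` class, the `http_requests_total` metric name, and the port are assumptions made up for this example.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Counter:
    """Hypothetical in-memory counter; it only ever goes up."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._value = 0
        self._lock = threading.Lock()

    def inc(self):
        with self._lock:
            self._value += 1

    def render(self):
        # Prometheus text exposition format: HELP line, TYPE line, sample line.
        with self._lock:
            value = self._value
        return (
            f"# HELP {self.name} {self.help_text}\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {value}\n"
        )


REQUESTS = Counter("http_requests_total", "Total HTTP requests handled.")


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            # A scrape: return current values. Nothing is pushed anywhere.
            body = REQUESTS.render().encode("utf-8")
            content_type = "text/plain; version=0.0.4; charset=utf-8"
        else:
            # Normal API traffic: only the in-memory counter changes.
            REQUESTS.inc()
            body = b"ok\n"
            content_type = "text/plain; charset=utf-8"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def serve(port=8000):
    """Serve the API and /metrics on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` and requesting any path a few times, then `/metrics`, returns the accumulated count. The app never contacts Prometheus on its own.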
✅ Step-by-Step Example
You send a request to your API.
Inside your Spring Boot / Go / Python app, the instrumentation code runs while that request is handled.
✅ A counter metric increases in the app's memory.
But this does not go to Prometheus yet.
🚀 Data is exported to Prometheus ONLY when Prometheus scrapes your /metrics endpoint.
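Scraping flow: on every scrape interval, Prometheus sends an HTTP GET request to your app's /metrics endpoint.
Your app responds with the current value of every metric in the plain-text Prometheus exposition format.
Prometheus then:
✅ Parses the result
✅ Stores each value as a timestamped sample in the TSDB
✅ Serves the samples to Grafana later through PromQL queries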
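The scraper's side of the same flow can be sketched as follows. This is a toy model, not Prometheus's actual parser or storage engine: `parse_exposition`, `ToyTSDB`, and `scrape_once` are hypothetical names, and the parser assumes simple lines with no spaces inside label values and no trailing timestamps.

```python
import time


def parse_exposition(text):
    """Parse 'series value' lines, skipping comments and blank lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, value = line.rsplit(" ", 1)
        samples[series] = float(value)
    return samples


class ToyTSDB:
    """Maps each series to a list of (timestamp, value) samples."""

    def __init__(self):
        self.series = {}

    def append(self, samples, timestamp):
        for name, value in samples.items():
            self.series.setdefault(name, []).append((timestamp, value))


def scrape_once(fetch, tsdb, now=time.time):
    """Run one scrape; `fetch` is any callable returning the /metrics body."""
    tsdb.append(parse_exposition(fetch()), now())
```

If the counter reads 3 at one scrape and 7 at the next, only those two samples are stored. The individual increments in between are never seen by the scraper, which is why counters are usually graphed as rates over time.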
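✅ Visual Flow
API request → app increments a counter in memory → Prometheus scrapes /metrics → TSDB stores the sample → Grafana queries it with PromQL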
✅ 2. What is Grafana?
Grafana is a:
✅ Visualization tool
✅ Dashboarding system
✅ Alerting platform
✅ Query interface for Prometheus, Loki, Tempo
Grafana does not store metrics.
It queries them from data sources such as Prometheus each time a dashboard panel loads or refreshes.
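To make the read path concrete, here is a rough sketch of the kind of request a dashboard panel leads to. It uses the Prometheus HTTP API's instant-query endpoint (`/api/v1/query`) and its JSON response shape. The base URL, the PromQL expression, and the helper names are assumptions for illustration; this is not Grafana's actual code.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def extract_values(response_text):
    """Return (labels, value) pairs from an instant-vector response."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error')}")
    # Each result holds a label set and a [timestamp, "value"] pair.
    return [
        (item["metric"], float(item["value"][1]))
        for item in payload["data"]["result"]
    ]
```

A panel showing request rate would send something like `build_query_url("http://prometheus:9090", "rate(http_requests_total[5m])")` and plot the values that come back. The data itself stays in Prometheus.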
✅ 3. What is OpenTelemetry (OTel)?
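OpenTelemetry is a vendor-neutral standard (APIs, SDKs, the Collector, and the OTLP protocol) for generating and exporting:
✅ Metrics
✅ Logs
✅ Traces
Prometheus = metrics only
Grafana = visualization
OpenTelemetry = instrumentation and transport for all three signals (it does not store data, so it still needs backends such as Prometheus)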
✅ 4. How It All Works in Kubernetes
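Using Kubernetes service discovery, Prometheus auto-discovers:
✅ pods
✅ services
✅ endpoints
✅ nodes (kubelets)
Each application exposes /metrics
Prometheus scrapes the discovered targets
Grafana visualizes the stored metrics
Alertmanager routes the alerts that Prometheus rules fire
OTel Collector receives metrics/logs/traces and sends them downstream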
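The discovery step can be pictured as turning a list of pods into a list of scrape URLs. The sketch below assumes the common `prometheus.io/scrape`, `prometheus.io/port`, and `prometheus.io/path` pod annotations. These are a convention implemented through relabeling rules in many setups, not something Prometheus honors by default. The pod dictionaries, the function name, and the fallback port 8080 are all hypothetical.

```python
def scrape_targets(pods):
    """Turn discovered pods into scrape URLs, keeping only opted-in pods."""
    targets = []
    for pod in pods:
        annotations = pod.get("annotations", {})
        if annotations.get("prometheus.io/scrape") != "true":
            continue  # pod did not opt in to scraping
        # Assumed defaults for this sketch when the annotations are absent.
        port = annotations.get("prometheus.io/port", "8080")
        path = annotations.get("prometheus.io/path", "/metrics")
        targets.append(f"http://{pod['ip']}:{port}{path}")
    return targets
```

Because the target list is rebuilt as pods come and go, new replicas are scraped without anyone editing a static host list.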
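✅ Summary
Prometheus scrapes and stores metrics, Grafana queries and visualizes them, Alertmanager handles alerts, and OpenTelemetry instruments applications and ships metrics, logs, and traces downstream.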