Grafana Tempo

Tempo is the distributed tracing backend for the stack. It stores traces in object storage and needs no separate index: search runs directly against the columnar vParquet4 block format. Tempo also generates metrics from traces, which links tracing to monitoring.

Role in the Stack

| Function | Details |
|---|---|
| Trace storage | Stores spans in Azure Blob Storage (vParquet4 columnar format) |
| TraceQL engine | Query language for filtering and analyzing traces |
| Metrics generation | Extracts RED metrics, service graphs, and TraceQL metrics from spans |
| Protocol gateway | Accepts traces via OTLP, Jaeger, Zipkin, and OpenCensus protocols |
| MCP server | Exposes trace data to AI assistants via Model Context Protocol |

Deployment — Microservices Mode

| Component | Replicas | CPU | RAM | Storage | Purpose |
|---|---|---|---|---|---|
| Distributor | 1 | 200m | 512Mi | - | Entry point, accepts all trace protocols |
| Ingester | 3 | 500m | 2Gi | 10Gi PV | Buffers spans, writes blocks to object storage |
| Querier | 2 | 200m | 512Mi | - | Retrieves traces from ingesters + storage |
| Query Frontend | 1 | 200m | 512Mi | - | Query optimization, streaming, MCP server |
| Gateway | 1 | - | - | - | Nginx reverse proxy |
| Compactor | 1 | 200m | 2Gi | - | Block compaction, retention enforcement |
| Metrics Generator | 1 | 200m | 512Mi | - | Extracts metrics from spans |
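
To size the namespace, the per-component figures can be added up across replicas. The sketch below assumes the CPU and RAM columns are per-replica values and leaves out the Gateway, which has no CPU or RAM listed. The function name `total_resources` is only illustrative.

```python
# Per-replica figures copied from the deployment table.
# The Gateway row lists no CPU or RAM, so it is left out of the totals.
COMPONENTS = [
    # (name, replicas, cpu_millicores, ram_mib)
    ("Distributor", 1, 200, 512),
    ("Ingester", 3, 500, 2048),
    ("Querier", 2, 200, 512),
    ("Query Frontend", 1, 200, 512),
    ("Compactor", 1, 200, 2048),
    ("Metrics Generator", 1, 200, 512),
]


def total_resources(components):
    """Return (total CPU in millicores, total RAM in MiB) across all replicas."""
    cpu = sum(replicas * cpu_m for _, replicas, cpu_m, _ in components)
    ram = sum(replicas * ram_mib for _, replicas, _, ram_mib in components)
    return cpu, ram


cpu_m, ram_mib = total_resources(COMPONENTS)
print(f"CPU: {cpu_m}m, RAM: {ram_mib}Mi ({ram_mib / 1024:.1f}Gi)")
```

With the values above this comes to 2700m CPU and 10752Mi (10.5Gi) RAM, plus the 3 x 10Gi persistent volumes for the ingesters.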

What Feeds Into Tempo

| Source | Protocol | Path |
|---|---|---|
| Alloy | OTLP gRPC | App OTel SDK → Alloy OTLP receiver → Tempo Distributor |
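
To make the first hop of that path concrete, here is a minimal sketch of what an OTLP export looks like on the wire. The stack itself uses OTLP gRPC through the OTel SDK; this example swaps in OTLP/HTTP with JSON encoding (port 4318) only because it can be built with the standard library. The endpoint host `alloy.monitoring.svc.cluster.local`, the service name, and the span name are hypothetical.

```python
import json
import secrets
import time
import urllib.request

# Assumption: hypothetical in-cluster address of an OTLP/HTTP receiver.
OTLP_HTTP_ENDPOINT = "http://alloy.monitoring.svc.cluster.local:4318/v1/traces"


def build_otlp_payload(service_name, span_name, duration_ms):
    """Build a one-span OTLP/HTTP JSON export request body."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    return {
        "resourceSpans": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service_name}}
                ]
            },
            "scopeSpans": [{
                "scope": {"name": "manual-example"},
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 16 random bytes, hex-encoded
                    "spanId": secrets.token_hex(8),    # 8 random bytes, hex-encoded
                    "name": span_name,
                    "kind": 2,  # SPAN_KIND_SERVER
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }]
    }


def build_request(payload, endpoint=OTLP_HTTP_ENDPOINT):
    """Wrap the payload in an HTTP POST; nothing is sent until urlopen is called."""
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_otlp_payload("checkout", "POST /orders", duration_ms=42)
request = build_request(payload)
# urllib.request.urlopen(request)  # uncomment inside the cluster to actually send
```

In practice the OTel SDK builds and batches these requests; the sketch only shows where `service.name` and the trace and span IDs sit in the payload.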

Supported protocols (for direct ingestion):

  • OTLP (4317/4318)
  • Jaeger (14250/14268/6831/6832)
  • Zipkin (9411)
  • OpenCensus (55678)

Storage

  • Backend: Azure Blob Storage
  • Container: tempo-traces
  • Format: vParquet4 (columnar, optimized for TraceQL)
  • Block retention: 24 hours
  • Compacted block retention: 1 hour

Metrics Generator — Traces to Metrics

This is one of Tempo’s most powerful features. The metrics generator processes every ingested span and produces three types of derived data:

1. Span Metrics (RED)

Generates Prometheus-compatible metrics from spans:

  • traces_spanmetrics_latency_bucket — duration histogram per service/operation
  • traces_spanmetrics_calls_total — request count per service/operation with status

These are the foundation of the Traces Drilldown Rate/Errors/Duration signals.
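
The two metrics above are enough to derive RED signals with PromQL. The sketch below builds the queries as strings and an instant-query URL for the Prometheus HTTP API. The label names `service` and `status_code`, and the value `STATUS_CODE_ERROR`, are assumed to follow Tempo's default span-metrics labels, so check them against the series actually present. The Prometheus host is the one this page uses for remote write.

```python
from urllib.parse import urlencode

# Host taken from the remote-write URL used in this stack.
PROMETHEUS_URL = (
    "http://prometheus-and-grafana-kub-prometheus.monitoring.svc.cluster.local:9090"
)


def red_queries(service, window="5m"):
    """Return PromQL strings for rate, errors, and p95 duration of one service."""
    sel = f'service="{service}"'  # assumed label name
    calls = "traces_spanmetrics_calls_total"
    latency = "traces_spanmetrics_latency_bucket"
    return {
        "rate": f"sum(rate({calls}{{{sel}}}[{window}]))",
        "errors": (
            f'sum(rate({calls}{{{sel},status_code="STATUS_CODE_ERROR"}}[{window}]))'
        ),
        "duration_p95": (
            f"histogram_quantile(0.95, "
            f"sum by (le) (rate({latency}{{{sel}}}[{window}])))"
        ),
    }


def instant_query_url(promql):
    """Build a Prometheus instant-query URL for the given expression."""
    return f"{PROMETHEUS_URL}/api/v1/query?{urlencode({'query': promql})}"


queries = red_queries("checkout")
for name, promql in queries.items():
    print(name, "->", promql)
```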

2. Service Graphs

Generates service-to-service dependency metrics:

  • Request rate between services
  • Error rate between services
  • Duration between services

This powers the Service Map (node graph) visualization in Grafana.
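
As a rough illustration of the idea (not Tempo's actual implementation), a service graph edge can be derived by pairing a client span with the server span it caused, then counting requests, failures, and time per client/server pair. The `Span` type and the sample spans are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    service: str
    kind: str            # "client" or "server"; other kinds are ignored here
    duration_ms: float
    error: bool = False


def service_graph_edges(spans):
    """Pair each server span with its parent client span and aggregate per edge."""
    by_id = {s.span_id: s for s in spans}
    edges = defaultdict(lambda: {"requests": 0, "failed": 0, "total_ms": 0.0})
    for server in spans:
        if server.kind != "server":
            continue
        client = by_id.get(server.parent_id)
        if client is None or client.kind != "client":
            continue  # root span or no matching client side
        edge = edges[(client.service, server.service)]
        edge["requests"] += 1
        edge["failed"] += int(client.error or server.error)
        edge["total_ms"] += server.duration_ms
    return dict(edges)


spans = [
    Span("a", None, "frontend", "server", 120.0),
    Span("b", "a", "frontend", "client", 80.0),
    Span("c", "b", "checkout", "server", 70.0),
    Span("d", "c", "checkout", "client", 30.0, error=True),
    Span("e", "d", "payments", "server", 25.0, error=True),
]

for (client, server), stats in service_graph_edges(spans).items():
    print(f"{client} -> {server}: {stats}")
```

Each edge corresponds to one arrow in the Service Map: the request count gives the rate, the failed count gives the error rate, and the summed duration gives the latency between the two services.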

3. Local Blocks (TraceQL Metrics)

Generates metrics from TraceQL expressions, which power the breakdown and comparison features in Traces Drilldown. In this deployment, metrics API queries have no duration limit.
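
A TraceQL metrics query can also be issued directly against the query frontend. The sketch below only builds the request URL. It assumes the `/api/metrics/query_range` endpoint with `q`, `start`, `end`, and `step` parameters, and uses the query frontend address from the Grafana datasource settings. The query itself is an example, not one taken from this deployment.

```python
import time
from urllib.parse import urlencode

# Query frontend address used by the Grafana datasource in this stack.
TEMPO_URL = "http://tempo-query-frontend.monitoring.svc.cluster.local:3200"


def traceql_metrics_url(query, start_s, end_s, step="1m"):
    """Build a TraceQL metrics range-query URL (start/end in Unix seconds)."""
    params = {"q": query, "start": start_s, "end": end_s, "step": step}
    return f"{TEMPO_URL}/api/metrics/query_range?{urlencode(params)}"


# Example query: rate of error spans, broken down by service.
query = "{ status = error } | rate() by (resource.service.name)"

now = int(time.time())
url = traceql_metrics_url(query, start_s=now - 3600, end_s=now)
print(url)
```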

All generated metrics are written to Prometheus via remote write at http://prometheus-and-grafana-kub-prometheus.monitoring.svc.cluster.local:9090/api/v1/write, which then forwards them to Mimir for long-term storage.

Integration with Other Components

Traces → Logs (Loki)

  • Links trace spans to logs by injecting a trace ID filter into the log query
  • Tag mapping: service.name (trace attribute) → service (Loki label)
  • Time shift: ±1 hour around the span timestamp
  • Enables: Click a span → see all logs from that service around that time
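
The settings above can be read as a small transformation from a span to a Loki query. The sketch below is an approximation of that behavior, not Grafana's code: it assumes the trace ID filter is a LogQL line filter (`|=`) and that the ±1 hour shift widens the window from one hour before the span start to one hour after the span end. The span values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Trace attribute -> Loki label, as configured for this stack.
TAG_MAPPING = {"service.name": "service"}
TIME_SHIFT = timedelta(hours=1)


def logs_query_for_span(span_attributes, trace_id, span_start, span_end):
    """Return (LogQL query, window start, window end) for one span."""
    matchers = [
        f'{label}="{span_attributes[attr]}"'
        for attr, label in TAG_MAPPING.items()
        if attr in span_attributes
    ]
    # Assumption: the trace ID filter is applied as a line filter.
    logql = "{" + ", ".join(matchers) + "}" + f' |= "{trace_id}"'
    return logql, span_start - TIME_SHIFT, span_end + TIME_SHIFT


start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
end = start + timedelta(milliseconds=250)
logql, window_start, window_end = logs_query_for_span(
    {"service.name": "checkout"},
    "4bf92f3577b34da6a3ce929d0e0e4736",
    start,
    end,
)
print(logql)
print(window_start.isoformat(), "to", window_end.isoformat())
```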

Traces → Metrics (Prometheus/Mimir)

  • Span metrics generator produces RED metrics per service/operation
  • Grafana queries these metrics when you click a span’s “Related metrics”
  • Exemplars on metrics link back to specific traces

Traces → Profiles (Pyroscope)

  • Links trace spans to Pyroscope profiles by service_name
  • Enables: Click a span → see the CPU/memory profile of that service at that time
  • Useful for answering “this span was slow — what was the service doing?”

MCP Server

  • Enabled on Query Frontend at /api/mcp
  • Allows AI assistants to query traces programmatically
  • Exposes trace search and TraceQL capabilities
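
MCP messages are JSON-RPC 2.0, so a request to the endpoint can be sketched with the standard library. The example below builds a `tools/list` request, which asks the server which tools it exposes. The full URL is an assumption that combines the query frontend address from the Grafana datasource settings with the `/api/mcp` path. A real client would normally use an MCP client library and perform the `initialize` handshake first; the transport headers here are also an assumption.

```python
import json
import urllib.request

# Assumption: query frontend address plus the /api/mcp path.
MCP_URL = "http://tempo-query-frontend.monitoring.svc.cluster.local:3200/api/mcp"


def mcp_request(method, params=None, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 request for the MCP endpoint."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


list_tools = mcp_request("tools/list")
print(list_tools.data.decode("utf-8"))
```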

Grafana Datasource

  • Type: tempo
  • URL: http://tempo-query-frontend.monitoring.svc.cluster.local:3200
  • Features enabled:
    • HTTP streaming (for large trace results)
    • Node Graph (service dependency visualization)
    • Service Map from Prometheus (uses span metrics)
    • Traces to Logs (Loki with tag mapping)
    • Traces to Metrics (Prometheus with span metrics queries)
    • Traces to Profiles (Pyroscope with service_name mapping)
    • TraceQL search
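
For reference, the features above map roughly onto a Grafana datasource definition like the one sketched below, expressed as a Python dictionary that can be dumped to JSON. The `jsonData` key names follow Grafana's Tempo datasource provisioning options as best recalled and should be verified against the Grafana version in use. The datasource UIDs (`loki`, `prometheus`, `pyroscope`) are hypothetical placeholders.

```python
import json

datasource = {
    "name": "Tempo",
    "type": "tempo",
    "url": "http://tempo-query-frontend.monitoring.svc.cluster.local:3200",
    "jsonData": {
        # HTTP streaming for large search results
        "streamingEnabled": {"search": True},
        # Node Graph visualization
        "nodeGraph": {"enabled": True},
        # Service Map built from span metrics in Prometheus (hypothetical UID)
        "serviceMap": {"datasourceUid": "prometheus"},
        # Traces to Logs: service.name -> service, +/- 1 hour window
        "tracesToLogsV2": {
            "datasourceUid": "loki",  # hypothetical UID
            "spanStartTimeShift": "-1h",
            "spanEndTimeShift": "1h",
            "filterByTraceID": True,
            "tags": [{"key": "service.name", "value": "service"}],
        },
        # Traces to Metrics (hypothetical UID)
        "tracesToMetrics": {"datasourceUid": "prometheus"},
        # Traces to Profiles: service.name -> service_name
        "tracesToProfiles": {
            "datasourceUid": "pyroscope",  # hypothetical UID
            "tags": [{"key": "service.name", "value": "service_name"}],
        },
    },
}

print(json.dumps(datasource, indent=2))
```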
