# Grafana Alloy
Grafana Alloy is an OpenTelemetry Collector distribution by Grafana Labs that acts as the single collection point for all observability signals in our setup. It replaces the need for separate Prometheus scrapers, Promtail for logs, and standalone OTel Collectors.
## Role in the Stack
| Signal | Collection Method | Destination |
|---|---|---|
| Metrics | ServiceMonitor scraping (30s interval) + pod annotation scraping | Prometheus, Mimir |
| Traces | OTLP receiver (gRPC :4317, HTTP :4318) | Tempo |
| Logs | Kubernetes pod log tailing (/var/log/pods/) + OTLP receiver | Loki |
| Profiles | eBPF kernel sampling (97 Hz) + Pyroscope SDK scraping | Pyroscope |
**Key decision:** Prometheus is configured as a receiver only (no scraping). All metric collection flows through Alloy.
## Deployment
- Type: DaemonSet — one pod per node (required for eBPF access to each node’s kernel)
- Privileged: Yes — required for eBPF profiling
- Capabilities: `SYS_ADMIN`, `SYS_PTRACE`, `SYS_RESOURCE`, `PERFMON`, `BPF`
- Resources: 256Mi–1Gi memory, 100m–1000m CPU
- Init container: Sets `perf_event_paranoid=-1` for eBPF
- UI: Built-in component graph at port 12345
## Metrics Collection
Alloy discovers metrics targets through two mechanisms:
1. ServiceMonitor scraping — discovers all ServiceMonitors cluster-wide, resolving endpoints and scraping at 30s intervals. This is the primary mechanism.
2. Pod annotation scraping — fallback for pods without ServiceMonitors. Pods with prometheus.io/scrape: "true" are automatically scraped. Monitoring stack pods (prometheus, alloy, mimir, loki, tempo, pyroscope) are excluded to avoid duplication.
Both paths support native histograms (protobuf scraping).
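The two discovery paths might be wired roughly as below in Alloy configuration. This is a sketch, not the deployment's actual config: component labels are illustrative, and the drop rule assumes monitoring-stack pods carry a standard `app.kubernetes.io/name` label.

```alloy
// 1. ServiceMonitor-based discovery and scraping (primary path).
prometheus.operator.servicemonitors "all" {
  forward_to = [prometheus.remote_write.default.receiver]
}

// 2. Annotation-based fallback for pods without a ServiceMonitor.
discovery.kubernetes "pods" {
  role = "pod"
}

discovery.relabel "annotated_pods" {
  targets = discovery.kubernetes.pods.targets

  // Keep only pods that opt in via prometheus.io/scrape: "true".
  rule {
    source_labels = ["__meta_kubernetes_pod_annotation_prometheus_io_scrape"]
    regex         = "true"
    action        = "keep"
  }

  // Exclude monitoring-stack pods to avoid double collection
  // (assumes they are identifiable by app.kubernetes.io/name).
  rule {
    source_labels = ["__meta_kubernetes_pod_label_app_kubernetes_io_name"]
    regex         = "prometheus|alloy|mimir|loki|tempo|pyroscope"
    action        = "drop"
  }
}

prometheus.scrape "annotated_pods" {
  targets         = discovery.relabel.annotated_pods.output
  scrape_interval = "30s"
  forward_to      = [prometheus.remote_write.default.receiver]
}
```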
Remote write targets:
- Prometheus (short-term): `prometheus-and-grafana-kub-prometheus:9090`
- Mimir (long-term): `mimir-nginx:80`
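The dual remote-write setup might look like this. The service names come from the list above; the URL paths (`/api/v1/write` for Prometheus's remote-write receiver, `/api/v1/push` for Mimir's gateway) are each backend's defaults and should be checked against the actual setup.

```alloy
prometheus.remote_write "default" {
  // Short-term storage: Prometheus with its remote-write receiver enabled.
  endpoint {
    url = "http://prometheus-and-grafana-kub-prometheus:9090/api/v1/write"
  }

  // Long-term storage: Mimir behind its nginx gateway.
  endpoint {
    url = "http://mimir-nginx:80/api/v1/push"
  }
}
```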
## Trace Collection
- Receives OTLP traces via gRPC (:4317) and HTTP (:4318)
- Adds a `k8s.namespace.name` attribute to all spans for Kubernetes context
- Forwards to Tempo via OTLP gRPC
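A minimal sketch of the trace pipeline, assuming Tempo's OTLP gRPC endpoint is reachable as `tempo:4317` (component labels are illustrative):

```alloy
otelcol.receiver.otlp "default" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }
  http {
    endpoint = "0.0.0.0:4318"
  }
  output {
    traces = [otelcol.processor.k8sattributes.default.input]
  }
}

// Enriches spans with Kubernetes metadata, including k8s.namespace.name.
otelcol.processor.k8sattributes "default" {
  output {
    traces = [otelcol.exporter.otlp.tempo.input]
  }
}

otelcol.exporter.otlp "tempo" {
  client {
    endpoint = "tempo:4317"
  }
}
```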
## Log Collection
Two parallel log streams:
1. Kubernetes pod logs — tails /var/log/pods/ on each node, parses CRI format, maps container labels to Loki labels.
2. OTLP logs — receives structured logs from applications via OTLP, enriches with Kubernetes metadata (namespace, pod, container), maps OpenTelemetry severity to Loki’s detected_level.
Both streams are sent to Loki via native Loki protocol.
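The two streams could be sketched as follows; the Loki URL, glob pattern, and component labels are assumptions, and the OTLP metadata enrichment step is elided for brevity.

```alloy
// Stream 1: tail pod logs from the node filesystem and parse CRI format.
local.file_match "pod_logs" {
  path_targets = [{"__path__" = "/var/log/pods/*/*/*.log"}]
}

loki.source.file "pod_logs" {
  targets    = local.file_match.pod_logs.targets
  forward_to = [loki.process.pod_logs.receiver]
}

loki.process "pod_logs" {
  stage.cri {}
  forward_to = [loki.write.default.receiver]
}

// Stream 2: OTLP logs converted to Loki entries.
otelcol.receiver.otlp "logs" {
  grpc {}
  http {}
  output {
    logs = [otelcol.exporter.loki.default.input]
  }
}

otelcol.exporter.loki "default" {
  forward_to = [loki.write.default.receiver]
}

loki.write "default" {
  endpoint {
    url = "http://loki:3100/loki/api/v1/push"
  }
}
```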
## Profiling
eBPF profiling (all processes, no instrumentation needed):
- Sample rate: 97 Hz
- Collects both kernel and user-space stacks
- Python-specific profiling enabled
- Covers every process on the node — including services that have no SDK instrumentation
Pyroscope SDK scraping (richer data for instrumented services):
- Discovers pods with annotation `profiles.grafana.com/cpu_scrape: "true"`
- Scrapes CPU, memory, mutex, block, and goroutine profiles
- Provides language-specific profile types (JFR for Java, pprof for Go, etc.)
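The two profiling paths can be sketched as below. `sample_rate = 97` matches the documented rate; the Pyroscope URL and component labels are assumptions, and the annotation-based filtering and per-profile-type options for SDK scraping are omitted for brevity.

```alloy
discovery.kubernetes "local_pods" {
  role = "pod"
}

// eBPF profiling: samples every process on the node, no instrumentation needed.
pyroscope.ebpf "node" {
  targets     = discovery.kubernetes.local_pods.targets
  sample_rate = 97
  forward_to  = [pyroscope.write.default.receiver]
}

// SDK scraping for instrumented services (annotation filtering via
// discovery.relabel on profiles.grafana.com/cpu_scrape omitted here).
pyroscope.scrape "sdk" {
  targets    = discovery.kubernetes.local_pods.targets
  forward_to = [pyroscope.write.default.receiver]
}

pyroscope.write "default" {
  endpoint {
    url = "http://pyroscope:4040"
  }
}
```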
## Volume Mounts
| Mount | Purpose |
|---|---|
| `/sys/fs/bpf` | BPF filesystem for pinned maps and programs |
| `/sys/kernel/debug` | Debugfs for kprobes/uprobes |
| `/sys/kernel/btf` | BTF type information for CO-RE |
| `/var/log/pods` | Kubernetes pod logs |
| `/run/containerd` | Container runtime socket for PID-to-pod mapping |
## Integration Points
```
Applications ──OTLP──→ Alloy ──→ Tempo (traces)
                             ──→ Loki (logs)
                             ──→ Prometheus → Mimir (metrics)
                             ──→ Pyroscope (profiles)

K8s components ──scrape──→ Alloy ──→ Prometheus → Mimir
All processes ──eBPF──→ Alloy ──→ Pyroscope
```
Alloy is the only component that needs to run privileged — all backends run as regular pods.