Overview

Grafana Pyroscope

Source: https://grafana.com/oss/pyroscope/

Grafana Pyroscope is an open-source continuous profiling platform that enables you to understand resource usage (CPU, memory, etc.) at the code level in production environments with minimal overhead.

Key Features

Continuous profiling — always-on profiling in production, not just during debugging sessions
Multiple profile types — CPU, heap/memory, goroutines, mutex contention, block/I/O, off-CPU
Multi-language support — Go, Java, Python, .NET, Node.js, Rust, C++ (via eBPF)
Flame graph visualization — intuitive visualization of where time and resources are spent
Diff flame graphs — compare profiles between deployments to detect regressions
Grafana integration — native datasource with drill-down from metrics and traces to profiles
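
At its core, a diff flame graph compares how each call stack's share of samples changed between two profiles. The following is a minimal Python sketch of that idea, assuming profiles are available in the collapsed ("folded") stack text format (`frame;frame;frame count`). The helper names `parse_folded` and `diff_profiles` and the sample data are hypothetical and are not part of any Pyroscope API.

```python
from collections import Counter


def parse_folded(text: str) -> Counter:
    """Parse collapsed-stack lines of the form 'a;b;c <count>' into a Counter."""
    counts = Counter()
    for line in text.strip().splitlines():
        stack, _, value = line.rpartition(" ")
        counts[stack] += int(value)
    return counts


def diff_profiles(before: Counter, after: Counter) -> dict:
    """Return the change in each stack's share of total samples (after - before).

    Shares are used instead of raw counts so that profiles covering
    different durations can still be compared.
    """
    total_before = sum(before.values()) or 1
    total_after = sum(after.values()) or 1
    return {
        stack: after[stack] / total_after - before[stack] / total_before
        for stack in set(before) | set(after)
    }


# Hypothetical sample data for two deployments.
baseline = parse_folded("""
main;handle;parse 30
main;handle;render 70
""")
candidate = parse_folded("""
main;handle;parse 60
main;handle;render 40
""")

changes = diff_profiles(baseline, candidate)
for stack, delta in sorted(changes.items(), key=lambda kv: -kv[1]):
    print(f"{delta:+.0%}  {stack}")
```

With this sample data, `parse` grows by 30 percentage points and `render` shrinks by the same amount, which is the kind of shift a diff view highlights in red and green.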

Data Flow

┌─────────────────────────────────────────────────────┐
│                   Data Sources                       │
│                                                      │
│  ┌──────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │ eBPF     │  │ Pyroscope SDK│  │ JFR / CORECLR│  │
│  │ (all     │  │ (Go, Python, │  │ (Java, .NET) │  │
│  │languages)│  │  Node.js)    │  │              │  │
│  └─────┬────┘  └──────┬───────┘  └──────┬───────┘  │
│        │               │                 │          │
│        └───────────────┼─────────────────┘          │
│                        ▼                             │
│  ┌────────────────────────────────────────────────┐ │
│  │              Grafana Alloy                      │ │
│  │  pyroscope.ebpf → eBPF-based CPU profiling     │ │
│  │  pyroscope.scrape → SDK-based profile scraping │ │
│  │  pyroscope.write → forward to Pyroscope        │ │
│  └────────────────────┬───────────────────────────┘ │
│                        ▼                             │
│  ┌────────────────────────────────────────────────┐ │
│  │           Pyroscope Server (:4040)              │ │
│  │  Storage: Azure Blob / S3 / local disk         │ │
│  └────────────────────────────────────────────────┘ │
│                        ▼                             │
│  ┌────────────────────────────────────────────────┐ │
│  │              Grafana (:3000)                    │ │
│  │  Flame graphs, diff views, trace correlation   │ │
│  └────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘

Internal Architecture

Source: https://grafana.com/docs/pyroscope/latest/reference-pyroscope-architecture/about-grafana-pyroscope-architecture/

Pyroscope has a microservices-based architecture. All components are compiled into a single binary, and the -target parameter controls which component(s) the process runs as. This allows the same binary to operate as a monolith or as individual microservices.
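
The single-binary pattern can be pictured as a lookup from a target name to the set of components to start. The Python sketch below illustrates only that idea; Pyroscope itself is not implemented this way. The component names come from the Components table in this document, `all` is the default target described under Deployment Modes, and accepting a comma-separated list is an assumption made for the example.

```python
import argparse

# Component names taken from the Components table in this document.
COMPONENTS = [
    "distributor", "ingester", "compactor", "query-frontend",
    "query-scheduler", "querier", "store-gateway",
]


def resolve_targets(target: str) -> list[str]:
    """Expand a -target value into the list of components to start.

    Assumption for this sketch: 'all' means every component, and several
    components may be given as a comma-separated list.
    """
    if target == "all":
        return list(COMPONENTS)
    requested = [name.strip() for name in target.split(",") if name.strip()]
    unknown = [name for name in requested if name not in COMPONENTS]
    if unknown:
        raise ValueError(f"unknown target(s): {', '.join(unknown)}")
    return requested


def main(argv: list[str]) -> list[str]:
    parser = argparse.ArgumentParser(prog="single-binary-sketch")
    parser.add_argument("-target", default="all")
    args = parser.parse_args(argv)
    return resolve_targets(args.target)


print(main([]))                        # monolithic: every component
print(main(["-target", "ingester"]))   # microservices: one component
```

The same entry point serves both deployment styles: only the argument changes, not the build artifact.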

Components

Component	Role	Stateful
Distributor	Receives incoming profiles from clients and routes them to ingesters	No
Ingester	Writes profiles to local disk, periodically flushes blocks to long-term storage	Yes
Compactor	Merges blocks from multiple ingesters, removes duplicate samples, reduces storage	No
Query-frontend	Receives queries, accelerates execution (splitting, caching), dispatches to query-scheduler	No
Query-scheduler	Maintains a per-tenant query queue, ensures fair scheduling	No
Querier	Pulls queries from scheduler, fetches data from ingesters (recent) and store-gateways (historical)	No
Store-gateway	Provides access to blocks in long-term object storage	Yes
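
To make "fair scheduling" concrete, the sketch below models a per-tenant queue that serves tenants in round-robin order, so one noisy tenant cannot starve the others. This is an illustrative Python model only: the `FairQueue` class is hypothetical, and round-robin is an assumed policy, not a description of the query-scheduler's actual algorithm.

```python
from collections import OrderedDict, deque


class FairQueue:
    """Round-robin queue: each tenant gets one dequeue turn per cycle."""

    def __init__(self):
        self._queues = OrderedDict()  # tenant -> deque of pending queries

    def enqueue(self, tenant, query):
        self._queues.setdefault(tenant, deque()).append(query)

    def dequeue(self):
        """Return (tenant, query) for the next tenant in line, or None if empty."""
        if not self._queues:
            return None
        tenant, pending = next(iter(self._queues.items()))
        query = pending.popleft()
        if pending:
            self._queues.move_to_end(tenant)  # send the tenant to the back of the line
        else:
            del self._queues[tenant]          # nothing left for this tenant
        return tenant, query


q = FairQueue()
for i in range(3):
    q.enqueue("tenant-a", f"a{i}")   # a noisy tenant queues three queries
q.enqueue("tenant-b", "b0")          # a quiet tenant queues one

order = []
while (item := q.dequeue()) is not None:
    order.append(item[1])
print(order)  # ['a0', 'b0', 'a1', 'a2']
```

Even though `tenant-a` enqueued first and more often, `tenant-b` is served after a single `tenant-a` query rather than after all three.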

The Write Path

                    Profiles from clients
                            │
                            ▼
                    ┌───────────────┐
                    │  Distributor  │
                    └───────┬───────┘
                            │  routes by tenant + series
                            ▼
              ┌─────────────┼─────────────┐
              ▼             ▼             ▼
        ┌──────────┐  ┌──────────┐  ┌──────────┐
        │ Ingester │  │ Ingester │  │ Ingester │
        │ (replica)│  │ (replica)│  │ (replica)│
        └─────┬────┘  └─────┬────┘  └─────┬────┘
              │              │              │
              │   flush blocks to storage   │
              ▼              ▼              ▼
        ┌────────────────────────────────────────┐
        │         Long-Term Object Storage       │
        │    (S3 / Azure Blob / GCS / local)     │
        └────────────────────┬───────────────────┘
                             │
                             ▼
                     ┌───────────────┐
                     │   Compactor   │
                     │ merge blocks, │
                     │ deduplicate   │
                     └───────────────┘

1. Distributor receives push requests and routes each profile series to ingesters.
2. Each series is replicated to 3 ingesters by default.
3. Ingesters append profiles to a per-tenant database on local disk.
4. In-memory profiles are periodically flushed to disk as blocks.
5. Blocks are uploaded to long-term object storage.
6. Compactor merges blocks from multiple ingesters into single blocks and removes duplicate samples.
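
The routing and replication steps of the write path can be sketched as hashing a series key and picking consecutive ingesters. The Python example below is a deliberately simplified model: it uses modulo placement over a fixed list rather than a real hash ring, the replication factor of 3 is taken from the text above, and the function name `pick_ingesters`, the instance names, and the label values are all hypothetical.

```python
import hashlib


def pick_ingesters(tenant, labels, ingesters, replication_factor=3):
    """Choose which ingesters receive a profile series.

    The series key is the tenant plus the sorted label set, so the same
    series always maps to the same ingesters.
    """
    key = tenant + "|" + ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(ingesters)
    count = min(replication_factor, len(ingesters))
    # Walk forward from the starting position, wrapping around the list.
    return [ingesters[(start + i) % len(ingesters)] for i in range(count)]


ingesters = [f"ingester-{i}" for i in range(5)]  # hypothetical instance names
series = {"service_name": "checkout", "__name__": "process_cpu"}

replicas = pick_ingesters("tenant-a", series, ingesters)
print(replicas)
```

Because the key is derived only from the tenant and labels, every push for the same series lands on the same three ingesters, which is what later lets the compactor find and remove the duplicate samples.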

The Read Path

                    Query from Grafana
                            │
                            ▼
                  ┌──────────────────┐
                  │ Query-frontend   │
                  │ (split, cache)   │
                  └────────┬─────────┘
                           │
                           ▼
                  ┌──────────────────┐
                  │ Query-scheduler  │
                  │ (per-tenant      │
                  │  fair queuing)   │
                  └────────┬─────────┘
                           │
                           ▼
                  ┌──────────────────┐
                  │    Querier       │
                  └───┬──────────┬───┘
                      │          │
            recent    │          │  historical
            data      │          │  data
                      ▼          ▼
              ┌──────────┐ ┌───────────────┐
              │ Ingesters│ │ Store-gateway  │
              │ (memory) │ │ (object store) │
              └──────────┘ └───────────────┘

1. Query-frontend receives the query, splits it by time range, and checks the cache.
2. Query-scheduler queues the sub-queries with fair per-tenant scheduling.
3. Querier picks up work and fetches data from:
   - Ingesters: recent, in-memory data
   - Store-gateways: historical data in object storage
4. Results are merged and returned to Grafana.
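
The split-then-merge shape of the read path can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the 60-second split interval is arbitrary, `fake_fetch` is a hypothetical stand-in for a querier call, and results are modeled as sample counts per folded stack rather than Pyroscope's real response types.

```python
from collections import Counter


def split_range(start, end, step):
    """Split [start, end) into consecutive sub-ranges of at most `step` seconds."""
    ranges = []
    while start < end:
        ranges.append((start, min(start + step, end)))
        start += step
    return ranges


def merge_results(partials):
    """Merge per-sub-query results by summing sample counts per stack."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return merged


def fake_fetch(start, end):
    """Stand-in for a querier call; returns one sample per second in the range."""
    return Counter({"main;work": end - start})


sub_queries = split_range(0, 150, 60)
print(sub_queries)  # [(0, 60), (60, 120), (120, 150)]

total = merge_results(fake_fetch(s, e) for s, e in sub_queries)
print(total["main;work"])  # 150
```

Splitting lets sub-queries run in parallel and be cached independently; merging is a simple sum because sample counts for the same stack are additive across time ranges.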

Deployment Modes

Mode	`-target`	Description	Use Case
Monolithic	`all` (default)	All components in a single process	Development, small workloads, quick start
Microservices	per component (e.g. `ingester`)	Each component runs as a separate process	Production — independent scaling, isolated failure domains

💡 In this workshop we deploy Pyroscope in monolithic mode (-target=all) as a single replica, which is sufficient for a training environment.

Long-Term Storage

Pyroscope stores each tenant’s profiles in on-disk blocks containing an index, metadata, and Parquet tables. Blocks are uploaded to object storage for durability.

Backend	Use Case
Amazon S3	Production (AWS)
Azure Blob Storage	Production (Azure — used in our setup)
Google Cloud Storage	Production (GCP)
OpenStack Swift	Production (OpenStack)
Local filesystem	Development, single-node only

Collection Methods

Method	Languages	Overhead	Code Changes	Description
eBPF	All	< 1%	None	Kernel-level sampling via Grafana Alloy
Pyroscope SDK	Go, Python, Java, .NET, Node.js	1-5%	Minimal	In-process profiler with richer data
JFR	Java, Kotlin	1-3%	None (agent)	Java Flight Recorder integration
CORECLR Profiler	.NET	1-3%	None (agent)	.NET CLR profiling
Pyroscope scrape	Go (pprof)	< 1%	Annotation only	Pull-based via pod annotations

Profile Types

Type	What It Measures	When to Use
CPU	Time spent executing code	High CPU usage, slow endpoints
Heap (in-use / alloc)	Memory currently held (in-use) or allocated over time (alloc)	Memory leaks, high RAM
Goroutine / Thread	Active threads/goroutines	Goroutine leaks, deadlocks
Mutex / Lock	Time waiting for locks	Lock contention
Block / I/O	Time blocked on I/O	Slow network/disk ops
Off-CPU	Time when thread is not on CPU	I/O waits, scheduling

When to Use Pyroscope

✅ Performance optimization — find hot spots in production code
✅ Memory leak diagnosis — heap profile shows what holds memory
✅ Regression analysis — diff flame graphs before and after deployment
✅ Cloud cost reduction — identify inefficient code → smaller instances
✅ Latency debugging — trace shows slow span, profile shows why
❌ Does not replace traces or metrics — it’s a complementary signal

Pyroscope vs Pixie

Both tools use eBPF for profiling but serve fundamentally different purposes. Pyroscope is a dedicated continuous profiling platform with long-term storage. Pixie is a real-time debugging tool that captures full network traffic and keeps data in-memory.

Feature Comparison

Aspect	Pyroscope	Pixie
Primary purpose	Continuous profiling (CPU, memory, locks, I/O)	Real-time request tracing + network observability
eBPF usage	CPU stack sampling (~97 Hz)	Syscall tracing (kprobes), TLS interception (uprobes), CPU sampling (~100 Hz)
Profile types	CPU, heap, goroutine, mutex, block, off-CPU	CPU only
Protocol tracing	No	Yes — HTTP, gRPC, SQL, Redis, Kafka, DNS, and more
Request body capture	No	Yes — full request/response payloads
Language support (profiling)	All (eBPF), richer data for Go, Java, Python, .NET, Node.js (SDK)	Compiled languages only (Go, C++, Rust) — requires debug symbols
Language support (tracing)	N/A — Pyroscope does not trace requests	Any language — eBPF hooks syscalls at kernel level
Data storage	Long-term object storage (Azure Blob, S3, GCS)	In-cluster memory only
Retention	Days to months	Minutes to hours
Flame graphs	Yes — with diff view between deployments	Yes — CPU only
Grafana integration	Native datasource, trace-to-profile linking	Grafana datasource plugin with pre-built dashboards
Query language	Grafana UI + API	PxL (custom scripting language)
Data leaves cluster	Yes — sent to Pyroscope backend	No — stays in-cluster
Code changes needed	None (eBPF) or minimal (SDK for richer data)	None
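
The sampling rates in the table translate directly into how sample counts are read as CPU time: at roughly 97 Hz, each sample stands for about 1/97 of a second on CPU. The short Python sketch below works through that arithmetic; the helper names and the sample numbers are hypothetical.

```python
SAMPLE_RATE_HZ = 97  # approximate eBPF sampling rate quoted in the table above


def samples_to_cpu_seconds(samples, rate_hz=SAMPLE_RATE_HZ):
    """Each sample stands for roughly 1/rate_hz seconds of on-CPU time."""
    return samples / rate_hz


def cpu_cores_used(samples, window_seconds, rate_hz=SAMPLE_RATE_HZ):
    """Average number of CPU cores busy over the profiling window."""
    return samples_to_cpu_seconds(samples, rate_hz) / window_seconds


# Hypothetical numbers: 1455 samples collected over a 15-second window.
print(samples_to_cpu_seconds(1455))   # 15.0 seconds of CPU time
print(cpu_cores_used(1455, 15))       # 1.0 core on average
```

The same conversion applies per stack, so a function that appears in 97 samples of a one-minute profile spent about one second on CPU during that minute.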

When to Use Which

Use Pyroscope when:

You need to understand why code is slow — which functions consume CPU, allocate memory, or contend on locks
You want to compare profiles between deployments (diff flame graphs)
You need long-term profiling data for trend analysis
You want to drill down from a trace span to see the exact code profile for that operation
You’re profiling Java, Python, .NET, or Node.js services (SDK support)

Use Pixie when:

You need to see full HTTP/gRPC/SQL request bodies without instrumentation
You’re debugging network issues — TCP drops, DNS failures, retransmits
You want instant visibility into a running cluster without deploying any agents to applications
You’re investigating database query performance with full SQL capture
Data must stay in the cluster for security/compliance reasons

Use both together:

Pixie shows you which requests are slow (full request tracing with latency) → Pyroscope shows you why they’re slow (code-level CPU/memory profile)
Pixie captures network-level symptoms, Pyroscope reveals application-level root causes
