Overview

Grafana Pyroscope

Source: https://grafana.com/oss/pyroscope/

Grafana Pyroscope is an open-source continuous profiling platform that enables you to understand resource usage (CPU, memory, etc.) at the code level in production environments with minimal overhead.

Key Features

  • Continuous profiling: always-on profiling in production, not just during debugging sessions
  • Multiple profile types: CPU, heap/memory, goroutines, mutex contention, block/I/O, off-CPU
  • Multi-language support: Go, Java, Python, .NET, Node.js, Rust, C++ (via eBPF)
  • Flame graph visualization: intuitive visualization of where time and resources are spent
  • Diff flame graphs: compare profiles between deployments to detect regressions
  • Grafana integration: native datasource with drill-down from metrics and traces to profiles
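
To make the diff flame graph idea concrete, here is a minimal sketch of comparing two profiles. It assumes profiles are available in the common "folded stacks" text form (one `frame;frame;frame count` line per stack); the function names and sample data are made up for illustration, and this is not how Pyroscope computes diffs internally.

```python
from collections import Counter

def parse_folded(text: str) -> Counter:
    """Parse 'frame;frame;frame count' lines into a Counter keyed by stack."""
    stacks = Counter()
    for line in text.strip().splitlines():
        stack, _, count = line.rpartition(" ")
        stacks[stack] += int(count)
    return stacks

def diff_profiles(before: Counter, after: Counter) -> dict[str, float]:
    """Return the change in each stack's share of total samples (after - before)."""
    total_before = sum(before.values()) or 1
    total_after = sum(after.values()) or 1
    return {
        stack: after[stack] / total_after - before[stack] / total_before
        for stack in before.keys() | after.keys()
    }

# Hypothetical profiles captured before and after a deployment.
BEFORE = """
main;handle;parse 40
main;handle;render 60
"""
AFTER = """
main;handle;parse 80
main;handle;render 20
"""

changes = diff_profiles(parse_folded(BEFORE), parse_folded(AFTER))
for stack, delta in sorted(changes.items(), key=lambda kv: -kv[1]):
    print(f"{delta:+.0%}  {stack}")
```

Comparing shares rather than raw sample counts keeps the diff meaningful when the two profiles cover different durations or traffic levels.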

Data Flow

┌──────────────────────────────────────────────────────┐
│                     Data Sources                     │
│                                                      │
│  ┌───────────┐  ┌──────────────┐  ┌──────────────┐   │
│  │ eBPF      │  │ Pyroscope SDK│  │ JFR / CORECLR│   │
│  │ (all      │  │ (Go, Python, │  │ (Java, .NET) │   │
│  │ languages)│  │  Node.js)    │  │              │   │
│  └─────┬─────┘  └──────┬───────┘  └──────┬───────┘   │
│        │               │                 │           │
│        └───────────────┼─────────────────┘           │
│                        ▼                             │
│  ┌────────────────────────────────────────────────┐  │
│  │              Grafana Alloy                     │  │
│  │  pyroscope.ebpf → eBPF-based CPU profiling     │  │
│  │  pyroscope.scrape → SDK-based profile scraping │  │
│  │  pyroscope.write → forward to Pyroscope        │  │
│  └─────────────────────┬──────────────────────────┘  │
│                        ▼                             │
│  ┌────────────────────────────────────────────────┐  │
│  │           Pyroscope Server (:4040)             │  │
│  │  Storage: Azure Blob / S3 / local disk         │  │
│  └─────────────────────┬──────────────────────────┘  │
│                        ▼                             │
│  ┌────────────────────────────────────────────────┐  │
│  │              Grafana (:3000)                   │  │
│  │  Flame graphs, diff views, trace correlation   │  │
│  └────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────┘

Internal Architecture

Source: https://grafana.com/docs/pyroscope/latest/reference-pyroscope-architecture/about-grafana-pyroscope-architecture/

Pyroscope has a microservices-based architecture. All components are compiled into a single binary, and the -target parameter controls which component(s) the process runs as. This allows the same binary to operate as a monolith or as individual microservices.
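
As a rough illustration of the single-binary idea, the sketch below maps a -target value to the components a process would start. The component names come from the Components table in this document; the function name and validation logic are hypothetical and do not reproduce Pyroscope's actual startup code.

```python
# Component names as listed in this document's Components table.
ALL_COMPONENTS = [
    "distributor", "ingester", "compactor", "query-frontend",
    "query-scheduler", "querier", "store-gateway",
]

def resolve_target(target: str = "all") -> list[str]:
    """Map a -target value to the components one process would run (illustrative)."""
    if target == "all":
        return list(ALL_COMPONENTS)
    if target not in ALL_COMPONENTS:
        raise ValueError(f"unknown target: {target}")
    return [target]

print(resolve_target())            # monolithic: every component in one process
print(resolve_target("ingester"))  # microservices: a single component
```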

Components

Component | Role | Stateful
Distributor | Receives incoming profiles from clients and routes them to ingesters | No
Ingester | Writes profiles to local disk, periodically flushes blocks to long-term storage | Yes
Compactor | Merges blocks from multiple ingesters, removes duplicate samples, reduces storage | No
Query-frontend | Receives queries, accelerates execution (splitting, caching), dispatches to query-scheduler | No
Query-scheduler | Maintains a per-tenant query queue, ensures fair scheduling | No
Querier | Pulls queries from scheduler, fetches data from ingesters (recent) and store-gateways (historical) | No
Store-gateway | Provides access to blocks in long-term object storage | No

The Write Path

                    Profiles from clients
                            │
                            ▼
                    ┌───────────────┐
                    │  Distributor  │
                    └───────┬───────┘
                            │  routes by tenant + series
                            ▼
              ┌─────────────┼─────────────┐
              ▼             ▼             ▼
        ┌──────────┐  ┌──────────┐  ┌──────────┐
        │ Ingester │  │ Ingester │  │ Ingester │
        │ (replica)│  │ (replica)│  │ (replica)│
        └─────┬────┘  └─────┬────┘  └─────┬────┘
              │             │             │
              │  flush blocks to storage  │
              ▼             ▼             ▼
        ┌──────────────────────────────────────┐
        │       Long-Term Object Storage       │
        │   (S3 / Azure Blob / GCS / local)    │
        └───────────────────┬──────────────────┘
                            │
                            ▼
                    ┌───────────────┐
                    │   Compactor   │
                    │ merge blocks, │
                    │ deduplicate   │
                    └───────────────┘

  1. Distributor receives push requests and routes each profile series to ingesters
  2. Each series is replicated to 3 ingesters by default
  3. Ingesters append profiles to a per-tenant database on local disk
  4. In-memory profiles are periodically flushed to disk as blocks
  5. Blocks are uploaded to long-term object storage
  6. Compactor merges blocks from multiple ingesters into single blocks and removes duplicate samples
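
Steps 1 and 2 above can be pictured with a generic consistent-hashing ring. The following is a minimal sketch under stated assumptions: the hash function, token count, ingester names, and series string are all invented for illustration, and the source does not describe the details of Pyroscope's actual ring.

```python
import hashlib
from bisect import bisect_right

class Ring:
    """Toy hash ring: each ingester owns several tokens on a circular keyspace."""

    def __init__(self, ingesters, tokens_per_ingester=64):
        self._tokens = sorted(
            (self._hash(f"{name}-{i}"), name)
            for name in ingesters
            for i in range(tokens_per_ingester)
        )

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def replicas(self, tenant, series, replication_factor=3):
        """Walk clockwise from the series hash and pick distinct ingesters."""
        start = bisect_right(self._tokens, (self._hash(f"{tenant}/{series}"), ""))
        chosen = []
        for offset in range(len(self._tokens)):
            _, name = self._tokens[(start + offset) % len(self._tokens)]
            if name not in chosen:
                chosen.append(name)
            if len(chosen) == replication_factor:
                break
        return chosen

# Hypothetical cluster of five ingesters and one profile series.
ring = Ring([f"ingester-{i}" for i in range(5)])
owners = ring.replicas("tenant-a", 'process_cpu{service_name="checkout"}')
print(owners)  # three distinct ingesters; the same series always maps to the same set
```

Because the mapping depends only on the tenant and series, every distributor replica picks the same ingesters without coordinating, which is what lets the distributor stay stateless.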

The Read Path

                    Query from Grafana
                           │
                           ▼
                  ┌──────────────────┐
                  │ Query-frontend   │
                  │ (split, cache)   │
                  └────────┬─────────┘
                           │
                           ▼
                  ┌──────────────────┐
                  │ Query-scheduler  │
                  │ (per-tenant      │
                  │  fair queuing)   │
                  └────────┬─────────┘
                           │
                           ▼
                  ┌──────────────────┐
                  │    Querier       │
                  └───┬──────────┬───┘
                      │          │
            recent    │          │  historical
            data      │          │  data
                      ▼          ▼
              ┌──────────┐ ┌────────────────┐
              │ Ingesters│ │ Store-gateway  │
              │ (memory) │ │ (object store) │
              └──────────┘ └────────────────┘

  1. Query-frontend receives the query, splits it by time range, checks the cache
  2. Query-scheduler queues the sub-queries with fair per-tenant scheduling
  3. Querier picks up work and fetches data from:
    • Ingesters: for recent, in-memory data
    • Store-gateways: for historical data in object storage
  4. Results are merged and returned to Grafana
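
The first two read-path steps can be sketched as a time-range splitter feeding a round-robin queue. This is a simplified model: the one-hour split interval, the class and function names, and the round-robin policy are assumptions for illustration, not Pyroscope's documented behaviour.

```python
from collections import OrderedDict, deque

def split_by_interval(start, end, step):
    """Split [start, end) into consecutive sub-ranges of at most `step` seconds."""
    return [(t, min(t + step, end)) for t in range(start, end, step)]

class FairQueue:
    """Round-robin across tenants so one busy tenant cannot starve the others."""

    def __init__(self):
        self._queues = OrderedDict()

    def enqueue(self, tenant, item):
        self._queues.setdefault(tenant, deque()).append(item)

    def dequeue(self):
        if not self._queues:
            return None
        tenant, queue = next(iter(self._queues.items()))
        item = queue.popleft()
        del self._queues[tenant]
        if queue:
            self._queues[tenant] = queue  # tenant goes to the back of the line
        return tenant, item

# tenant-a submits a 3-hour query (split into 3 sub-queries); tenant-b a 1-hour one.
queue = FairQueue()
for sub_query in split_by_interval(0, 3 * 3600, 3600):
    queue.enqueue("tenant-a", sub_query)
queue.enqueue("tenant-b", (0, 3600))

order = []
while (job := queue.dequeue()) is not None:
    order.append(job[0])
print(order)  # tenant-b is served after tenant-a's first sub-query, not after all three
```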

Deployment Modes

Mode | -target | Description | Use Case
Monolithic | all (default) | All components in a single process | Development, small workloads, quick start
Microservices | per component (e.g. ingester) | Each component runs as a separate process | Production: independent scaling, isolated failure domains

💡 In this workshop we deploy Pyroscope in monolithic mode (-target=all) as a single replica, which is sufficient for a training environment.

Long-Term Storage

Pyroscope stores each tenant's profiles in on-disk blocks containing an index, metadata, and Parquet tables. Blocks are uploaded to object storage for durability.

Backend | Use Case
Amazon S3 | Production (AWS)
Azure Blob Storage | Production (Azure; used in our setup)
Google Cloud Storage | Production (GCP)
OpenStack Swift | Production (OpenStack)
Local filesystem | Development, single-node only

Collection Methods

Method | Languages | Overhead | Code Changes | Description
eBPF | All | < 1% | None | Kernel-level sampling via Grafana Alloy
Pyroscope SDK | Go, Python, Java, .NET, Node.js | 1-5% | Minimal | In-process profiler with richer data
JFR | Java, Kotlin | 1-3% | None (agent) | Java Flight Recorder integration
CORECLR Profiler | .NET | 1-3% | None (agent) | .NET CLR profiling
Pyroscope scrape | Go (pprof) | < 1% | Annotation only | Pull-based via pod annotations

Profile Types

Type | What It Measures | When to Use
CPU | Time spent executing code | High CPU usage, slow endpoints
Heap (Alloc) | Currently allocated memory | Memory leaks, high RAM
Goroutine / Thread | Active threads/goroutines | Goroutine leaks, deadlocks
Mutex / Lock | Time waiting for locks | Lock contention
Block / I/O | Time blocked on I/O | Slow network/disk ops
Off-CPU | Time when thread is not on CPU | I/O waits, scheduling
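
A sampled CPU profile is a count of how often each stack was observed, so time is estimated as samples divided by the sampling frequency. The sketch below works through that arithmetic; the 97 Hz rate is an assumed value for the example, and the stacks are invented.

```python
from collections import Counter

SAMPLE_RATE_HZ = 97  # assumed sampling frequency for this illustration

def self_time_seconds(samples, rate_hz=SAMPLE_RATE_HZ):
    """Estimate per-function self time: the leaf frame of each sample was on CPU."""
    leaves = Counter(stack[-1] for stack in samples)
    return {func: count / rate_hz for func, count in leaves.items()}

# Hypothetical capture: 291 samples in total, i.e. about 3 seconds of CPU time.
samples = (
    [("main", "handle", "parse")] * 194
    + [("main", "handle", "render")] * 97
)
estimate = self_time_seconds(samples)
print(estimate)  # {'parse': 2.0, 'render': 1.0}
```

The estimate is statistical: functions that run for less than one sampling interval may be missed in a short capture, which is one reason continuous collection over long windows gives a more reliable picture.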

When to Use Pyroscope

  • ✅ Performance optimization: find hot spots in production code
  • ✅ Memory leak diagnosis: heap profile shows what holds memory
  • ✅ Regression analysis: diff flame graphs before and after deployment
  • ✅ Cloud cost reduction: identify inefficient code → smaller instances
  • ✅ Latency debugging: trace shows slow span, profile shows why
  • ❌ Does not replace traces or metrics: it's a complementary signal

Pyroscope vs Pixie

Both tools use eBPF for profiling but serve fundamentally different purposes. Pyroscope is a dedicated continuous profiling platform with long-term storage. Pixie is a real-time debugging tool that captures full network traffic and keeps data in-memory.

Feature Comparison

Aspect | Pyroscope | Pixie
Primary purpose | Continuous profiling (CPU, memory, locks, I/O) | Real-time request tracing + network observability
eBPF usage | CPU stack sampling (~97 Hz) | Syscall tracing (kprobes), TLS interception (uprobes), CPU sampling (~100 Hz)
Profile types | CPU, heap, goroutine, mutex, block, off-CPU | CPU only
Protocol tracing | No | Yes: HTTP, gRPC, SQL, Redis, Kafka, DNS, and more
Request body capture | No | Yes: full request/response payloads
Language support (profiling) | All (eBPF), richer data for Go, Java, Python, .NET, Node.js (SDK) | Compiled languages only (Go, C++, Rust); requires debug symbols
Language support (tracing) | N/A: Pyroscope does not trace requests | Any language: eBPF hooks syscalls at kernel level
Data storage | Long-term object storage (Azure Blob, S3, GCS) | In-cluster memory only
Retention | Days to months | Minutes to hours
Flame graphs | Yes, with diff view between deployments | Yes, CPU only
Grafana integration | Native datasource, trace-to-profile linking | Grafana datasource plugin with pre-built dashboards
Query language | Grafana UI + API | PxL (custom scripting language)
Data leaves cluster | Yes: sent to Pyroscope backend | No: stays in-cluster
Code changes needed | None (eBPF) or minimal (SDK for richer data) | None

When to Use Which

Use Pyroscope when:

  • You need to understand why code is slow: which functions consume CPU, allocate memory, or contend on locks
  • You want to compare profiles between deployments (diff flame graphs)
  • You need long-term profiling data for trend analysis
  • You want to drill down from a trace span to see the exact code profile for that operation
  • You're profiling Java, Python, .NET, or Node.js services (SDK support)

Use Pixie when:

  • You need to see full HTTP/gRPC/SQL request bodies without instrumentation
  • You're debugging network issues: TCP drops, DNS failures, retransmits
  • You want instant visibility into a running cluster without deploying any agents to applications
  • You're investigating database query performance with full SQL capture
  • Data must stay in the cluster for security/compliance reasons

Use both together:

  • Pixie shows you which requests are slow (full request tracing with latency) → Pyroscope shows you why they're slow (code-level CPU/memory profile)
  • Pixie captures network-level symptoms, Pyroscope reveals application-level root causes
