Profiles (eBPF)

πŸ”¬ 4th Pillar of Observability - Profiles (eBPF)

What are they for?

Understanding what exactly your code does at runtime β€” CPU, memory, allocations, locks

πŸ“ What are profiles?

Profiles are the fourth telemetry signal in OpenTelemetry (alongside logs, metrics, and traces), providing detailed information about how an application uses system resources in real time.

Key characteristics:

  • Show which functions/lines of code consume CPU, memory, or other resources
  • Enable continuous profiling: always-on collection of profiling data in production, not only during debugging sessions
  • In OpenTelemetry: the profiles data model was accepted in 2024; SDK and Collector support is still experimental

What a profile contains:

Stack trace sample (CPU profile):
──────────────────────────────────────
main.handleRequest()                    ← 45% CPU
  ├── db.QueryContext()                 ← 30% CPU
  │    └── net/http.(*conn).readRequest ← 10% CPU
  └── json.Marshal()                    ← 5% CPU

Each sample contains:

  • Stack trace β€” full function call path
  • Value β€” how much of the resource was consumed (CPU cycles, memory bytes, allocation count)
  • Labels β€” context (service name, environment, etc.)
  • Timestamp β€” when the sample was collected
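Conceptually, a single sample is a small record combining these four parts. A minimal Go sketch (the type and field names are illustrative, not the actual OTel or pprof wire format):

```go
package main

import (
	"fmt"
	"time"
)

// Sample models one profiling sample: a stack trace plus the resource
// value it accounts for. Illustrative only, not the OTel wire format.
type Sample struct {
	Stack     []string          // call path, innermost frame first
	Value     int64             // e.g. CPU nanoseconds or bytes allocated
	Labels    map[string]string // service name, environment, ...
	Timestamp time.Time         // when the sample was taken
}

func main() {
	s := Sample{
		Stack:     []string{"json.Marshal", "main.handleRequest", "main.main"},
		Value:     5_000_000, // 5 ms of CPU time
		Labels:    map[string]string{"service_name": "api", "env": "prod"},
		Timestamp: time.Now(),
	}
	fmt.Printf("%s consumed %d ns\n", s.Stack[0], s.Value)
}
```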

🐝 eBPF β€” The Foundation of Modern Profiling

eBPF (extended Berkeley Packet Filter) is a Linux kernel technology that allows running sandboxed programs in kernel space without modifying the kernel source code or loading modules.

Why eBPF is crucial for profiling:

Aspect                | Traditional profiling              | eBPF profiling
----------------------|------------------------------------|---------------------------------------------
Overhead              | 5-20% (e.g., Java Flight Recorder) | < 1%
Code changes required | Yes (agent/library)                | No, operates at kernel level
Languages             | Language-specific agents           | Any language (samples stacks in the kernel)
Security              | Agent runs inside the process      | Sandboxed and verified in the kernel
Visibility            | User-space only                    | User-space + kernel-space

How eBPF profiling works:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                   User Space                     β”‚
β”‚                                                  β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”‚
β”‚  β”‚ Service A β”‚  β”‚ Service B β”‚  β”‚ Service C β”‚      β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β”‚
β”‚                                                  β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                  Kernel Space                    β”‚
β”‚                                                  β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚          eBPF Programs (sandboxed)          β”‚ β”‚
β”‚  β”‚                                            β”‚ β”‚
β”‚  β”‚  β€’ perf_event β†’ collects stack traces      β”‚ β”‚
β”‚  β”‚  β€’ kprobe    β†’ intercepts syscalls         β”‚ β”‚
β”‚  β”‚  β€’ uprobe    β†’ hooks user-space functions  β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚                     β”‚                            β”‚
β”‚              eBPF Maps (ring buffer)             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                      β”‚
                      β–Ό
          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
          β”‚  Profiling Agent      β”‚
          β”‚  (Pyroscope/Parca)    β”‚
          β”‚  β†’ aggregation        β”‚
          β”‚  β†’ symbolization      β”‚
          β”‚  β†’ export to backend  β”‚
          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Key eBPF mechanisms:

  • perf_event β€” periodic stack trace sampling (e.g., every 10ms) β†’ CPU profile
  • kprobe / kretprobe β€” hooking kernel functions (e.g., memory allocations, I/O operations)
  • uprobe / uretprobe β€” hooking user-space functions without restarting the process
  • eBPF Maps β€” shared memory kernel ↔ user-space for passing samples

πŸ”₯ Flame Graphs

Profiles are most commonly visualized as flame graphs:

How to read a flame graph:

  • The x-axis is the share of samples, not the passage of time: block width = share of the resource consumed by that call path
  • Wide blocks at the top (leaves) → functions that themselves consume many resources (hot spots)
  • Wide blocks at the bottom (roots) → functions whose call subtrees are expensive
  • Flame graph comparison (diff) → what changed between deployment versions
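Flame graphs are typically built from "folded" stacks: identical call paths are merged and their sample counts summed. A minimal Go sketch of that aggregation step (the helper name is ours):

```go
package main

import (
	"fmt"
	"strings"
)

// foldStacks merges identical call paths into "root;child;leaf" keys
// with sample counts -- the folded format flame graph tools consume.
func foldStacks(samples [][]string) map[string]int {
	folded := make(map[string]int)
	for _, stack := range samples {
		folded[strings.Join(stack, ";")]++
	}
	return folded
}

func main() {
	samples := [][]string{
		{"main", "handleRequest", "db.Query"},
		{"main", "handleRequest", "db.Query"},
		{"main", "handleRequest", "json.Marshal"},
	}
	folded := foldStacks(samples)
	fmt.Println(folded["main;handleRequest;db.Query"])     // 2
	fmt.Println(folded["main;handleRequest;json.Marshal"]) // 1
}
```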

πŸ“Š Profile Types

Profile type       | What it measures                    | Unit                 | When to use
-------------------|-------------------------------------|----------------------|-------------------------------
CPU                | Time spent executing code           | nanoseconds / cycles | High CPU usage, slow endpoints
Heap (alloc)       | Currently allocated memory          | bytes                | Memory leaks, high RAM usage
Goroutine / Thread | Number of active threads/goroutines | count                | Goroutine leaks, deadlocks
Mutex / Lock       | Time spent waiting for locks        | nanoseconds          | Contention, slow concurrency
Block / I/O        | Time blocked on I/O operations      | nanoseconds          | Slow network/disk operations
Off-CPU            | Time a thread spends off the CPU    | nanoseconds          | Waiting on I/O, scheduler delays

πŸ› οΈ Continuous Profiling Tools

Pyroscope (Grafana, open source)

# Example Pyroscope configuration with Grafana Alloy (eBPF)
pyroscope.ebpf "instance" {
  forward_to     = [pyroscope.write.endpoint.receiver]
  targets_only   = false
  default_target = {"service_name" = "unspecified"}
  demangle       = "none"
  sample_rate    = 97   // Hz; an odd rate like 97 or 99 avoids lockstep with periodic work
}

pyroscope.write "endpoint" {
  endpoint {
    url = "http://pyroscope:4040"
  }
}

Parca (open-source, CNCF sandbox)

# parca-agent as DaemonSet in Kubernetes
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: parca-agent
spec:
  template:
    spec:
      containers:
        - name: parca-agent
          image: ghcr.io/parca-dev/parca-agent
          securityContext:
            privileged: true   # required for eBPF
          args:
            - /bin/parca-agent
            - --node=$(NODE_NAME)
            - --store-address=parca-server:7070
          env:
            - name: NODE_NAME   # resolves $(NODE_NAME) in args above
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName

OpenTelemetry Profiling (in development)

OTel Profiling Pipeline:

  Application / eBPF Agent
         β”‚
         β–Ό
  OTel Collector
  (experimental profiles support)  ← OTLP receiver/exporter carrying the profiles signal
         β”‚
         β–Ό
  Backend (Pyroscope / Parca / Datadog / Elastic)
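A sketch of what such a Collector configuration could look like. Profiles support in the Collector is experimental and gated, so treat the `profiles` pipeline name and endpoints below as assumptions based on standard Collector conventions, not a copy-paste recipe:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  otlphttp:
    endpoint: http://pyroscope:4040   # assumed backend address

service:
  pipelines:
    profiles:            # experimental signal, may require a feature gate
      receivers: [otlp]
      exporters: [otlphttp]
```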

πŸ”— Correlating Profiles with Other Signals

The greatest value of profiles emerges when they are correlated with other signals:

πŸ“ˆ Metric: CPU usage spike β†’ 95%
    β”‚
    β”œβ”€β”€ πŸ” Trace: GET /api/reports (span: 12.5s)
    β”‚       β”‚
    β”‚       └── πŸ”¬ Profile: json.Marshal() β†’ 78% CPU in this span
    β”‚                  └── Conclusion: serialization of a large object
    β”‚
    └── πŸͺ΅ Log: "Report generation completed" (duration: 12.5s)

How it works in practice:

  • Span β†’ Profile: OpenTelemetry links span_id with profile samples β†’ click on a slow span and see which functions are slowing it down
  • Metric β†’ Profile: Grafana allows navigating from a metrics dashboard to a flame graph from the same time period
  • Profile β†’ Log: Flame graph points to a function β†’ log shows what happened inside it

⚑ When to use profiles?

  • βœ… Performance optimization β€” finding hot spots in production code
  • βœ… Memory leak diagnosis β€” heap profile shows what’s holding memory
  • βœ… Regression analysis β€” comparing profiles before and after deployment (diff flame graph)
  • βœ… Cloud cost reduction β€” identifying inefficient code β†’ smaller instances
  • βœ… Latency debugging β€” when a trace shows a slow span, the profile shows why
  • ❌ Does not replace traces or metrics β€” it’s a complementary signal

πŸ’‘ Tip: Start with a CPU profile using eBPF (zero code changes, < 1% overhead), then add heap/goroutine profiles for specific problems.

πŸ”— Four Pillars Together

Logs     β†’ What happened
Metrics  β†’ How often / how long
Traces   β†’ Where and why
Profiles β†’ Why so slow / what consumes resources

➑️ Together they provide a complete picture of system behavior
