Flame Graphs

Reading Flame Graphs

Flame graphs are the primary visualization for profile data. Understanding how to read them is essential for effective profiling.

Structure

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    main.handleRequest()                   β”‚  ← root (widest)
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚      db.QueryContext()       β”‚     json.Marshal()        β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€                          β”‚
β”‚ net.Read()   β”‚ sql.Prepare() β”‚                          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                                     ↑ leaves / top of stack (self time)
  • Width = proportion of total samples (time or resources)
  • Y-axis = call stack depth (here the root is drawn at the top, icicle-style; some tools draw the root at the bottom instead)
  • Color = typically indicates package/module (varies by tool)

How to Read

  1. Wide leaf frames (no children) β†’ functions that themselves consume resources (high self time; these are the hot spots)
  2. Wide frames near the root β†’ functions whose call subtrees are expensive
  3. Narrow frames β†’ functions that contribute little to total resource usage
  4. Self time vs. total time: a function can have high total time (its children are expensive) yet low self time (it does little work itself)

Key Patterns

CPU Hot Spot

main.ServeHTTP()                     ← 100% total, 2% self
  └── handler.ProcessRequest()       ← 95% total, 5% self
       β”œβ”€β”€ json.Marshal()            ← 60% total, 60% self  ← HOT SPOT
       └── db.Query()                ← 30% total, 3% self
            └── net.Read()           ← 27% total, 27% self

Action: Optimize json.Marshal() β€” perhaps use a faster serializer or reduce payload size.

Memory Leak Pattern

heap profile β€” growing over time:
main.handleRequest()
  └── cache.Store()
       └── make([]byte, largeSize)   ← allocations never freed

Action: Check if the cache has eviction logic.

Lock Contention

mutex profile:
main.handleRequest()
  └── sync.(*Mutex).Lock()           ← 80% of mutex wait time
       └── cache.(*Cache).Get()      ← shared cache with single lock

Action: Use a sharded cache or sync.RWMutex.

Diff Flame Graphs

Compare two profiles (e.g., before and after a deployment) to find regressions:

  • Red = functions that got slower (more samples in the new profile)
  • Green/Blue = functions that got faster (fewer samples)
  • Grey = unchanged

This is essential for:

  • Detecting performance regressions after deployments
  • Validating optimization efforts
  • Understanding the impact of code changes

Tips

  • Start with CPU profiles (most common performance issues)
  • Look for unexpectedly wide blocks β€” they indicate where time is actually spent
  • Compare profiles before and after changes using diff view
  • Use time range selection in Grafana to focus on specific incidents
  • Filter by service name to isolate individual services
  • eBPF profiles include both kernel and user-space β€” kernel time (e.g., sys_write) often reveals I/O bottlenecks
