Exercises
1. Px-sock-shop — Bug Detection
Setup
- Go to the sock-shop demo application URL.
- Browse the page, click around.
Trigger the Error
- Go to the Catalogue tab.
- Select at least two tags from the left Filter panel, for example geek and formal.
- Click Apply.
- Notice that no socks appear when two or more filters are selected.
- Click Clear to clear filters between attempts.
- You can repeat this any number of times.
Investigate with Pixie
- Open the Pixie Live UI and log in.
- Select namespace px-sock-shop.
- Select the px/cluster script and see what the view presents.
- Select the px/namespace script. Enter the namespace we’re operating in.
- Review what is displayed on the dashboard.
- Change the time range to -15m.
- Go into the details of the px-sock-shop/catalogue service and see what they contain.
  - At least one error should be reported. If not, repeat the steps from the “Trigger the Error” section.
- Change the script to px/http_data.
- Open the script editor (press Ctrl+E) or click the editor icon in the UI.
- Replace the line:

      df.node = df.ctx['node']

  with

      # Access the service name.
      df.service = df.ctx['service']
      # Filter to only catalogue service.
      df = df[df.service == 'px-sock-shop/catalogue']
      # Filter to errors greater or equal to 400.
      df = df[df.resp_status >= 400]

- Execute the script. Errors should be visible. See what is causing the errors.
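The effect of the two filters can be pictured in plain Python. This is only an illustrative sketch: it is not PxL and does not run inside Pixie, and the sample rows and the helper name catalogue_errors are made up for the example.

```python
# Hypothetical sample rows standing in for HTTP request records.
SAMPLE_ROWS = [
    {"service": "px-sock-shop/catalogue", "req_path": "/path-a", "resp_status": 200},
    {"service": "px-sock-shop/catalogue", "req_path": "/path-b", "resp_status": 500},
    {"service": "px-sock-shop/carts", "req_path": "/path-c", "resp_status": 404},
]


def catalogue_errors(rows, service="px-sock-shop/catalogue", min_status=400):
    """Keep rows for one service whose HTTP status is min_status or higher."""
    return [
        row
        for row in rows
        if row["service"] == service and row["resp_status"] >= min_status
    ]


if __name__ == "__main__":
    # Only the catalogue row with a status of 400 or higher survives both filters.
    for row in catalogue_errors(SAMPLE_ROWS):
        print(row["req_path"], row["resp_status"])
```

Both conditions must hold at once, which is why the carts error and the successful catalogue request are dropped.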
Database Investigation
- Change the script to px/mysql_data.
- Add the following line at row 34:

      df = df[df.resp_status == 3]

- Run the script and see what caused the error.
Explore what other Pixie scripts have to offer
2. Endpoint Deprecation
We want to decide whether we can deprecate one of our application endpoints. To do this, we must first check:
- whether it is being used
- who is using it
We’ll use the Pixie script editor for this:
- Open the Scratch Pad.
- Open the script editor.
- In the PxL Script and Vis Spec tabs, paste the contents of the corresponding files from the endpoint-deprecation/service_endpoints_summary directory.
- Run the script for the otel-demo-quoteservice service. Do you see the requests the service handles?
- Let’s see exactly what the requests look like. Load the files from the service_requests directory.
- Now we want to see who is making requests and for what. Load the files from the service_endpoint_requests directory.
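The two questions behind a deprecation decision (is the endpoint used, and by whom) come down to counting requests per endpoint and per requester. The sketch below shows that idea in plain Python, outside Pixie. The records, service names, endpoint paths, and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical (requesting service, endpoint) pairs, one per observed request.
REQUESTS = [
    ("service-a", "/endpoint-x"),
    ("service-a", "/endpoint-x"),
    ("service-b", "/endpoint-x"),
    ("service-b", "/endpoint-y"),
]


def summarize(requests):
    """Return request counts per endpoint and per (endpoint, requester) pair."""
    per_endpoint = Counter(endpoint for _, endpoint in requests)
    per_caller = Counter((endpoint, caller) for caller, endpoint in requests)
    return per_endpoint, per_caller


def unused_endpoints(known_endpoints, requests):
    """List known endpoints that received no requests in the sample."""
    seen = {endpoint for _, endpoint in requests}
    return sorted(e for e in known_endpoints if e not in seen)


if __name__ == "__main__":
    per_endpoint, per_caller = summarize(REQUESTS)
    print(per_endpoint)
    print(per_caller)
    print(unused_endpoints(["/endpoint-x", "/endpoint-y", "/endpoint-z"], REQUESTS))
```

An endpoint that appears in unused_endpoints over a sufficiently long time window is a candidate for deprecation; one with callers needs those callers migrated first.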
3. Cost Estimation
Load the files from the cost-estimation directory into the Scratch Pad. See how the values for estimation are calculated.
4. Redis Usage Discovery
Find which service uses Redis in the default namespace and for what purpose.
5. Why So Slow
In the otel namespace, find the service with the largest response time variance and see in the Flame Graph what it mainly spends execution time on.
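As a reminder of what "largest response time variance" means, here is a small plain-Python sketch that computes the variance of latency samples per service and picks the largest. The service names and latency values are invented, and population variance is an assumption; the exercise itself should be solved with Pixie scripts.

```python
from statistics import pvariance

# Hypothetical latency samples in milliseconds, keyed by service name.
LATENCIES_MS = {
    "service-a": [10.0, 12.0, 11.0, 13.0],
    "service-b": [5.0, 50.0, 5.0, 100.0],
}


def highest_variance(latencies_by_service):
    """Return (service, variance) for the service with the largest variance."""
    variances = {
        service: pvariance(samples)
        for service, samples in latencies_by_service.items()
    }
    name = max(variances, key=variances.get)
    return name, variances[name]


if __name__ == "__main__":
    print(highest_variance(LATENCIES_MS))
```

A service with a modest average latency can still have a high variance if a few requests are very slow, which is the kind of behaviour a flame graph helps explain.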
Guided Exercises
Service Performance Monitoring
1. Service Graph
- Open the Pixie Live View and select the px/cluster script.
  - View the graph of HTTP traffic between services, with latency, error, and throughput per service.
  - Hover over edges for stats; thicker lines indicate more traffic.
- Scroll down to the Services table.
  - Latency, error, and throughput rates for all HTTP traffic.
  - Click the LATENCY column title to sort services by latency.
  - Click vertical quantile lines on the box plot to switch between P50, P90, and P99 latency values.
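For readers unfamiliar with the P50, P90, and P99 labels, the following plain-Python sketch computes those percentiles from a list of latency samples using the standard library. The sample data and function name are made up, and the interpolation method is an assumption that may differ from how Pixie computes its quantiles.

```python
from statistics import quantiles


def latency_percentiles(samples_ms):
    """Return the P50, P90, and P99 of a list of latency samples."""
    # quantiles(..., n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p90": cuts[89], "p99": cuts[98]}


if __name__ == "__main__":
    # Hypothetical latencies: 1 ms, 2 ms, ..., 100 ms.
    samples = [float(v) for v in range(1, 101)]
    print(latency_percentiles(samples))
```

P50 is the median request, while P99 describes the slowest one percent, so a large gap between them points to a long tail of slow requests.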
2. Service Performance
- From the SERVICE column, click on a service (e.g., px-sock-shop/front-end) to open the px/service script.
  - Latency, error, and throughput over time for all HTTP requests.
  - Modify start_time to change the time window (e.g., -30m, -1h).
- Scroll down to the Sample of Slow Requests table and expand the REQ_PATH column.
3. Endpoint Performance
- Select pxbeta/service_endpoints from the script drop-down menu.
- Set the service argument to the desired service (e.g., px-sock-shop/catalogue).
  - Latency, error, and throughput per logical endpoint (wildcards for URL parameters).
- Click on an endpoint in the Endpoints table to see an overview and sample of slow requests.
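The idea of a logical endpoint, where URL parameters are replaced by wildcards, can be sketched in plain Python as below. This is an assumed, simplified heuristic (numeric or long hex-like path segments become `*`), not the method Pixie uses; the function name and example paths are hypothetical.

```python
import re

# Assumption: a segment is a URL parameter if it is all digits
# or a long hex/UUID-like string.
_PARAM_SEGMENT = re.compile(r"^(\d+|[0-9a-fA-F-]{8,})$")


def to_logical_endpoint(path):
    """Drop the query string and replace parameter-like segments with '*'."""
    path = path.split("?", 1)[0]
    segments = path.split("/")
    return "/".join("*" if _PARAM_SEGMENT.match(s) else s for s in segments)


if __name__ == "__main__":
    for raw in ["/catalogue/123", "/catalogue/456", "/catalogue/size?tags=geek"]:
        print(raw, "->", to_logical_endpoint(raw))
```

Grouping by the logical endpoint lets requests such as /catalogue/123 and /catalogue/456 share one row of latency, error, and throughput statistics.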
Request Tracing
1. Full-Body HTTP Requests
- Select the px/http_data_filtered script.
  - Shows recent HTTP requests filtered by service, pod, request path, and response status code.
- Filter by status code: Set status_code to 500 and re-run.
- Filter by service: Set svc to px-sock-shop/carts and re-run.
- Click a table row to view the data in JSON format; scroll to resp_body for error details.
2. Service Errors
- From the SVC column, click on the px-sock-shop/carts service name.
  - Opens the px/service script, which shows error rate over time.
- Scroll down to the Inbound Traffic by Requesting Service table.
3. Pod Errors
- Click on the pod name to open the px/pod script.
  - Shows HTTP error rate and high-level resource metrics.
Network Monitoring
1. Network Traffic
- Select the px/net_flow_graph script. Enter a namespace.
  - Pan, zoom, and rearrange nodes.
  - Grey hexagons = pods; blue circles = remote endpoints.
- Filter by setting the to_entity_filter argument.
2. DNS Requests
- Select the px/dns_flow_graph script.
- Sort by LATENCY_AVG to find the highest-latency requests.
3. TCP Drops
- Select bpftrace/tcp_drops and press RUN.
  - Hover over edges to see TCP drops between pod pairs.
  - Enable Hierarchy View for a different perspective.
Infrastructure Health
1. Resource Usage by Node
- Select the px/nodes script.
  - CPU usage, memory consumption, and network traffic stats.
- Click a node name to open px/node for detailed stats.
- Change groupby to “pod” to group by pod.
2. Resource Usage by Pod
- Click a pod name to open px/pod.
  - High-level HTTP metrics, resource usage, containers, processes.
- Scroll down for the CPU flamegraph.
  - Dark blue = K8s metadata, light blue = user space app code, light green = kernel code.
Database Query Profiling
1. MySQL Stats
- Select px/mysql_stats.
  - Latency, error, and throughput for all MySQL requests.
- Filter by pod (e.g., px-sock-shop/catalogue).
2. Normalized SQL Queries
- Select px/sql_queries.
  - Latency per normalized SQL query (constants replaced with ?).
- Click a query to see latency per individual parameter.
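To make "normalized SQL query" concrete, here is a rough plain-Python sketch that replaces string and numeric constants with `?`. It is a regex approximation written for illustration only, not the normalizer Pixie uses, and the function name and example queries are hypothetical.

```python
import re

# Single-quoted string literals, allowing backslash escapes and doubled quotes.
_STRING_LITERAL = re.compile(r"'(?:[^'\\]|\\.|'')*'")
# Standalone integer or decimal literals (digits inside identifiers are left alone).
_NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")


def normalize_sql(query):
    """Replace string and numeric constants in a SQL query with '?'."""
    query = _STRING_LITERAL.sub("?", query)
    return _NUMBER_LITERAL.sub("?", query)


if __name__ == "__main__":
    print(normalize_sql("SELECT name FROM tag WHERE tag_id = 3"))
    print(normalize_sql("SELECT name FROM tag WHERE tag_id = 42"))
```

Queries that differ only in their constants collapse to the same normalized text, so their latencies can be aggregated together and then broken down per parameter value.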
3. Full Body Requests
- Select px/mysql_data.
  - Full request and response bodies.
- Sort by descending latency to find slow queries.