Labels
Labels are key-value pairs that organize log entries into streams. Unlike systems that index the full text of every log line, Loki indexes only the label combinations that identify streams, and then searches the log contents within the selected streams.
Each unique label combination = separate stream = separate set of chunks.
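To make the "one label combination = one stream" rule concrete, here is a minimal sketch in plain Python. It does not use any Loki API; the helper names `stream_key` and `group_into_streams` are hypothetical and only illustrate that a stream is identified by its full set of label pairs, regardless of the order in which they are written.

```python
from collections import defaultdict


def stream_key(labels: dict[str, str]) -> tuple[tuple[str, str], ...]:
    """Order-independent identity of a stream: its sorted label pairs."""
    return tuple(sorted(labels.items()))


def group_into_streams(entries):
    """Group (labels, line) pairs by their exact label set."""
    streams = defaultdict(list)
    for labels, line in entries:
        streams[stream_key(labels)].append(line)
    return dict(streams)


entries = [
    ({"app": "frontend", "env": "prod"}, "GET /"),
    ({"env": "prod", "app": "frontend"}, "GET /cart"),  # same labels: same stream
    ({"app": "frontend", "env": "dev"}, "GET /"),       # one value differs: new stream
]
streams = group_into_streams(entries)
print(len(streams))  # 2
```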
Label service_name
Loki automatically tries to set the service_name label. If the label is not already present, it checks the following labels in order and uses the first one it finds: service, app, application, name, app_kubernetes_io_name, container, container_name, component, workload, job. If none exists, it assigns unknown_service.
This label powers Grafana features such as Logs Drilldown and Application Observability.
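The lookup described above can be sketched as a first-match search over an ordered list. This is an illustration of the rule as stated here, not Loki's actual implementation; the function name `discover_service_name` is hypothetical.

```python
# Lookup order as listed in this section.
DISCOVERY_ORDER = [
    "service_name", "service", "app", "application", "name",
    "app_kubernetes_io_name", "container", "container_name",
    "component", "workload", "job",
]


def discover_service_name(labels: dict[str, str]) -> str:
    """Return the value of the first label found in DISCOVERY_ORDER."""
    for name in DISCOVERY_ORDER:
        value = labels.get(name)
        if value:
            return value
    return "unknown_service"


print(discover_service_name({"job": "varlogs", "app": "checkout"}))  # checkout
print(discover_service_name({"env": "prod"}))                        # unknown_service
```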
Static vs Dynamic Labels
Static Labels
- Predefined, constant values, e.g., environment, cluster, region, namespace, app
- Limited, predictable number of values
- Recommended as the primary labeling strategy
- Example:
{env="prod", cluster="eu-west", app="frontend"}
Dynamic Labels
- Extracted from log content (e.g., via regex in pipelines)
- Can create many streams if they have many unique values
- Example: extracting action and status_code from Apache logs creates separate streams for each GET/POST × 200/400/500 combination
- Use with caution: each new value = new stream
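The Apache example can be sketched in Python to show how extraction fans one log file out into several streams. The regex is a simplified assumption about the access-log layout, not a pattern taken from any Loki pipeline, and the sample lines are made up for illustration.

```python
import re

# Simplified pattern: quoted request ("METHOD path protocol") followed by the status.
APACHE_RE = re.compile(r'"(?P<action>[A-Z]+) \S+ [^"]*" (?P<status_code>\d{3})')


def dynamic_labels(line: str) -> dict[str, str]:
    """Extract action and status_code from one access-log line."""
    match = APACHE_RE.search(line)
    return match.groupdict() if match else {}


lines = [
    '10.0.0.1 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2024:13:55:37 +0000] "POST /login HTTP/1.1" 400 512',
    '10.0.0.3 - - [10/Oct/2024:13:55:38 +0000] "GET /missing HTTP/1.1" 500 128',
    '10.0.0.4 - - [10/Oct/2024:13:55:39 +0000] "GET /about.html HTTP/1.1" 200 1024',
]

# Each distinct (action, status_code) pair would become its own stream.
streams = {tuple(sorted(dynamic_labels(line).items())) for line in lines}
print(len(streams))  # 3
```

Four lines from a single file already produce three streams; with more methods and status codes the count grows towards the full product of their values.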
Cardinality
Cardinality is the number of unique combinations of label names and values. Each combination is one stream, so cardinality directly determines the number of streams.
Why is high cardinality a problem?
- Large indexes with many chunks → slower queries, more I/O operations
- Many open chunks → high RAM cost on Ingesters
- Example: adding a user_id label with 100k values = 100k streams × retention = cost explosion
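A rough way to reason about this is to multiply the number of values each label can take. The sketch below computes that upper bound; it assumes every combination actually occurs, which is a worst case, and the label counts are invented for illustration.

```python
from math import prod


def max_streams(values_per_label: dict[str, int]) -> int:
    """Upper bound on streams: product of the value counts of all labels."""
    return prod(values_per_label.values())


static = {"env": 3, "cluster": 4, "app": 20}  # hypothetical value counts
print(max_streams(static))                           # 240
print(max_streams({**static, "user_id": 100_000}))   # 24000000
```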
Approximate thresholds
- Dynamic labels should have a few dozen values at most
- Below 100,000 active streams per tenant
- For large tenants (10+ TB/day) — below one million streams within 24h
- Smaller tenants should maintain proportionally fewer
Diagnostic tool
logcli series '{}' --analyze-labels
Helps identify problematic labels with high cardinality.
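The idea behind such an analysis can be approximated offline: count the distinct values seen for each label and sort by that count. This is a rough sketch of the concept, not a reimplementation of logcli, and its output format is not the one logcli prints; `analyze_labels` is a hypothetical helper.

```python
from collections import defaultdict


def analyze_labels(series: list[dict[str, str]]) -> list[tuple[str, int]]:
    """Return (label, distinct value count), highest cardinality first."""
    values = defaultdict(set)
    for labels in series:
        for name, value in labels.items():
            values[name].add(value)
    counts = [(name, len(seen)) for name, seen in values.items()]
    return sorted(counts, key=lambda item: (-item[1], item[0]))


series = [
    {"app": "api", "pod": "api-1"},
    {"app": "api", "pod": "api-2"},
    {"app": "web", "pod": "web-1"},
]
print(analyze_labels(series))  # [('pod', 3), ('app', 2)]
```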
Best Practices
What should be a label?
- Regions, clusters, servers
- Applications, namespaces
- Environments (dev/staging/prod)
- In general: things that describe the source of logs, not their content
What should NOT be a label?
- Log level, message content, exception names — search with filters within the stream
- Trace ID, request ID, order ID — ephemeral values
- User ID, customer ID — high cardinality
- IP addresses, timestamps — unlimited number of values
Filter is often better than a label
Surprisingly, content filters are often just as fast as additional labels, and they don’t fragment logs into multiple streams:
# ❌ Instead of this (creates additional streams):
{app="loki", level="error"}
# ✅ Better like this (one stream, content filter):
{app="loki"} |= "level=error"
When to add a label?
- Only when log volume in a single stream is large enough to fill a chunk (chunk_target_size ~1.5 MB) before max_chunk_age expires
- Approximately: the application generates 5+ MB of uncompressed logs before max_chunk_age expires
- Diagnostic query (reason for chunk flush):
sum by (reason) (rate(loki_ingester_chunks_flushed_total{cluster="dev"}[1m]))
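The 5 MB rule of thumb above reduces to simple arithmetic: log rate multiplied by max_chunk_age. The sketch below assumes a max_chunk_age of 2 hours purely for illustration (check your own configuration), and `fills_chunk_in_time` is a hypothetical helper.

```python
FIVE_MB = 5 * 1024 * 1024  # rule-of-thumb threshold for uncompressed logs


def fills_chunk_in_time(bytes_per_second: float,
                        max_chunk_age_seconds: float,
                        target_uncompressed_bytes: int = FIVE_MB) -> bool:
    """True if the stream writes enough data before max_chunk_age expires."""
    return bytes_per_second * max_chunk_age_seconds >= target_uncompressed_bytes


MAX_CHUNK_AGE = 2 * 60 * 60  # assumed value: 2 hours, in seconds

print(fills_chunk_in_time(200, MAX_CHUNK_AGE))    # False (about 1.4 MB in 2 h)
print(fills_chunk_in_time(1000, MAX_CHUNK_AGE))   # True  (about 7.2 MB in 2 h)
```

If the result is False for a stream, splitting it further with another label would only produce more underfilled chunks.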
General rules
- Aim for 10–15 labels maximum
- Start with brute force using filters (|= "text", |~ "regex") and add labels only when that’s not sufficient
- Labels should have a finite, predictable number of values
- Label names must match the regex: [a-zA-Z_:][a-zA-Z0-9_:]*
- Avoid the __ prefix (reserved for internal labels)
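The two naming rules above are easy to check before shipping a new label. A minimal sketch using the regex quoted in this section; `check_label_name` is a hypothetical helper, not part of any Loki client.

```python
import re

LABEL_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def check_label_name(name: str) -> str:
    """Classify a label name as 'ok', 'reserved', or 'invalid'."""
    if not LABEL_NAME_RE.fullmatch(name):
        return "invalid"
    if name.startswith("__"):
        return "reserved"  # double-underscore prefix is for internal labels
    return "ok"


for candidate in ["service_name", "__internal", "9lives", "k8s.pod.name"]:
    print(candidate, "->", check_label_name(candidate))
```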
Structured Metadata
For high-cardinality data that you don’t want in labels but need in queries — use structured metadata. It allows storing additional context without creating new streams and without expanding the index.
Enabled by default in Loki 3.0+; on earlier versions that support it, enable it manually:
limits_config:
  allow_structured_metadata: true
In the OpenTelemetry context: resource attributes that were not promoted to labels automatically go to structured metadata.
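Outside of OTel, structured metadata can also be attached when pushing through Loki's JSON push API (/loki/api/v1/push), where each entry may carry an optional third element with key-value metadata. The sketch below only builds the request body and sends nothing; the label and metadata values are made up, and `build_push_payload` is a hypothetical helper.

```python
import json
import time


def build_push_payload(labels, line, metadata, ts_ns=None):
    """Build a push body with one entry that carries structured metadata."""
    ts_ns = time.time_ns() if ts_ns is None else ts_ns
    return {
        "streams": [
            {
                "stream": labels,  # low-cardinality index labels
                # [timestamp in ns as string, log line, structured metadata]
                "values": [[str(ts_ns), line, metadata]],
            }
        ]
    }


payload = build_push_payload(
    labels={"app": "checkout", "env": "prod"},
    line="payment failed",
    metadata={"user_id": "u-123", "trace_id": "abc123"},
    ts_ns=1700000000000000000,
)
print(json.dumps(payload, indent=2))
```

The stream is still identified only by app and env; user_id and trace_id travel with the entry and can then be used in a label filter such as {app="checkout"} | user_id="u-123".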
OpenTelemetry Compatibility
Loki natively ingests OTel logs over HTTP; send them with the OpenTelemetry Collector’s otlphttp exporter. Endpoint: http://<loki-addr>/otlp
OTel Data → Loki Mapping
| OTel Data | Where in Loki | Notes |
|---|---|---|
| Resource attributes (selected) | Index labels | Identify log source |
| Resource attributes (remaining) | Structured metadata | Don’t create streams |
| Scope attributes | Structured metadata | |
| Log attributes | Structured metadata | |
| LogRecord.Body | Log line | Non-string values converted to strings |
| LogRecord.TimeUnixNano | Timestamp | Fallback: ObservedTimestamp, then ingestion time |
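The timestamp fallback in the last row can be sketched as a first-non-empty choice. This mirrors the order stated in the table and is not Loki's code; it relies on the OTLP convention that a timestamp of 0 means "unknown", and `resolve_timestamp` is a hypothetical helper.

```python
def resolve_timestamp(time_unix_nano: int,
                      observed_time_unix_nano: int,
                      ingestion_time_ns: int) -> int:
    """Pick the entry timestamp: event time, then observed time, then ingestion time."""
    for candidate in (time_unix_nano, observed_time_unix_nano):
        if candidate:  # 0 means the field was not set
            return candidate
    return ingestion_time_ns


print(resolve_timestamp(0, 1700000000000000001, 1700000000000000002))
```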
Default resource attributes as index labels
By default, 17 resource attributes become labels:
| Category | Attributes |
|---|---|
| Cloud | cloud.availability_zone, cloud.region |
| Container | container.name |
| Kubernetes | k8s.cluster.name, k8s.namespace.name, k8s.pod.name, k8s.deployment.name, k8s.daemonset.name, k8s.statefulset.name, k8s.job.name, k8s.cronjob.name, k8s.replicaset.name, k8s.container.name |
| Service | service.name, service.namespace, service.instance.id |
| Deployment | deployment.environment.name |
⚠️ Note: k8s.pod.name and service.instance.id have high cardinality; for new deployments, it’s recommended to move them to structured metadata.
Loki has a default limit of 15 index labels per stream (max_label_names_per_series), so choose carefully which attributes to promote.
Name normalization
- Dots (.) are replaced with underscores (_), e.g., service.name → service_name
- Nested attributes are flattened with underscores
- Non-string values are converted to strings
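The three normalization rules above can be sketched as one recursive function. This illustrates the rules as listed, not Loki's exact behavior: in particular, Python's str() may format some values (such as booleans) differently from Loki. `flatten_attributes` is a hypothetical helper.

```python
def flatten_attributes(attrs: dict, prefix: str = "") -> dict[str, str]:
    """Flatten nested attributes and normalize names and values."""
    flat = {}
    for key, value in attrs.items():
        name = f"{prefix}_{key}" if prefix else key
        name = name.replace(".", "_")           # dots become underscores
        if isinstance(value, dict):
            flat.update(flatten_attributes(value, name))  # nested: join with "_"
        else:
            flat[name] = value if isinstance(value, str) else str(value)
    return flat


attrs = {"service.name": "cart", "http": {"status": 200, "route": "/pay"}}
print(flatten_attributes(attrs))
# {'service_name': 'cart', 'http_status': '200', 'http_route': '/pay'}
```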
OTel Collector configuration
exporters:
  otlphttp:
    endpoint: http://<loki-addr>/otlp
service:
  pipelines:
    logs:
      exporters: [..., otlphttp]
Custom attribute mapping (per tenant)
You can change the default mapping via limits_config.otlp_config:
limits_config:
  otlp_config:
    resource_attributes:
      ignore_defaults: true  # Skip default mappings
      attributes_config:
        - action: index_label
          attributes: [service.group]
        - action: structured_metadata
          attributes: [k8s.pod.name]  # Move from labels to metadata
        - action: drop
          attributes: [telemetry.sdk.version]
    log_attributes:
      - action: structured_metadata
        attributes: [user.id]
Available actions: index_label, structured_metadata, drop
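To see how these actions interact with ignore_defaults, here is a simplified model of the routing for resource attributes. It is a sketch under stated assumptions: it matches exact attribute names only, sends anything without a rule to structured metadata, and uses a shortened, hypothetical default list. `route_resource_attributes` is not a Loki function.

```python
def route_resource_attributes(attrs, rules, default_labels=(), ignore_defaults=False):
    """Split attributes into index labels and structured metadata; drop the rest."""
    action_for = {name: rule["action"] for rule in rules for name in rule["attributes"]}
    routed = {"index_label": {}, "structured_metadata": {}}
    for name, value in attrs.items():
        action = action_for.get(name)
        if action is None:  # no explicit rule: fall back to defaults
            promote = not ignore_defaults and name in default_labels
            action = "index_label" if promote else "structured_metadata"
        if action != "drop":
            routed[action][name] = value
    return routed


rules = [
    {"action": "index_label", "attributes": ["service.group"]},
    {"action": "structured_metadata", "attributes": ["k8s.pod.name"]},
    {"action": "drop", "attributes": ["telemetry.sdk.version"]},
]
attrs = {
    "service.group": "payments",
    "service.name": "cart",
    "k8s.pod.name": "cart-7d9f",
    "telemetry.sdk.version": "1.2.0",
}
defaults = {"service.name", "k8s.pod.name"}  # shortened list for the example

print(route_resource_attributes(attrs, rules, defaults, ignore_defaults=True))
```

With ignore_defaults set to true, service.name is no longer promoted automatically, so in this model it lands in structured metadata unless a rule lists it under index_label.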