Grafana Mimir

Mimir is the long-term metrics storage backend. It provides 100% Prometheus compatibility (PromQL, remote_write API) while adding horizontal scalability, multi-tenancy, and durable object storage.

Role in the Stack

Function	Details
Long-term retention	Stores metrics in Azure Blob Storage for weeks/months
Horizontal scalability	Each component scales independently
Multi-tenancy	Isolates metrics by tenant via `X-Scope-OrgID` header
Prometheus compatibility	100% PromQL — same queries work against both Prometheus and Mimir
Exemplar storage	Stores exemplars for trace correlation on long-term metrics
Caching	Memcached for chunks, query results, and metadata

Why Mimir Over Prometheus Alone?

Challenge with Prometheus	Mimir Solution
Limited retention (memory + local disk)	Object storage = unlimited retention
Memory constraints on high cardinality	Horizontal scaling of ingesters
No multi-tenancy	Built-in tenant isolation
Query latency on large datasets	Query splitting, parallelism, Memcached caching
Single point of failure	Replicated components, HA by design

Why Not Only Mimir?

If Mimir solves all of Prometheus’s limitations, why keep Prometheus at all? Because they serve different roles:

Concern	Why Prometheus Stays
Query latency	Prometheus serves recent metrics straight from memory and local disk, so queries over the last few hours return with very low latency. In Mimir, ingesters hold only the most recent data; anything older is read from object storage through the store-gateways
Alerting evaluation	Prometheus evaluates alert rules locally against in-memory data. Using Mimir for alerting adds network hops and dependency on the full Mimir stack being healthy
Operational simplicity	Prometheus is a single binary with no external dependencies. If Mimir’s ingesters, store-gateways, or object storage have issues, Prometheus still works
Bootstrapping	Prometheus can run without Mimir. Mimir cannot run without something writing metrics to it
Cost	Not every metric needs long-term retention. Prometheus handles short-lived, high-churn metrics cheaply without sending them to object storage

In our setup the split is intentional:

Prometheus = fast, local, always-available metrics for the last few hours + alerting
Mimir = durable, scalable storage for long-term queries and dashboards

Both are exposed as separate Grafana datasources — use Prometheus for real-time dashboards and alerts, Mimir for historical analysis and capacity planning.

Versions


Component	Version
Chart	`grafana/mimir-distributed` 6.0.6
Mimir	3.0.4
Strimzi Kafka Operator	`strimzi/strimzi-kafka-operator` chart 1.0.0
Kafka	Apache Kafka 4.1.0 (`quay.io/strimzi/kafka:1.0.0-kafka-4.1.0`)

Deployment — Microservices Mode (Kafka-backed ingest storage)

Mimir 3.0 (chart mimir-distributed 6.0+) introduced a new write-path architecture: distributors no longer push samples directly to ingesters over gRPC. Instead, distributors produce to a Kafka topic, and ingesters consume from it asynchronously. This decouples the write path from the ingester ring — a distributor’s POST /api/v1/push completes as soon as Kafka has accepted the write.

Component	Replicas	CPU	RAM	Storage	Purpose
Nginx Gateway	2	10m	16Mi	—	Entry point, load balancing
Distributor	2	100m	256Mi	—	Validates writes, produces to Kafka topic `mimir-ingest`
Kafka (Strimzi)	3 (KRaft mixed-mode)	500m	1Gi	20Gi PV each	Durable write buffer between distributors and ingesters (100 partitions, RF=3, min.insync.replicas=2)
Ingester	3 (zone-a/b/c)	100m	512Mi	10Gi PV	Consumes its Kafka partitions, builds TSDB blocks, flushes to object storage
Querier	2	100m	256Mi	—	Parallel query execution
Query Frontend	2	100m	256Mi	—	Query splitting, caching via Memcached
Query Scheduler	2	10m	32Mi	—	Fair queuing across tenants
Store Gateway	2	100m	256Mi	10Gi PV	Reads historical blocks from object storage
Compactor	1	100m	256Mi	10Gi PV	Merges blocks, enforces retention
Ruler	1	100m	256Mi	—	Evaluates recording rules and alerts
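As a rough illustration, part of the sizing table could map onto mimir-distributed Helm values along these lines. This is a sketch covering only a few components. It assumes the CPU and RAM columns are resource requests (the table does not say), and the top-level keys should be checked against the values.yaml of the chart version in use.

```yaml
# values.yaml (sketch): a subset of the sizing table
distributor:
  replicas: 2
  resources:
    requests:          # assumed to be requests, not limits
      cpu: 100m
      memory: 256Mi

ingester:
  replicas: 3          # total across zones
  zoneAwareReplication:
    enabled: true      # one StatefulSet per zone
  persistentVolume:
    size: 10Gi
  resources:
    requests:
      cpu: 100m
      memory: 512Mi

store_gateway:
  replicas: 2
  persistentVolume:
    size: 10Gi

compactor:
  replicas: 1
  persistentVolume:
    size: 10Gi
```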

The Kafka cluster is provisioned by the Strimzi Kafka Operator (a CNCF project; operator chart strimzi/strimzi-kafka-operator 1.0.0). It runs three KRaft mixed-mode nodes, each acting as both controller and broker, so no ZooKeeper is needed. Topics use default.replication.factor: 3 and min.insync.replicas: 2. Production deployments would swap this for a separately operated Kafka cluster (Strimzi at scale, MSK, Confluent Cloud); the workshop’s 3-broker layout is enough to demonstrate the architecture and to keep accepting writes after the loss of one broker. The Mimir distributors and ingesters connect to a single bootstrap service: mimir-kafka-kafka-bootstrap.monitoring.svc.cluster.local:9092.
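The Kafka connection might be expressed in Mimir's configuration roughly as follows. The bootstrap address and topic name come from this page. The key names and their placement under the chart's mimir.structuredConfig are assumptions that should be verified against the configuration reference for the Mimir version in use.

```yaml
# values.yaml (sketch): point distributors and ingesters at Kafka
mimir:
  structuredConfig:
    ingest_storage:
      enabled: true
      kafka:
        # Single bootstrap service exposed by Strimzi
        address: mimir-kafka-kafka-bootstrap.monitoring.svc.cluster.local:9092
        # Topic the distributors produce to and the ingesters consume from
        topic: mimir-ingest
```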

What Feeds Into Mimir

Source	Signal	Path
Prometheus	All scraped metrics	Prometheus remote write → `mimir-gateway:80/api/v1/push` → distributor → Kafka → ingester
Tempo	Span metrics (RED), service graphs, TraceQL metrics	Tempo metrics generator → Prometheus → Mimir (same Kafka write path)
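The Prometheus side of this path is a remote_write entry. The fragment below is a minimal sketch: the gateway host is taken from the datasource URL on this page, and the tenant ID `workshop` is a placeholder rather than the value used in this setup.

```yaml
# prometheus.yml (fragment)
remote_write:
  - url: http://mimir-gateway.monitoring.svc.cluster.local:80/api/v1/push
    # Mimir selects the tenant from this header; "workshop" is a placeholder.
    headers:
      X-Scope-OrgID: workshop
    # Forward exemplars so metric-to-trace links reach long-term storage.
    # Prometheus must also run with --enable-feature=exemplar-storage.
    send_exemplars: true
```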

Storage

Backend: Azure Blob Storage
Containers: mimir-storage (blocks), mimir-blocks, mimir-alertmanager, mimir-ruler
Format: Prometheus TSDB blocks
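A sketch of how these containers might be wired into Mimir's storage configuration is shown below. The account name and key are placeholders, and the mapping of container to storage block is an assumption, since the list above does not say which container each subsystem uses.

```yaml
# Mimir configuration (sketch): Azure Blob Storage backend
common:
  storage:
    backend: azure
    azure:
      account_name: "<storage-account>"      # placeholder
      account_key: "<storage-account-key>"   # placeholder; inject from a Secret

blocks_storage:
  azure:
    container_name: mimir-blocks             # assumed mapping

alertmanager_storage:
  azure:
    container_name: mimir-alertmanager

ruler_storage:
  azure:
    container_name: mimir-ruler
```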

Limits

Limit	Value
Max global series per user	150,000
Max global series per metric	20,000
Ingestion rate	10,000 samples/sec
Ingestion burst size	200,000 samples
Out-of-order time window	10 minutes
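Expressed as Mimir configuration, the limits table would look roughly like this. The values are the ones listed above. Placing them under the chart's mimir.structuredConfig is an assumption about how this deployment is configured.

```yaml
# values.yaml (sketch): default per-tenant limits
mimir:
  structuredConfig:
    limits:
      max_global_series_per_user: 150000
      max_global_series_per_metric: 20000
      ingestion_rate: 10000          # samples per second, per tenant
      ingestion_burst_size: 200000   # samples
      out_of_order_time_window: 10m
```

Per-tenant exceptions would normally go into runtime overrides rather than into these defaults.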

Integration with Other Components

Tempo metrics generator → Prometheus → Mimir — Tempo generates RED metrics (rate, errors, duration), service graphs, and TraceQL metrics from trace spans and writes them to Prometheus via remote write. Prometheus then forwards them to Mimir for long-term storage. This is the source data for Traces Drilldown and the Service Map in Grafana.

Grafana exemplars — Mimir stores exemplars (metric → trace ID links) enabling click-through from metric graphs to traces in Tempo.

Grafana Datasource

Type: prometheus (100% compatible)
URL: http://mimir-gateway.monitoring.svc.cluster.local:80/prometheus
Exemplars: Enabled — links to Tempo by TraceID
Use for: Long-term queries, dashboards with wide time ranges, span-derived metrics
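For reference, a Grafana provisioning file for this datasource could look like the sketch below. The tenant ID, the exemplar label name, and the Tempo datasource UID are assumptions and need to match the actual setup.

```yaml
# grafana/provisioning/datasources/mimir.yaml (sketch)
apiVersion: 1
datasources:
  - name: Mimir
    type: prometheus
    access: proxy
    url: http://mimir-gateway.monitoring.svc.cluster.local:80/prometheus
    jsonData:
      # Send the tenant header with every query
      httpHeaderName1: X-Scope-OrgID
      exemplarTraceIdDestinations:
        - name: traceID          # exemplar label holding the trace ID (assumed)
          datasourceUid: tempo   # UID of the Tempo datasource (assumed)
    secureJsonData:
      httpHeaderValue1: workshop # placeholder tenant ID
```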

Auto-Scaling Best Practices

Mimir is built for horizontal scaling in microservices mode. Stateless components can be auto-scaled independently with a Kubernetes HPA; stateful components need more care, as the table below summarizes.

Which Components to Auto-Scale

Component	Auto-scalable?	Scale trigger	Notes
Distributor	✅ Yes	CPU, incoming sample rate	Stateless — scale freely
Ingester	⚠️ With care	Memory, active series	Stateful — ring member, holds data in memory (~4h before flush)
Querier	✅ Yes	CPU, query queue depth	Stateless — more queriers = faster query execution
Query Frontend	⚠️ Rarely needed	—	2 replicas usually enough — splits queries, doesn’t execute them
Query Scheduler	❌ No	—	Lightweight, 2 replicas fixed
Store Gateway	✅ Yes	CPU, memory	Stateful (caches blocks), but supports ring-based sharding
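Compactor	❌ No	—	Not auto-scaled here; runs as a single replica. Mimir can shard compaction across several compactors, but adding them is a manual sizing decision, not an HPA target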
Ruler	✅ Yes	CPU, number of rules	Uses ring for rule sharding

HPA Examples

Distributor:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mimir-distributor
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mimir-distributor
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Querier:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mimir-querier
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mimir-querier
  minReplicas: 2
  maxReplicas: 15
  metrics:
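    # Queue length is reported by the query-scheduler, not by querier pods,
    # so it is consumed as an External metric. This requires a metrics
    # adapter that exposes it (e.g. prometheus-adapter or KEDA).
    - type: External
      external:
        metric:
          name: cortex_query_scheduler_queue_length
        target:
          type: AverageValue
          averageValue: "5"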

Ingester Scaling — Critical Considerations

Mimir ingesters under the Kafka-backed ingest_storage architecture are still stateful, but their durability story is different from the classic architecture: durability is provided by Kafka, not by the ingester ring. An ingester restart no longer risks data loss for samples already accepted by the distributor — those samples are persisted in the Kafka topic. The ingester’s job is to consume its assigned partitions and turn them into TSDB blocks.

Partition ring

Each ingester registers in an ingester-partitions ring (separate from the legacy ingester ring used for query-time lookups). A partition transitions from PartitionPending → PartitionActive once a minimum number of owners (default 1) have been registered for the minimum waiting time (default 10s). Distributors only produce to active partitions.

If the partition ring is empty, distributors reject writes with DoBatch: InstancesCount <= 0. This is a symptom of one of three conditions: no ingesters are running, ingesters are stuck before partition registration, or stale PVC state is left over from a previous classic-architecture deployment.

Safe scale-up:

New ingester joins the partition ring and claims its share of partitions
No live data migration between ingesters — Kafka holds the source of truth
Topic must have enough partitions for the target ingester count (chart default: 100 partitions, comfortable up to ~100 ingesters)

Safe scale-down:

Ingester unsubscribes from its Kafka partitions on shutdown
Active partitions are reclaimed by surviving owners after the heartbeat-timeout window
Generous terminationGracePeriodSeconds is still recommended (e.g., 300s) so the ingester can flush its current TSDB head block to object storage before exiting; failure to do so just means the block is re-built from Kafka by the new owner (some duplicate I/O, no data loss)

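# Pod template fragment for the ingester StatefulSet
spec:
  terminationGracePeriodSeconds: 300   # pod-level: time allowed for the head block flush
  containers:
    - name: ingester
      lifecycle:                       # container-level hook
        preStop:
          httpGet:
            path: /ingester/shutdown
            port: http-metrics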
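Rolling updates and node drains are the other common cause of ingester restarts. A PodDisruptionBudget that allows one ingester to be unavailable at a time could be sketched as follows. The resource name and selector labels are assumptions and must match the labels on the ingester pods in the actual deployment.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mimir-ingester            # hypothetical name
  namespace: monitoring
spec:
  # At most one ingester may be voluntarily evicted at any time
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: ingester   # assumed pod label
```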

Store Gateway Scaling

Store gateways cache block metadata and index data on local disk. Scaling them improves query performance for historical data.

Uses ring-based sharding — each store gateway is responsible for a subset of blocks
New replicas join the ring and gradually take ownership of blocks
Scale based on query latency for wide time-range queries
Ensure PVC provisioning is fast — slow disk attachment delays scale-up
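Because store gateways take time to load their share of blocks, scale-down is best kept slow. The sketch below uses the standard HPA behavior field to remove at most one replica every ten minutes. The StatefulSet name, replica bounds, and thresholds are illustrative assumptions, and a zone-aware deployment would need one such object per zone.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mimir-store-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: mimir-store-gateway     # assumed name
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      # Wait 30 minutes of sustained low load before shrinking
      stabilizationWindowSeconds: 1800
      policies:
        - type: Pods
          value: 1                # remove at most one replica...
          periodSeconds: 600      # ...per 10-minute window
```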

Key Metrics for Auto-Scaling

# Distributor — incoming samples/sec
rate(cortex_distributor_received_samples_total[5m])

# Ingester — active series (primary memory driver)
cortex_ingester_active_series

# Querier — queue depth
cortex_query_scheduler_queue_length

# Store Gateway — block load time
cortex_bucket_store_block_load_duration_seconds

# Overall write health: push request rate by status code
sum by (status_code) (rate(cortex_request_duration_seconds_count{route="api_v1_push"}[5m]))
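When these expressions feed an autoscaler, it often helps to pre-aggregate them as recording rules so the scaler reads a single cheap series. The rule file below is a sketch; the rule names are made up for illustration and follow the usual level:metric:operation convention.

```yaml
# Prometheus or Mimir ruler rule file (sketch)
groups:
  - name: mimir-autoscaling
    interval: 30s
    rules:
      # Cluster-wide incoming samples per second
      - record: mimir:distributor_received_samples:rate5m
        expr: sum(rate(cortex_distributor_received_samples_total[5m]))
      # Total active series held by ingesters
      - record: mimir:ingester_active_series:sum
        expr: sum(cortex_ingester_active_series)
      # Queries waiting in the query-scheduler
      - record: mimir:query_scheduler_queue_length:sum
        expr: sum(cortex_query_scheduler_queue_length)
```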

General Guidelines

Distributors are the easiest win — stateless, scale aggressively on CPU or incoming sample rate
Queriers are the second priority — scale on queue depth for faster query response
Ingesters still need a PodDisruptionBudget (maxUnavailable: 1) so partition ownership can rebalance cleanly during rolling updates
Keep at least one ingester per zone (minReplicas ≥ 1 per zone) when zone-aware replication is on; with ingest_storage, durability comes from Kafka rather than from the number of ingesters
Kafka is the new potential bottleneck on the write path: monitor ingester consumer lag, produce latency, and broker resource pressure, and scale partition count or broker resources before scaling ingesters
Store gateways benefit from more replicas when query latency on historical data is high
Write path and read path scale independently — ingestion spikes don’t correlate with query load
Monitor partition-ring health after every scaling event: cortex_ingester_partition_ring_partitions — partitions stuck in PartitionPending mean ownership isn’t being claimed
Use KEDA for custom metric-based scaling (e.g., consumer lag, active series) when HPA with Prometheus adapter is too complex
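The KEDA route mentioned in the last guideline could look like the sketch below, which scales queriers on query-scheduler queue depth using KEDA's Prometheus scaler. The Prometheus address is an assumption, and the replica bounds and threshold simply mirror the querier HPA example.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: mimir-querier
  namespace: monitoring
spec:
  scaleTargetRef:
    name: mimir-querier           # Deployment is the default target kind
  minReplicaCount: 2
  maxReplicaCount: 15
  triggers:
    - type: prometheus
      metadata:
        # Assumed in-cluster Prometheus address
        serverAddress: http://prometheus.monitoring.svc.cluster.local:9090
        query: sum(cortex_query_scheduler_queue_length)
        threshold: "5"            # target queued queries per querier
```

KEDA creates and manages its own HPA for the target, so a ScaledObject should replace a hand-written HPA for the same Deployment rather than run alongside it.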
