Components
Distributor
Responsible for:
- Handling incoming streams from clients
- How it works:
- Each stream is validated for correctness
- Rate limiting according to tenant or global limits
- Chunks are then split into batches
- Sent to multiple ingesters in parallel. To avoid data loss, they are replicated according to the replication_factor. Acknowledgement follows ring conditions.
- Logs are distributed by label hash
- Who gets what is maintained in a hash ring (default: memberlist/gossip, optionally Consul or etcd)
- Each ingester owns a range of hash values, and a stream's hash determines which ingesters receive it.
- Stateless
- Can be scaled independently
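The ring lookup described above can be sketched in a few lines of Python. This is only an illustration, not Loki's implementation: the names Ring, stream_token, quorum and tokens_per_ingester are hypothetical, SHA-256 stands in for whatever hash the real ring uses, and the majority quorum is an assumption about what "ring conditions" means for acknowledgement.

```python
import bisect
import hashlib


def stream_token(tenant: str, labels: dict) -> int:
    """Hash the tenant and its sorted labels into a 32-bit token (illustrative)."""
    key = tenant + "|" + ",".join(f"{k}={v}" for k, v in sorted(labels.items()))
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


class Ring:
    """Toy hash ring: every ingester registers several tokens on the ring."""

    def __init__(self, ingesters, tokens_per_ingester=8, replication_factor=3):
        self.replication_factor = replication_factor
        self.tokens = []
        for name in ingesters:
            for i in range(tokens_per_ingester):
                digest = hashlib.sha256(f"{name}-{i}".encode()).digest()
                self.tokens.append((int.from_bytes(digest[:4], "big"), name))
        self.tokens.sort()
        self._token_values = [value for value, _ in self.tokens]

    def replicas_for(self, token: int) -> list:
        """Walk clockwise from the token and collect distinct ingesters."""
        start = bisect.bisect_left(self._token_values, token)
        owners = []
        for offset in range(len(self.tokens)):
            _, name = self.tokens[(start + offset) % len(self.tokens)]
            if name not in owners:
                owners.append(name)
            if len(owners) == self.replication_factor:
                break
        return owners


def quorum(replication_factor: int) -> int:
    """Assumed acknowledgement rule: a simple majority of the replicas."""
    return replication_factor // 2 + 1


ring = Ring(["ingester-1", "ingester-2", "ingester-3", "ingester-4"])
token = stream_token("tenant-a", {"app": "api", "env": "prod"})
owners = ring.replicas_for(token)
```

Because the token depends only on the tenant and labels, every entry of a given stream lands on the same set of ingesters, which is what lets the distributor itself stay stateless.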
Ingester
- Responsible for writing to persistent storage
- Registered in the ring
- Write process:
- Data is written in chunks
- Chunks are held in memory, compressed, and marked as read-only when:
- Chunk size exceeds the write threshold (e.g., block size in Azure Table storage) - configurable value
- Enough time has passed without a write
- A flush occurred - e.g., the system is shutting down
- Flush is based on: tenant, labels, and content.
- Out of order writes:
- Since Loki 2.4, out-of-order logs are supported (unordered_writes: true)
- Enabled by default: logs can have older timestamps than the last written entry in a given stream
- Important in distributed environments where logs from different sources arrive with delays
- Why are chunks important?
- A chunk is the basic unit of writing — its size and lifetime determine read performance and storage costs
- Too many small chunks = large index, slow queries, high I/O operation costs
- Too large chunks = long wait for flush, risk of data loss in memory during failures
- Goal: a chunk should fill up to chunk_target_size (default ~1.5 MB) before max_chunk_age expires
- Best practices
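The conditions under which an ingester closes a chunk, listed in the write process above, can be sketched as a small decision function. This is a simplified model and not Loki code: Chunk and cut_reason are hypothetical names, the size is counted on uncompressed bytes for simplicity, and the idle and age values are illustrative assumptions.

```python
from dataclasses import dataclass

CHUNK_TARGET_SIZE = 1_572_864   # bytes, about 1.5 MB as mentioned in the text
CHUNK_IDLE_PERIOD = 30 * 60     # seconds without a write; assumed value
MAX_CHUNK_AGE = 2 * 60 * 60     # seconds since creation; assumed value


@dataclass
class Chunk:
    created_at: float
    last_write_at: float
    size_bytes: int = 0

    def append(self, line: str, now: float) -> None:
        """Record a new log line and remember when it was written."""
        self.size_bytes += len(line.encode())
        self.last_write_at = now


def cut_reason(chunk: Chunk, now: float, shutting_down: bool = False):
    """Return why the chunk should become read-only, or None to keep it open."""
    if shutting_down:
        return "shutdown"
    if chunk.size_bytes >= CHUNK_TARGET_SIZE:
        return "full"
    if now - chunk.last_write_at >= CHUNK_IDLE_PERIOD:
        return "idle"
    if now - chunk.created_at >= MAX_CHUNK_AGE:
        return "max_age"
    return None


chunk = Chunk(created_at=0, last_write_at=0)
chunk.append("hello", now=10)
```

A stream that is mostly cut for "idle" or "max_age" rather than "full" is producing the small chunks that the notes above warn about.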
Query frontend
- Optional service
- For accelerating queries.
- When enabled → queries should go to it rather than directly to the Querier
- What does it do?
- Optimizes queries
- Queues requests
- Retries large queries on failure
- Distributes large queries across multiple Queriers
- Splits large queries into smaller ones and executes them independently
- Caches queries and results
- Stateless, but should be run with at least 2 replicas
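Splitting a large query into smaller ones can be pictured as cutting its time range into fixed intervals and running the pieces in parallel. The sketch below is illustrative only: split_by_interval, run_split and run_subquery are hypothetical names, a thread pool stands in for the pool of Queriers, and no alignment to interval boundaries, queueing, or caching is attempted.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta


def split_by_interval(start: datetime, end: datetime, interval: timedelta):
    """Cut [start, end) into consecutive sub-ranges of at most `interval`."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    parts = []
    cursor = start
    while cursor < end:
        upper = min(cursor + interval, end)
        parts.append((cursor, upper))
        cursor = upper
    return parts


def run_split(run_subquery, start, end, interval, max_workers=4):
    """Run each sub-range independently and merge results in time order."""
    parts = split_by_interval(start, end, interval)
    merged = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results stay chronological.
        for result in pool.map(lambda part: run_subquery(*part), parts):
            merged.extend(result)
    return merged


start = datetime(2024, 1, 1, 0, 0)
end = datetime(2024, 1, 1, 2, 30)
parts = split_by_interval(start, end, timedelta(hours=1))
```

Each sub-range is small enough to be retried or cached on its own, which is why splitting helps with both large queries and repeated ones.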
Querier
- Executes queries
- First queries Ingesters for what they have in memory
- Then queries storage for data that has already been flushed
- Since there are multiple replicas, it deduplicates results
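The deduplication step can be sketched as a merge that keeps only one copy of each entry. This is a minimal illustration under an assumption: two entries are treated as duplicates when their timestamp, label set, and log line all match. The function name merge_and_dedupe and the dictionary layout of an entry are hypothetical.

```python
def merge_and_dedupe(*sources):
    """Merge entries from ingesters and storage, dropping replicated copies."""
    seen = set()
    merged = []
    for source in sources:
        for entry in source:
            # Assumed identity of an entry: timestamp, labels, and line.
            key = (
                entry["ts"],
                tuple(sorted(entry["labels"].items())),
                entry["line"],
            )
            if key not in seen:
                seen.add(key)
                merged.append(entry)
    merged.sort(key=lambda e: e["ts"])
    return merged


a = {"ts": 1, "labels": {"app": "api"}, "line": "started"}
b = {"ts": 2, "labels": {"app": "api"}, "line": "ready"}
c = {"ts": 3, "labels": {"app": "api"}, "line": "request served"}

# The same entries arrive from two ingester replicas and from storage.
result = merge_and_dedupe([b, a], [a], [b, c])
```

With a replication factor of 3, each recent entry can come back up to three times from the ingesters, so this step keeps the query result the same size as the original log stream.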