Components

Distributor

Responsible for:

  • Handles incoming log streams from clients
  • How it works:
    • Each stream is validated for correctness
    • Streams are rate-limited according to per-tenant or global limits
    • Entries are then split into batches
    • Batches are sent to multiple ingesters in parallel; to avoid data loss, each stream is replicated to replication_factor ingesters, and a write is acknowledged once a quorum of them succeeds
      • Logs are distributed by a hash of the tenant ID and label set
      • Ownership is tracked in a hash ring (default: memberlist/gossip, optionally Consul or etcd)
      • Each ingester owns a range of hash values, and streams are routed based on that
  • Stateless
  • Can be scaled independently
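The routing steps above can be sketched with a toy hash ring. This is an illustrative simplification, not Loki's actual implementation; the class and method names are hypothetical:

```python
import hashlib
from bisect import bisect_right

class HashRing:
    """Toy hash ring: each ingester claims several tokens (points) on the ring."""

    def __init__(self, ingesters, tokens_per_ingester=4):
        self.ring = []  # sorted list of (token, ingester)
        for ing in ingesters:
            for i in range(tokens_per_ingester):
                self.ring.append((self._hash(f"{ing}-{i}"), ing))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        # Map any key onto a 32-bit ring position.
        return int(hashlib.sha256(key.encode()).hexdigest(), 16) % (2**32)

    def owners(self, stream_key, replication_factor=3):
        # Walk clockwise from the stream's hash, collecting distinct ingesters.
        h = self._hash(stream_key)
        idx = bisect_right(self.ring, (h, ""))
        distinct = len(set(ing for _, ing in self.ring))
        result = []
        while len(result) < min(replication_factor, distinct):
            _, ing = self.ring[idx % len(self.ring)]
            if ing not in result:
                result.append(ing)
            idx += 1
        return result

ring = HashRing(["ingester-1", "ingester-2", "ingester-3", "ingester-4"])
# A stream is identified by its tenant plus sorted label set.
owners = ring.owners('tenant1/{app="api",env="prod"}', replication_factor=3)
print(owners)  # three distinct ingesters, deterministic for a given stream
```

The distributor sends the batch to all returned owners in parallel and reports success once a quorum acknowledges the write.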

Ingester

  • Responsible for writing to persistent storage
  • Registered in the ring
  • Write process:
    • Data is written in chunks
    • Chunks are held in memory, then compressed and marked as read-only when:
      • The chunk reaches its configured target size (chunk_target_size)
      • Enough time has passed without a write (chunk_idle_period), or the chunk reaches its maximum age (max_chunk_age)
      • A flush is forced - e.g., the system is shutting down
    • Flushed chunks are deduplicated across replicas, since the chunk hash is derived from the tenant, labels, and contents.
  • Out of order writes:
    • Since Loki 2.4, out-of-order logs are supported (unordered_writes: true)
    • Enabled by default — logs can have older timestamps than the last written entry in a given stream
    • Important in distributed environments where logs from different sources arrive with delays
  • Why are chunks important?
    • A chunk is the basic unit of writing — its size and lifetime determine read performance and storage costs
    • Too many small chunks = large index, slow queries, high I/O operation costs
    • Too large chunks = long wait for flush, risk of data loss in memory during failures
    • Goal: chunk should fill up to chunk_target_size (default ~1.5 MB) before max_chunk_age expires
  • Best practices
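The cut-and-flush conditions above can be sketched as a simple decision function. The threshold names mirror Loki's config options (with their documented defaults), but the logic itself is an illustrative simplification, not the ingester's implementation:

```python
import time

# Thresholds named after Loki config options; values are the documented defaults.
CHUNK_TARGET_SIZE = 1_572_864      # ~1.5 MB (chunk_target_size)
CHUNK_IDLE_PERIOD = 30 * 60        # 30 min without writes (chunk_idle_period)
MAX_CHUNK_AGE = 2 * 60 * 60        # 2 h since creation (max_chunk_age)

def should_cut_chunk(size_bytes, created_at, last_write_at,
                     now=None, shutting_down=False):
    """Return True when an in-memory chunk should be marked read-only."""
    now = time.time() if now is None else now
    if shutting_down:                             # forced flush, e.g. shutdown
        return True
    if size_bytes >= CHUNK_TARGET_SIZE:           # chunk is full
        return True
    if now - last_write_at >= CHUNK_IDLE_PERIOD:  # stream went quiet
        return True
    if now - created_at >= MAX_CHUNK_AGE:         # chunk lived too long
        return True
    return False

# A small, recently written chunk stays open:
print(should_cut_chunk(100_000, created_at=0, last_write_at=90, now=100))    # False
# A full chunk is cut immediately:
print(should_cut_chunk(2_000_000, created_at=0, last_write_at=90, now=100))  # True
```

Tuning these three values against each other is exactly the balancing act described above: chunks should fill close to the target size before age or idleness forces a cut.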

Query frontend

  • Optional service for accelerating queries
  • When enabled, queries should go to it rather than directly to the Querier
  • What does it do?
    • Optimizes queries
    • Queues requests
    • Retries large queries
    • Distributes large queries across multiple Queriers
    • Splits large queries into smaller ones and executes them independently
    • Caches queries and results
  • Stateless, but should be run with at least 2 replicas

Querier

  • Executes queries
    • First queries Ingesters for what they have in memory
    • Queries storage
    • Because data is replicated across multiple ingesters, it deduplicates results with the same timestamp, labels, and log line
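The deduplication step can be sketched like this: entries returned by multiple replicas are collapsed when they agree on timestamp, labels, and line. This is an illustrative simplification of the actual merge logic:

```python
def dedupe(entries):
    """Merge entries from several replicas, dropping exact duplicates."""
    seen = set()
    out = []
    for ts, labels, line in sorted(entries):
        key = (ts, labels, line)
        if key not in seen:  # same timestamp + labels + content = duplicate
            seen.add(key)
            out.append((ts, labels, line))
    return out

# Two replicas returning overlapping slices of the same stream:
replica_a = [(1, 'app="api"', "GET /health"), (2, 'app="api"', "POST /login")]
replica_b = [(1, 'app="api"', "GET /health"), (3, 'app="api"', "GET /users")]
merged = dedupe(replica_a + replica_b)
print(len(merged))  # 3 unique entries
```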
