Benchmarking Quickwit vs. Loki
In 2019, Grafana launched Loki, a new log aggregation system, to tackle the challenges commonly faced by teams operating and scaling Elasticsearch:
- High total cost of ownership (TCO)
- Slow indexing
- Difficult to scale to terabytes
- Complex and hard to manage
The consensus was that Elasticsearch wasn't the right solution for efficient log management. Loki aimed to provide a cost-efficient alternative through a minimalistic indexing approach that indexes only metadata (labels), inspired by Prometheus's labeling system, and by leveraging cheap object storage. Humio1 took a similar path with its index-free architecture.
In 2021, we launched Quickwit with the same goal but a different approach. While we also targeted cost-efficiency and used object storage, we kept the inverted index and combined it with columnar storage. Our ultimate goal was to build a search engine on object storage that retains the strengths of traditional search engines while addressing the legitimate complaints of Elasticsearch users.
Today, as Quickwit is gaining traction and being adopted by companies at petabyte scale2, we are often asked about the trade-offs of using Quickwit vs. Loki for log management.
This blog post explores these differences, evaluates their trade-offs, and benchmarks them against a reasonably large log dataset to provide relevant insights. As benchmarking another tool against one's own is very challenging, I strove to present a balanced view with humility, welcoming input from both the Loki and Quickwit communities. I'm convinced that the benchmark results are worth sharing, and I hope you will gain a better understanding of the two engines!
Establishing a Fair Comparison
Quickwit and Loki target the same use case: log search, with a shared goal of cost-effectiveness at terabyte scale and beyond. They achieve this by decoupling computing and storage, optimizing data compression, and maintaining stateless architectures to simplify cluster management.
However, their querying capabilities are substantially different:
- Quickwit offers a simplified query language: it supports keyword, prefix, and phrase queries, but not regex queries.
- Loki supports a broader set of queries, including regex capabilities and log content parsing for field extraction.
The benchmark will thus evaluate a subset of the Loki query language, since Quickwit's query language is less expressive. Given the architectures of both engines, we can already anticipate the performance characteristics we should observe:
- Quickwit builds an inverted index and a columnar store; it should offer faster search and analytics at the expense of higher ingestion costs.
- Loki favors fast ingestion but should exhibit slower query responses, as it is mostly index-free: only labels are used to categorize data into streams.
The main goal of the benchmark is to materialize the trade-offs between Quickwit and Loki on a log dataset.
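To make the difference in query expressiveness concrete, here is a small sketch of comparable queries in both engines. The LogQL expressions follow Loki's documented syntax; the Quickwit lines use its query-string mini-language. The label and field names (`region`, `level`, `message`, `cloud.region`) refer to the benchmark dataset described below, and the snippet is an illustration rather than the exact queries used in the benchmark.

```python
# Illustrative comparison of query expressiveness (not the benchmark queries themselves).

# --- Loki (LogQL): stream selector on labels + line filters / parsers ---
LOKI_KEYWORD = '{region="us-east-2"} |= "queen"'               # keyword line filter
LOKI_REGEX   = '{region="us-east-2"} |~ "que+n"'               # regex line filter (no Quickwit equivalent)
LOKI_PARSED  = '{region="us-east-2"} | json | level="ERROR"'   # parse log content, filter on an extracted field

# --- Quickwit (query-string language): queries on indexed fields ---
QW_KEYWORD = 'message:queen AND cloud.region:us-east-2'        # keyword query
QW_PHRASE  = 'message:"queen bee"'                             # phrase query
# Regex queries and ad-hoc content parsing are not part of Quickwit's query language.
```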
Benchmark setup
For this benchmark, we considered a typical scenario of ingesting hundreds of GBs of logs per day. We used the Elastic Integration Corpus Generator Tool to create our dataset, which is publicly available in our Google Storage bucket. The dataset contains the first 200 files (generated-logs-v1-*), totaling approximately 212.40 GB across 243,527,673 logs.
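If you want to pull the dataset yourself, a minimal sketch along the following lines should work; the bucket name below is a placeholder (the real bucket is linked above), and anonymous read access is assumed.

```python
# Minimal sketch: download the first 200 generated log files from the public GCS bucket.
# BUCKET_NAME is a placeholder; use the bucket linked in the post / benchmarks repository.
from google.cloud import storage

BUCKET_NAME = "quickwit-datasets-placeholder"  # hypothetical name
PREFIX = "generated-logs-v1-"

client = storage.Client.create_anonymous_client()
bucket = client.bucket(BUCKET_NAME)

blobs = sorted(bucket.list_blobs(prefix=PREFIX), key=lambda b: b.name)[:200]
for blob in blobs:
    blob.download_to_filename(blob.name.split("/")[-1])
    print(f"downloaded {blob.name}")
```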
The JSON structure of our logs is as follows:
{
"agent": {
"ephemeral_id": "9d0fd4b2-0cf1-4b9b-9ad1-61e46657134d",
"id": "9d0fd4b2-0cf1-4b9b-9ad1-61e46657134d",
"name": "coldraccoon",
"type": "filebeat",
"version": "8.8.0"
},
"aws.cloudwatch": {
"ingestion_time": "2023-09-17T13:31:04.741Z",
"log_stream": "novachopper"
},
"cloud": {
"region": "us-east-1"
},
"event": {
"dataset": "generic",
"id": "peachmare",
"ingested": "2023-09-17T12:48:00.741424Z"
},
"host": {
"name": "coldraccoon"
},
"input": {
"type": "aws-cloudwatch"
},
"level": "INFO",
"log.file.path": "/var/log/messages/novachopper",
"message": "2023-09-17T13:31:04.741Z Sep 17 13:31:04 ip-187-57-167-52 systemd: jackal fancier hero griffin finger scale fireroar",
"metrics": {
"size": 390145,
"tmin": 68811
},
"process": {
"name": "systemd"
},
"tags": [
"preserve_original_event"
],
"timestamp": 1673247599,
"trace_id": "5161051656584663225"
}
Evaluation Metrics
We assessed several metrics during data ingestion and querying:
- Ingestion time, CPU utilization, index size, peak RAM usage, and object storage file count.
- Query performance measured by latency, CPU time, and the number of GET requests on object storage.
To stick to the log search use case, we focused on the two types of queries commonly used in the Grafana Explore view:
- Fetching the last 100 logs.
- Getting the log volume by log level.
The following table illustrates the benchmarked queries:
Query | Last 100 logs | Log volume per log level |
---|---|---|
Match all logs | [ ] | [x] |
Logs with `queen` | [x] | [x] |
Logs with label `region: us-east-2` | [x] | [x] |
Logs with label `region: us-east-2` and `queen` | [x] | [x] |
Note that the word `queen` is present in 3% of the logs.
Each query was run 10 times on each engine, and we reported the average of each metric in the results section.
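To make these query shapes concrete, below is a rough sketch of how they can be issued against each engine's HTTP API, using Loki's `query_range` endpoint and Quickwit's Elasticsearch-compatible search endpoint. Hosts, ports, index and field names, and the time range are assumptions based on default local setups; the actual benchmark scripts live in the repository linked at the end of this post.

```python
# Rough sketch of the two benchmarked query shapes against both engines.
# Hosts, ports, index/field names, and the time range are assumptions (default local setups).
import requests

LOKI = "http://localhost:3100"
QUICKWIT = "http://localhost:7280"
INDEX = "generated-logs-for-loki"
START, END = "2023-09-17T00:00:00Z", "2023-09-18T00:00:00Z"

# --- Last 100 logs containing "queen" ---
loki_last_100 = requests.get(
    f"{LOKI}/loki/api/v1/query_range",
    params={
        "query": '{region=~".+"} |= "queen"',  # LogQL requires a non-empty stream selector
        "start": START, "end": END,
        "limit": 100, "direction": "backward",
    },
).json()

quickwit_last_100 = requests.post(
    f"{QUICKWIT}/api/v1/_elastic/{INDEX}/_search",
    json={
        "query": {"query_string": {"query": "message:queen"}},
        "size": 100,
        "sort": [{"timestamp": {"order": "desc"}}],
    },
).json()

# --- Log volume by level for logs containing "queen" ---
loki_volume = requests.get(
    f"{LOKI}/loki/api/v1/query_range",
    params={
        "query": 'sum by (level) (count_over_time({region=~".+"} |= "queen" [1h]))',
        "start": START, "end": END, "step": "1h",
    },
).json()

quickwit_volume = requests.post(
    f"{QUICKWIT}/api/v1/_elastic/{INDEX}/_search",
    json={
        "query": {"query_string": {"query": "message:queen"}},
        "size": 0,
        "aggs": {"volume_by_level": {"terms": {"field": "level"}}},
    },
).json()
```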
Engine Configuration Details
Both engines were configured to use Google Cloud Storage, with all internal caching disabled to ensure a fair comparison.
Loki Configuration
Loki 2.9 was used, configured with settings such as `chunk_encoding: snappy` and with ingestion limits removed.
We used Vector to ingest logs into Loki, configuring it with labels for region and log level only, which creates up to 100 distinct streams.
We tried to configure different sets of labels, but it was hard to control the cardinality of this dataset. Each time we declared fields such as `host.name` as labels, the cardinality exploded! We even created buckets of hostnames, available here and here, to limit the cardinality to 1k or 25k label values. However, we found this process impractical and decided to leave the results of these experiments out of this benchmark. Feel free to reach out on Discord if you want to know more about the results; we will certainly publish them on the GitHub repository as well.
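For context, the cardinality of a candidate label set can be estimated directly from the raw files before pointing Vector at Loki. Here is a quick sketch, assuming the generated files sit on local disk as newline-delimited JSON:

```python
# Quick sketch: estimate how many Loki streams a candidate label set would create.
# Assumes the generated files sit on local disk as newline-delimited JSON.
import glob
import json

streams = set()
for path in glob.glob("generated-logs-v1-*"):
    with open(path) as f:
        for line in f:
            doc = json.loads(line)
            # One Loki stream per unique combination of label values.
            streams.add((
                doc["cloud"]["region"],
                doc["level"],
                doc["host"]["name"],  # adding host.name is what makes cardinality explode
            ))

print(f"{len(streams)} distinct streams for (region, level, host.name)")
```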
For reference, we found the following blog posts quite valuable for learning how to configure Loki properly:
Quickwit Configuration
The latest Quickwit build was used; it contains performance optimizations that will be available in the next release in June. We used Quickwit's default node configuration and the following index config for logs:
version: 0.8
index_id: generated-logs-for-loki
doc_mapping:
mode: dynamic
field_mappings:
- name: timestamp
type: datetime
fast: true
input_formats:
- rfc3339
- name: message
type: text
timestamp_field: timestamp
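For completeness, here is a rough sketch of how an index with this config can be created and fed through Quickwit's REST API on a local node; treat the endpoints and payloads as a simplified illustration rather than the exact ingestion path used in the benchmark.

```python
# Simplified illustration: create the index and push documents through Quickwit's REST API.
# Assumes a local Quickwit node listening on the default REST port (7280).
import json
import requests

QUICKWIT = "http://localhost:7280"
INDEX = "generated-logs-for-loki"

# Create the index from the YAML config shown above.
with open("generated-logs-for-loki.yaml", "rb") as f:
    resp = requests.post(
        f"{QUICKWIT}/api/v1/indexes",
        data=f.read(),
        headers={"content-type": "application/yaml"},
    )
resp.raise_for_status()

# Ingest a small batch of newline-delimited JSON documents.
docs = [
    {"timestamp": "2023-09-17T13:31:04Z", "level": "INFO", "message": "jackal fancier hero"},
]
ndjson = "\n".join(json.dumps(d) for d in docs)
requests.post(f"{QUICKWIT}/api/v1/{INDEX}/ingest", data=ndjson).raise_for_status()
```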
Hardware
We used a single `n2-standard-16` instance: an Intel Cascade Lake (x86/64) CPU with 16 vCPUs and 64 GB of memory.
Benchmark results
Ingestion
The table below details the ingestion metrics, where Quickwit exhibits a slower indexing speed but generates fewer files stored on object storage.
Engine | Quickwit | Loki |
---|---|---|
Ingestion time (min) | 123 (+123%) | 55 |
Mean vCPU | 2.2 | 2.75 |
Total CPU time (min) | ~270 (+80%) | ~151 |
Number of files on GCS | 25 | 145,756 (x5,829) |
Bucket size (GiB) | 53 | 55 |
As expected, Quickwit is slower at indexing and consumes about 80% more CPU resources.
However, Quickwit creates significantly fewer files on object storage: 25 files versus roughly 145k for Loki. Note that issuing one GET request per file for those 145k files costs about $0.06 on AWS S3 (GET requests are billed at $0.0004 per 1,000 requests). The low file count is a result of Quickwit's merge strategy, which combines smaller files into fewer, larger ones, optimizing both storage efficiency and retrieval performance. This merging is also part of why Quickwit consumes more CPU.
Finally, Quickwit's index size is comparable to Loki's, so the storage cost of the two engines will be equivalent.
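As a quick back-of-the-envelope check of the GET-request cost mentioned above:

```python
# Back-of-the-envelope: GET-request cost of reading every file once, at the S3 price quoted above.
GET_PRICE_PER_1000 = 0.0004  # USD per 1,000 GET requests
loki_files, quickwit_files = 145_756, 25

print(f"Loki:     ${loki_files / 1000 * GET_PRICE_PER_1000:.4f}")      # ~$0.0583
print(f"Quickwit: ${quickwit_files / 1000 * GET_PRICE_PER_1000:.6f}")  # ~$0.00001
```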
Log queries
The following table showcases the query performance metrics for both engines:
| Query | Metric | Quickwit | Loki |
|---|---|---|---|
| `queen` | Latency (s) | 0.6s | 9.3s (+1,425%) |
| | CPU Time (s) | 2.7s | 146s (+5,270%) |
| | # of GET Requests | 206 | 14,821* (+7,095%) |
| `us-east-2` (label) | Latency (s) | 0.6s | 1.0s (+74%) |
| | CPU Time (s) | 2.7s (+35%) | 2s |
| | # of GET Requests | 211* (+348%) | 47 |
| `us-east-2` (label) and `queen` | Latency (s) | 0.6s | 0.98s (+59%) |
| | CPU Time (s) | 2.8s | 11s (+279%) |
| | # of GET Requests | 255 | 561* (+120%) |
As expected, searching for a keyword across the whole dataset is very costly for Loki, both in CPU time and in GET requests. CPU time also shows the benefit of labels: searching for the keyword within a given label costs Loki +279% CPU time relative to Quickwit, versus +5,270% when scanning the whole dataset. Choosing relevant labels is thus the key configuration parameter for Loki.
When measuring Loki's GET requests, we noticed significant variation (up to 2x) in the number of GET requests for the same queries; we suspect some caching mechanisms remained active, but we did not manage to disable them.
Note that Loki 3.0 recently introduced bloom filters to improve performance on pure "needle in a haystack" queries. However, that feature would not help with `queen` queries: the keyword is present in 3% of the logs and will, with extremely high probability, appear in every chunk, preventing bloom filters from reducing the volume of data to scan3.
Log volume by level queries
The following table showcases the log volume query performance metrics for both engines:
| Query | Metric | Quickwit | Loki |
|---|---|---|---|
| Match all logs | Latency (s) | 2.1s | 90s (x42) |
| | CPU Time (s) | 22s | 1,160s (x51) |
| | # of GET Requests | 88 | 203,665 (x2,313) |
| `queen` | Latency (s) | 0.4s | 565s (x1,412) |
| | CPU Time (s) | 3.2s | 8,713s (x2,723) |
| | # of GET Requests | 132 | 204,622 (x1,549) |
| `us-east-2` (label) | Latency (s) | 0.6s | 4.8s (+685%) |
| | CPU Time (s) | 2.8s | 40s (x13) |
| | # of GET Requests | 211 | 6,163 (x28) |
| `us-east-2` (label) and `queen` | Latency (s) | 0.4s | 28s (x70) |
| | CPU Time (s) | 2.9s | 337s (x115) |
| | # of GET Requests | 176 | 5,596 (x31) |
Quickwit consistently outperforms Loki on the log volume queries, showcasing the benefit of the inverted index combined with columnar storage. More generally, Loki uses a lot of CPU to derive metrics from logs, even when labels are correctly used.
Reproducing the benchmark
To ensure transparency and reproducibility, all resources used in this benchmark are available in our benchmarks repository, and we detailed how to reproduce the benchmark on the dedicated Loki page. We're happy to receive PRs, and we will continue to improve the benchmarks and progressively add other observability engines: Loki 3.0 and OpenSearch are next on the roadmap.
Conclusion
The benchmark reveals that Quickwit, while slower at data ingestion, performs very well in search, especially for analytics-driven queries. This analysis highlights the trade-off between Quickwit's more intensive indexing phase and its efficient querying capabilities. Loki's broader query language enables more diverse use cases, which is a substantial advantage despite its slower query performance on large datasets.
As Grafana continues to enhance Loki with features such as bloom filters and Quickwit expands its query capabilities, the landscape of log search engines will see exciting developments!
Happy testing!
1 Humio was founded in 2016 and is famous for its index-free logging platform. It was acquired by CrowdStrike in 2021.
2 See the release blog post on scaling indexing and search at petabyte scale.
3 You can do the math yourself: 243,527,673 logs / 145,756 chunks ≈ 1,670 logs per chunk on average; the probability that a given chunk contains the word `queen` is 1 - (1 - 0.03)^1,670 ≈ 99.99...%.