Building a log search service for under $7/month

In this blog post, we’ll show you how to build a simple log search service using AWS CDK and Quickwit Lambdas on a budget.

The go-to solution for setting up a search service on AWS is the OpenSearch Service. However, you must be careful, as cloud costs can easily get out of hand:

  • For production environments, the standard cluster setup can cost about $600/month¹. Opting for OpenSearch Serverless further increases the cost to at least $700/month².
  • For staging environments, you generally want to handle a portion of the production load. Achieving this with OpenSearch still costs several hundred dollars per month for basic availability.
  • For development and testing, it is often useful to spin up multiple small instances. The minimal OpenSearch setup (a single t3.small.search node with no replica) costs about $25/month³, but it is slow to instantiate and lacks the resources for beefier tests.

In this post, I will show you how to set up a log search service that:

  • is 100x cheaper than OpenSearch Serverless: it ingests close to 1 billion documents per month (100k every 5 minutes) and serves 30k queries per month for under $7/month.
  • has no fixed costs. You can keep long-running development and testing environments around for a few cents per month.

Today, I will be your guide through the meanders of AWS CDK and Quickwit Lambdas.

A concrete use case for our log search service

To illustrate our point, I went for a simple use case:

  • You have an application that generates JSON event logs and uploads them in batches to S3. To keep things simple, I hacked a small Lambda function to simulate this behavior.
  • You have users expecting a secured HTTP search service on those logs. We expect a low volume of queries on average, a few hundred per day. Request spikes are expected, typically during business hours or at the end of a sales cycle.

Let's see how to set up this service on AWS.

Cloud setup

Here is a simplified view of our search service stack:

[Figure: Serverless Architecture]

Let’s break it down piece by piece.

Event logs source

Event logs are generated by a Lambda, triggered every 5 minutes by an AWS EventBridge Scheduled Rule, which packs 100k JSON events into one file and uploads it to a Staging bucket. The data itself has a very simple structure. Here is an event log sample:

{
  "ts": 1707388485,
  "id": 2048,
  "name": "2048 rabbit",
  "price": 5,
  "quantity": 1,
  "description": "A speedy rabbit with a fluffy tail."
}

The Staging bucket is configured to send the notification that triggers the Quickwit Indexer Lambda each time an object is created by the Sales Data Generator.
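
The batch format described above can be sketched in a few lines of Python. This is a minimal illustration, not the generator shipped in the Quickwit repository: the function names make_event and make_batch, the random price and quantity ranges, and the choice to reuse a single timestamp for the whole batch are all assumptions made for the example. Only the field names and the "100k gzipped JSON events per file, one per line" shape come from this post.

```python
import gzip
import json
import random
import time


def make_event(event_id, now):
    """Build one event with the same fields as the sample above (values are made up)."""
    return {
        "ts": now,
        "id": event_id,
        "name": f"{event_id} rabbit",
        "price": random.randint(1, 10),    # hypothetical value range
        "quantity": random.randint(1, 5),  # hypothetical value range
        "description": "A speedy rabbit with a fluffy tail.",
    }


def make_batch(num_events=100_000, now=None):
    """Serialize events as newline-delimited JSON and gzip the result."""
    now = int(time.time()) if now is None else now
    lines = (json.dumps(make_event(i, now)) for i in range(num_events))
    return gzip.compress("\n".join(lines).encode("utf-8"))
```

The returned bytes could then be uploaded to the Staging bucket, for example with the put_object call of the boto3 S3 client, which in turn would fire the bucket notification described above.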

Quickwit Service

The Quickwit Service in the middle is the central building block. It contains the infrastructure needed to build and query a Quickwit index and is composed of three main resources (the Indexer Lambda, the Index Bucket and the Searcher Lambda):

  • The Indexer Lambda can be invoked to load a gzipped, newline-delimited JSON file from S3 and index it. It writes the generated index splits and associated metadata to the Index Bucket. Due to current limitations of the file-based metastore, its reserved concurrency is set to 1 to guarantee that only one indexer writes to the metastore file at any given time.
  • The Searcher Lambda does not have this limitation, so multiple queries can run in parallel. Note, however, that the Searcher can reuse a cache across consecutive invocations of the same Lambda container, whereas parallel queries end up on different containers. Running queries sequentially will therefore likely reduce both their aggregate duration and the total number of reads from S3.

The Searcher API

The Searcher API uses the AWS API Gateway REST API, a managed service with usage-based pricing. This is perfect for our use case with a low volume of requests.

Our example contains a simplified configuration that mimics the search endpoint of the Quickwit REST API. The generated URL follows the pattern:

https://{api_id}.execute-api.{region}.amazonaws.com/api/v1/mock-sales/search

The endpoint is protected by an API key that you configure when deploying the stack.

Cost estimates

Before running a system in the Cloud, it is always a good idea to get a rough estimate of the associated costs. This example stack generates, indexes and stores a hundred thousand events every 5 minutes. We consider the current pricing in the us-east-1 region of $0.00005 per second for our 3GB RAM Lambda functions. With a few back-of-the-envelope calculations, we get the following estimates:

  • For the Indexer Lambda, you might expect an associated cost on the order of $0.1 per day (300 executions of 5 seconds each).
  • Objects in the staging area expire after 1 day, so you will never have more than 3GB stored there ($0.05 per month).
  • Approximately 1GB of index is created per day. As data accumulates, storage becomes more and more expensive. A month’s worth of historical data (30GB) costs around $0.7 per month.
  • The cost of searches will likely remain small. Most queries complete in less than 1 second, allowing for around 10k queries per $1. For clients with auto-refresh, this threshold may be reached quickly, but in that case, the results will likely be served from the Lambda cache, resolving in under 100ms. In systems with repetitive queries, $1 could cover over 100k queries. For a deeper dive, see our post on Lambda search performance.
  • The data generator Lambda is not really part of the Search setup, but we can estimate its cost as well. It is triggered every 5 minutes, runs for 3 seconds and is a smaller 1GB RAM Lambda. Its cost should not exceed $0.5 per month.

These costs can be summarized in the following table:

Resource                          Cost ($)
Indexing Lambda                   3.00 / month
S3 storage for staging objects    0.05 / month
S3 storage for the index          0.70 / month (30-day retention)
Search Lambda                     0.10 / thousand searches
Data generator Lambda             0.50 / month

With a total budget of $7 per month, we can cover the indexing, the storage and around a thousand queries per day.
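
The arithmetic behind these estimates can be reproduced with a short script. It is a rough sketch that only uses figures quoted in this post: the Lambda price of $0.00005 per second for 3GB, the run durations, and the rounded table values. The helper names (lambda_monthly_cost, queries_per_day) and the 30-day month are assumptions made for the example. Note that the unrounded Lambda costs come out slightly below the table, which rounds up.

```python
# $0.00005 per second for a 3GB Lambda, expressed per GB-second.
PRICE_PER_GB_SECOND = 0.00005 / 3
DAYS_PER_MONTH = 30              # assumption: a 30-day month
RUNS_PER_DAY = 24 * 60 // 5      # one run every 5 minutes = 288 runs


def lambda_monthly_cost(runs_per_day, seconds_per_run, memory_gb):
    """Monthly compute cost of a Lambda, ignoring per-request fees and free tier."""
    gb_seconds_per_day = runs_per_day * seconds_per_run * memory_gb
    return gb_seconds_per_day * PRICE_PER_GB_SECOND * DAYS_PER_MONTH


indexer_raw = lambda_monthly_cost(RUNS_PER_DAY, 5, 3)    # about 2.16, rounded up to 3.00 in the table
generator_raw = lambda_monthly_cost(RUNS_PER_DAY, 3, 1)  # about 0.43, rounded up to 0.50 in the table

# Rounded monthly figures from the table above.
FIXED_COSTS = {
    "indexing_lambda": 3.00,
    "staging_storage": 0.05,
    "index_storage": 0.70,
    "generator_lambda": 0.50,
}
SEARCH_COST_PER_THOUSAND = 0.10


def queries_per_day(budget=7.00):
    """Number of daily queries that fit in the budget left after fixed costs."""
    remaining = budget - sum(FIXED_COSTS.values())
    return remaining / SEARCH_COST_PER_THOUSAND * 1000 / DAYS_PER_MONTH
```

With the table values, the fixed costs add up to $4.25, which leaves $2.75 for searches, or a little over 900 queries per day.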

Deployment

Prerequisites

We use AWS CDK for our infrastructure automation script. Install or upgrade it using npm:

npm install -g aws-cdk@latest

We also use the curl and make commands. For instance, on Debian-based distributions:

sudo apt update && sudo apt install curl make

You also need AWS credentials to be properly configured in your shell. One way is using the credentials file.

Finally, clone the Quickwit repository and install the Python dependencies (Python 3.10 required) in a virtual environment:

git clone https://github.com/quickwit-oss/quickwit.git
cd quickwit/distribution/lambda
python3 -m venv .venv
source .venv/bin/activate
pip install .

Deploy

Configure the AWS region and account id where you want to deploy the example stack:

export CDK_ACCOUNT=123456789012
export CDK_REGION=us-east-1

If this region/account pair was not bootstrapped by CDK yet, run:

make bootstrap

This initializes some basic resources to host artifacts such as Lambda packages.

Everything is ready! You can finally deploy the stack:

export SEARCHER_API_KEY=my-at-least-20-char-long-key 
make deploy-mock-data

If you don’t set SEARCHER_API_KEY, the Searcher API deployment is skipped.

Danger: The API key is stored in plain text in the CDK stack. For a real-world deployment, the key should be fetched from something like AWS Secrets Manager.

Query

Once the CDK deployment is completed, your example stack is up and running. The Sales Data Generator Lambda is going to be triggered every 5 minutes, which in turn will trigger the Indexer Lambda.

Around the end of the deployment logs, you’ll see a list of outputs. One of them is the URL of the search endpoint. Here is an example search query using curl where we look for all documents where the description contains the word "animal":

curl -d '{"query":"description:animal", "max_hits": 10}' \
-H "Content-Type: application/json" \
-H "x-api-key: my-at-least-20-char-long-key" \
-X POST \
https://{api_id}.execute-api.{region}.amazonaws.com/api/v1/mock-sales/search \
--compressed

The index is not created until the first run of the Indexer, so you might need a few minutes before your first search request succeeds. The API Gateway key configuration also takes a minute or two to propagate, so the first requests might receive an authorization error response.

Because the JSON query responses are often quite verbose, the Searcher Lambda always compresses them before sending them over the wire. It is crucial to keep this size low, both to avoid hitting the Lambda payload size limit of 6MB and to limit egress costs of around $0.10/GB. The compression is applied regardless of the accept-encoding request header, which is why the --compressed flag must be passed to curl.
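
The same request can be issued from Python using only the standard library. The sketch below mirrors the curl command above. It assumes that the response is gzip-compressed and flagged with a Content-Encoding: gzip header, which this post does not state explicitly, so adjust the decoding if your deployment behaves differently. The function names are hypothetical.

```python
import gzip
import json
import urllib.request


def build_search_request(endpoint, api_key, query, max_hits=10):
    """Build the POST request sent by the curl example above."""
    body = json.dumps({"query": query, "max_hits": max_hits}).encode("utf-8")
    headers = {"Content-Type": "application/json", "x-api-key": api_key}
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


def decode_body(raw, content_encoding):
    """Decompress the body if needed (assumption: gzip), then parse the JSON."""
    if (content_encoding or "").lower() == "gzip":
        raw = gzip.decompress(raw)
    return json.loads(raw)


def search(endpoint, api_key, query, max_hits=10):
    """Send the query and return the parsed JSON response."""
    request = build_search_request(endpoint, api_key, query, max_hits)
    # urllib does not decompress responses on its own, hence decode_body.
    with urllib.request.urlopen(request, timeout=30) as response:
        return decode_body(response.read(), response.headers.get("Content-Encoding"))
```

Calling search with the endpoint URL from the deployment outputs, your API key and the query "description:animal" should return the same JSON document as the curl command.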

Cleaning up

Once you're done playing with the example stack, it is strongly recommended to delete the associated resources. In the shell where CDK_ACCOUNT, CDK_REGION and your AWS credentials are configured, run:

make destroy-mock-data

If you don’t want to tear down the infrastructure but want to make the costs associated with the stack negligible, you can just stop the source data generator. To do so, open the AWS Console, find the Sales Data Generator Lambda (it should be called something like MockDataStack-SourceMockDataGenerator{some_random_id}), and disable its EventBridge scheduled trigger. Without any data generated, the Indexer Lambda is not triggered either. You only pay a small fee for the S3 storage and for any queries you make on the dataset (both might even stay within your free tier if it isn’t already consumed by another application).

Alternative use cases

Firehose as a source

A very common way to land data on S3 is using AWS Firehose. It serves as a buffer between data sources that emit one or a few events at a time and S3 where manipulating small objects is often inefficient.

Querying without the API Gateway

API Gateway has the benefit of exposing the Lambda function as an HTTP endpoint with custom authentication. When calling the Searcher from another AWS resource, such as a Lambda function or an EC2 instance, it might actually be simpler to call the AWS Lambda invoke API directly using an AWS SDK (e.g. boto3 for Python). This leverages AWS IAM roles for authentication and avoids the intermediate API Gateway layer.
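
A direct invocation with boto3 might look like the sketch below. The invoke call and its FunctionName, InvocationType and Payload parameters are part of the boto3 Lambda client, but the event shape is an assumption: it simply mirrors the JSON body used with the HTTP endpoint, and the Searcher Lambda may expect a different format. The function name passed in is whatever CDK generated for your stack. The response is returned as raw bytes because its encoding is not specified here.

```python
import json


def invoke_searcher(function_name, query, max_hits=10, client=None):
    """Invoke the Searcher Lambda synchronously and return (status code, raw payload)."""
    if client is None:
        import boto3  # third-party dependency, only needed for real calls

        client = boto3.client("lambda")
    # Assumption: the event mirrors the body of the HTTP search request.
    payload = json.dumps({"query": query, "max_hits": max_hits}).encode("utf-8")
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the result
        Payload=payload,
    )
    return response["StatusCode"], response["Payload"].read()
```

The optional client argument makes it possible to pass a preconfigured boto3 client (for a specific region or role) or a stub in tests.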

Possible improvements

Quickwit Lambda is still in beta, and several improvements could still be made:

  • The current indexer does not clean up the splits that are marked for deletion after a merge.
  • Merges run in the background of the indexer and sometimes don't have enough time to complete. We could optimize the merge execution strategy to decrease the split fragmentation without increasing the cost of the indexer.
  • The checkpoint list of ingested files is cleaned up entirely every 100 files (configurable). On very rare occasions, if the indexer Lambda receives a duplicated S3 notification right after this pruning operation, the same file might be ingested twice. We could improve the checkpointing mechanism to avoid this.
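
The duplicate-ingestion edge case from the last point can be illustrated with a toy model. This is not Quickwit's checkpoint implementation: the FileCheckpoint class, its should_ingest method and the prune_every parameter are hypothetical names used only to show why clearing the whole list opens a small window for duplicates.

```python
class FileCheckpoint:
    """Toy checkpoint that forgets everything once it holds `prune_every` files."""

    def __init__(self, prune_every=100):
        self.prune_every = prune_every
        self.seen = set()

    def should_ingest(self, key):
        """Return True if the file looks new, False if it is a known duplicate."""
        if key in self.seen:
            return False
        self.seen.add(key)
        if len(self.seen) >= self.prune_every:
            # Full cleanup: the file that triggered it is forgotten as well.
            self.seen.clear()
        return True
```

In this model, a notification replayed for the file that triggered the cleanup is accepted a second time, while a replay that arrives before the cleanup is rejected as expected.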

If you are interested in any of these features or other ones, join us on Discord and share your use cases with us!


  1. From the AWS OpenSearch pricing page, the recommended r6g.large.search costs $0.167 per hour and the configurator recommends 3 master nodes and 2 data nodes.
  2. From the AWS OpenSearch pricing page, the OCU is priced at $0.24 per OCU per hour and there is a minimum of 4 OCUs per domain.
  3. From the AWS OpenSearch pricing page, a t3.small.search costs $0.036 per hour.