Version: 0.6.5

Node configuration

The node configuration allows you to customize and optimize the settings for individual nodes in your cluster. It is divided into several sections:

Common configuration settings: shared top-level properties
Storage settings: defined in the storage section
Metastore settings: defined in the metastore section
Ingest settings: defined in the ingest_api section
Indexer settings: defined in the indexer section
Searcher settings: defined in the searcher section
Jaeger settings: defined in the jaeger section

A commented example is available here: quickwit.yaml.

Common configuration

Property	Description	Env variable	Default value
`version`	Config file version. `0.6` is the only available value; it is backward compatible with `0.5` and `0.4`.
`cluster_id`	Unique identifier of the cluster the node will be joining. Clusters sharing the same network should use distinct cluster IDs.	`QW_CLUSTER_ID`	`quickwit-default-cluster`
`node_id`	Unique identifier of the node. It must be distinct from the node IDs of its cluster peers. Defaults to the instance's short hostname if not set.	`QW_NODE_ID`	short hostname
`enabled_services`	Enabled services (control_plane, indexer, janitor, metastore, searcher)	`QW_ENABLED_SERVICES`	all services
`listen_address`	The IP address or hostname that the Quickwit service binds to for its REST and gRPC servers and for connecting this node to other nodes. By default, Quickwit binds to 127.0.0.1 (localhost). This default is not valid when trying to form a cluster.	`QW_LISTEN_ADDRESS`	`127.0.0.1`
`advertise_address`	IP address advertised by the node, i.e. the IP address that peer nodes should use to connect to the node for RPCs.	`QW_ADVERTISE_ADDRESS`	`listen_address`
`rest_listen_port`	The port on which to listen for the HTTP REST API.	`QW_REST_LISTEN_PORT`	`7280`
`gossip_listen_port`	The port on which to listen for the Gossip cluster membership service (UDP).	`QW_GOSSIP_LISTEN_PORT`	`rest_listen_port`
`grpc_listen_port`	The port on which to listen for the gRPC service.	`QW_GRPC_LISTEN_PORT`	`rest_listen_port + 1`
`peer_seeds`	List of IP addresses or hostnames used to bootstrap the cluster and discover the complete set of nodes. This list may contain the current node address and does not need to be exhaustive.	`QW_PEER_SEEDS`
`data_dir`	Path to the directory where data (temporary data, splits kept for caching purposes) is persisted. This is mostly used in indexing.	`QW_DATA_DIR`	`./qwdata`
`metastore_uri`	Metastore URI. Can be a local directory or `s3://my-bucket/indexes` or `postgres://username:password@localhost:5432/metastore`. Learn more about the metastore configuration.	`QW_METASTORE_URI`	`{data_dir}/indexes`
`default_index_root_uri`	Default index root URI that defines the location where index data (splits) is stored. The index URI is built following the scheme: `{default_index_root_uri}/{index-id}`	`QW_DEFAULT_INDEX_ROOT_URI`	`{data_dir}/indexes`
`rest_cors_allow_origins`	Configures the CORS origins that are allowed to access the API. Read more in the CORS description further down this page.
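
As an illustration, the properties above could be combined as follows for a node that joins a multi-node cluster. This is a minimal sketch rather than a recommended production setup: the cluster ID, node ID, IP address, hostnames, and bucket name are hypothetical placeholders.

```yaml
# Hypothetical configuration for one node of a multi-node cluster.
version: 0.6
cluster_id: my-cluster            # placeholder; identifies the cluster this node joins
node_id: node-1                   # placeholder; must differ from the node IDs of its peers
listen_address: 0.0.0.0           # the 127.0.0.1 default is not valid when forming a cluster
advertise_address: 10.0.0.12      # placeholder; the address peers use to reach this node
rest_listen_port: 7280            # default; gossip then defaults to 7280 (UDP), gRPC to 7281
peer_seeds:                       # placeholders; the list does not need to be exhaustive
  - quickwit-node-0.example.com
  - quickwit-node-1.example.com
data_dir: ./qwdata
metastore_uri: s3://my-bucket/indexes
default_index_root_uri: s3://my-bucket/indexes
```

Properties that are left out fall back to the defaults listed in the table.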

Storage configuration

Please refer to the dedicated storage configuration page to learn more about configuring Quickwit for various storage providers.

Here are some minimal examples of how to configure Quickwit with Amazon S3 or Alibaba OSS. In these examples, the credentials are provided through environment variables:

AWS_ACCESS_KEY_ID=<your access key ID>
AWS_SECRET_ACCESS_KEY=<your secret access key>

Amazon S3

storage:
  s3:
    region: us-east-1

Alibaba OSS

storage:
  s3:
    region: us-east-1
    endpoint: https://oss-us-east-1.aliyuncs.com

Metastore configuration

This section may contain one configuration subsection per available metastore implementation. The specific configuration parameters for each implementation may vary. Currently, the available metastore implementations are:

File-backed
PostgreSQL

File-backed metastore configuration

Property	Description	Default value
`polling_interval`	Time interval between successive polling attempts to detect metastore changes.	`30s`

Example of a metastore configuration for a file-backed implementation in YAML format:

metastore:
  file:
    polling_interval: 1m

PostgreSQL metastore configuration

Property	Description	Default value
`max_num_connections`	Determines the maximum number of concurrent connections to the database server.	`10`

Example of a metastore configuration for PostgreSQL in YAML format:

metastore:
  postgres:
    max_num_connections: 50

Indexer configuration

This section contains the configuration options for an indexer. The split store is documented in the indexing document.

Property	Description	Default value
`split_store_max_num_bytes`	Maximum size in bytes allowed in the split store for each index-source pair.	`100G`
`split_store_max_num_splits`	Maximum number of files allowed in the split store for each index-source pair.	`1000`
`max_concurrent_split_uploads`	Maximum number of concurrent split uploads allowed on the node.	`12`
`enable_otlp_endpoint`	If true, enables the OpenTelemetry exporter endpoint to ingest logs and traces via the OpenTelemetry Protocol (OTLP).	`false`

Example:

indexer:
  split_store_max_num_bytes: 100G
  split_store_max_num_splits: 1000
  max_concurrent_split_uploads: 12
  enable_otlp_endpoint: true

Ingest API configuration

Property	Description	Default value
`max_queue_memory_usage`	Maximum size in bytes of the in-memory Ingest queue.	`2GiB`
`max_queue_disk_usage`	Maximum disk space in bytes taken by the Ingest queue. This is typically higher than the maximum in-memory queue size.	`4GiB`

Example:

ingest_api:
  max_queue_memory_usage: 2GiB
  max_queue_disk_usage: 4GiB

Searcher configuration

This section contains the configuration options for a Searcher.

Property	Description	Default value
`aggregation_memory_limit`	Controls the maximum amount of memory that can be used for aggregations before the request is aborted. The limit applies per request and per leaf query (a leaf query searches one or multiple splits concurrently). It prevents excessive memory usage during the aggregation phase, which can lead to performance degradation or crashes. Because the limit is per request, concurrent requests can together exceed it.	`500M`
`aggregation_bucket_limit`	Determines the maximum number of buckets returned to the client.	`65000`
`fast_field_cache_capacity`	Fast field cache capacity on a Searcher. If you filter by dates, run aggregations or range queries, use the search stream API, or use Quickwit for tracing, it might be worth increasing this parameter. The metrics starting with `quickwit_cache_fastfields_cache` can help you make an informed choice when setting this value.	`1G`
`split_footer_cache_capacity`	Split footer cache (it is essentially the hotcache) capacity on a Searcher.	`500M`
`partial_request_cache_capacity`	Partial request cache capacity on a Searcher. Cache intermediate state for a request, possibly making subsequent requests faster. It can be disabled by setting the size to `0`.	`64M`
`max_num_concurrent_split_searches`	Maximum number of concurrent split search requests running on a Searcher.	`100`
`max_num_concurrent_split_streams`	Maximum number of concurrent split stream requests running on a Searcher.	`100`

Example:

searcher:
  fast_field_cache_capacity: 1G
  split_footer_cache_capacity: 500M
  partial_request_cache_capacity: 64M
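
The aggregation and concurrency limits from the table are set in the same section. The sketch below simply spells out the default values listed above, as a starting point to adjust; it is not a tuning recommendation.

```yaml
searcher:
  aggregation_memory_limit: 500M           # applies per request and per leaf query
  aggregation_bucket_limit: 65000          # maximum number of buckets returned to the client
  max_num_concurrent_split_searches: 100
  max_num_concurrent_split_streams: 100
```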

Jaeger configuration

Property	Description	Default value
`enable_endpoint`	If true, enables the gRPC endpoint that allows the Jaeger Query Service to connect and retrieve traces.	`false`

Example:

jaeger:
  enable_endpoint: true

Using environment variables in the configuration

You can use environment variable references in the config file to set values that need to be configurable during deployment. To do this, use:

${VAR_NAME}

where VAR_NAME is the name of the environment variable.

Each variable reference is replaced at startup by the value of the environment variable. The replacement is case-sensitive and occurs before the configuration file is parsed. Referencing undefined variables throws an error unless you specify a default value or custom error text.

To specify a default value, use:

${VAR_NAME:-default_value}

where default_value is the value to use if the environment variable is unset.

<config_field>: ${VAR_NAME}
or
<config_field>: ${VAR_NAME:-default value}

For example:

export QW_LISTEN_ADDRESS=0.0.0.0

# config.yaml
version: 0.6
cluster_id: quickwit-cluster
node_id: my-unique-node-id
listen_address: ${QW_LISTEN_ADDRESS}
rest_listen_port: ${QW_LISTEN_PORT:-1111}

Because `QW_LISTEN_PORT` is not set, its default value applies, and Quickwit interprets the file as:

version: 0.6
cluster_id: quickwit-cluster
node_id: my-unique-node-id
listen_address: 0.0.0.0
rest_listen_port: 1111
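
Variable references can also help keep credentials out of the configuration file. The sketch below assumes that references embedded inside a longer value are substituted in the same way as standalone ones, which the before-parsing replacement described above suggests but which is worth verifying on your version. `PG_USER`, `PG_PASSWORD`, and `PG_HOST` are hypothetical variable names.

```yaml
# config.yaml
version: 0.6
# PG_USER, PG_PASSWORD, and PG_HOST are hypothetical variable names.
# PG_HOST falls back to localhost when unset. The other two have no default,
# so leaving them undefined results in an error at startup.
metastore_uri: postgres://${PG_USER}:${PG_PASSWORD}@${PG_HOST:-localhost}:5432/metastore
```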

Configuring CORS (Cross-origin resource sharing)

CORS (Cross-origin resource sharing) describes which addresses or origins can access the REST API from the browser. By default, sharing resources cross-origin is not allowed.

A wildcard, single origin, or multiple origins can be specified as part of the rest_cors_allow_origins parameter:

version: 0.6

rest_cors_allow_origins: '*'                                 # Allow all origins
# rest_cors_allow_origins: https://my-hdfs-logs.domain.com   # Optionally we can specify one domain
# rest_cors_allow_origins:                                   # Or allow multiple origins
#   - https://my-hdfs-logs.domain.com
#   - https://my-hdfs.other-domain.com
