Version: 0.6.5

Node configuration

The node configuration allows you to customize and optimize the settings for individual nodes in your cluster. It is divided into several sections:

  • Common configuration settings: shared top-level properties
  • Storage settings: defined in the storage section
  • Metastore settings: defined in the metastore section
  • Ingest settings: defined in the ingest_api section
  • Indexer settings: defined in the indexer section
  • Searcher settings: defined in the searcher section
  • Jaeger settings: defined in the jaeger section

A commented example is available here: quickwit.yaml.

Common configuration

| Property | Description | Env variable | Default value |
| --- | --- | --- | --- |
| version | Config file version. 0.6 is the only available value, with backward compatibility for 0.5 and 0.4. | | |
| cluster_id | Unique identifier of the cluster the node will be joining. Clusters sharing the same network should use distinct cluster IDs. | QW_CLUSTER_ID | quickwit-default-cluster |
| node_id | Unique identifier of the node. It must be distinct from the node IDs of its cluster peers. Defaults to the instance's short hostname if not set. | QW_NODE_ID | short hostname |
| enabled_services | Enabled services (control_plane, indexer, janitor, metastore, searcher). | QW_ENABLED_SERVICES | all services |
| listen_address | The IP address or hostname that the Quickwit service binds to for starting the REST and gRPC servers and for connecting this node to other nodes. By default, Quickwit binds to 127.0.0.1 (localhost). This default is not valid when trying to form a cluster. | QW_LISTEN_ADDRESS | 127.0.0.1 |
| advertise_address | IP address advertised by the node, i.e. the IP address that peer nodes should use to connect to the node for RPCs. | QW_ADVERTISE_ADDRESS | listen_address |
| rest_listen_port | The port on which to listen for the HTTP REST API. | QW_REST_LISTEN_PORT | 7280 |
| gossip_listen_port | The port on which to listen for the Gossip cluster membership service (UDP). | QW_GOSSIP_LISTEN_PORT | rest_listen_port |
| grpc_listen_port | The port on which to listen for the gRPC service. | QW_GRPC_LISTEN_PORT | rest_listen_port + 1 |
| peer_seeds | List of IP addresses or hostnames used to bootstrap the cluster and discover the complete set of nodes. This list may contain the current node address and does not need to be exhaustive. | QW_PEER_SEEDS | |
| data_dir | Path to the directory where data (temporary data, splits kept for caching purposes) is persisted. This is mostly used in indexing. | QW_DATA_DIR | ./qwdata |
| metastore_uri | Metastore URI. Can be a local directory, s3://my-bucket/indexes, or postgres://username:password@localhost:5432/metastore. Learn more about the metastore configuration. | QW_METASTORE_URI | {data_dir}/indexes |
| default_index_root_uri | Default index root URI that defines the location where index data (splits) is stored. The index URI is built following the scheme: {default_index_root_uri}/{index-id} | QW_DEFAULT_INDEX_ROOT_URI | {data_dir}/indexes |
| rest_cors_allow_origins | Configures the CORS origins that are allowed to access the API. See the section on configuring CORS. | | |
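
To make the interplay between these properties concrete, here is a minimal sketch of a config for one node of a small cluster. The cluster ID, node ID, IP address, hostnames, data directory, and bucket name are all hypothetical, and writing peer_seeds as a YAML list is an assumption about the accepted syntax.

```yaml
# Sketch only: every concrete value below is a placeholder.
version: 0.6
cluster_id: my-cluster
node_id: node-1
# Bind to all interfaces; the 127.0.0.1 default cannot be used to form a cluster.
listen_address: 0.0.0.0
# Address that peer nodes should use to reach this node for RPCs.
advertise_address: 10.0.0.11
rest_listen_port: 7280
# gossip_listen_port is left unset: it defaults to rest_listen_port (7280, UDP).
# grpc_listen_port is left unset: it defaults to rest_listen_port + 1 (7281).
peer_seeds:
  - node-1.example.internal
  - node-2.example.internal
data_dir: /var/lib/quickwit
metastore_uri: s3://my-bucket/indexes
default_index_root_uri: s3://my-bucket/indexes
```

With these values, an index with ID my-index would be stored under s3://my-bucket/indexes/my-index, following the {default_index_root_uri}/{index-id} scheme.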

Storage configuration

Please refer to the dedicated storage configuration page to learn more about configuring Quickwit for various storage providers.

Here are also some minimal examples of how to configure Quickwit with Amazon S3 or Alibaba OSS:

AWS_ACCESS_KEY_ID=<your access key ID>
AWS_SECRET_ACCESS_KEY=<your secret access key>

Amazon S3

storage:
  s3:
    region: us-east-1

Alibaba

storage:
  s3:
    region: us-east-1
    endpoint: https://oss-us-east-1.aliyuncs.com

Metastore configuration

This section may contain one configuration subsection per available metastore implementation. The specific configuration parameters for each implementation may vary. Currently, the available metastore implementations are:

  • File-backed
  • PostgreSQL

File-backed metastore configuration

| Property | Description | Default value |
| --- | --- | --- |
| polling_interval | Time interval between successive polling attempts to detect metastore changes. | 30s |

Example of a metastore configuration for a file-backed implementation in YAML format:

metastore:
  file:
    polling_interval: 1m

PostgreSQL metastore configuration

| Property | Description | Default value |
| --- | --- | --- |
| max_num_connections | Determines the maximum number of concurrent connections to the database server. | 10 |

Example of a metastore configuration for PostgreSQL in YAML format:

metastore:
  postgres:
    max_num_connections: 50
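
The metastore section only tunes an implementation. As a sketch, the fragment below pairs it with the top-level metastore_uri property, on the assumption that the URI scheme is what selects the PostgreSQL implementation. The URI is the placeholder from the common configuration table, not a working connection string.

```yaml
version: 0.6
# Placeholder credentials and host, copied from the common configuration table.
metastore_uri: postgres://username:password@localhost:5432/metastore

metastore:
  postgres:
    # Raise the pool size from the default of 10.
    max_num_connections: 50
```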

Indexer configuration

This section contains the configuration options for an indexer. The split store is documented in the indexing document.

| Property | Description | Default value |
| --- | --- | --- |
| split_store_max_num_bytes | Maximum size in bytes allowed in the split store for each index-source pair. | 100G |
| split_store_max_num_splits | Maximum number of files allowed in the split store for each index-source pair. | 1000 |
| max_concurrent_split_uploads | Maximum number of concurrent split uploads allowed on the node. | 12 |
| enable_otlp_endpoint | If true, enables the OpenTelemetry exporter endpoint to ingest logs and traces via the OpenTelemetry Protocol (OTLP). | false |

Example:

indexer:
  split_store_max_num_bytes: 100G
  split_store_max_num_splits: 1000
  max_concurrent_split_uploads: 12
  enable_otlp_endpoint: true

Ingest API configuration

| Property | Description | Default value |
| --- | --- | --- |
| max_queue_memory_usage | Maximum size in bytes of the in-memory Ingest queue. | 2GiB |
| max_queue_disk_usage | Maximum disk space in bytes taken by the Ingest queue. This is typically higher than the maximum in-memory queue size. | 4GiB |

Example:

ingest_api:
  max_queue_memory_usage: 2GiB
  max_queue_disk_usage: 4GiB

Searcher configuration

This section contains the configuration options for a Searcher.

| Property | Description | Default value |
| --- | --- | --- |
| aggregation_memory_limit | Controls the maximum amount of memory that can be used for aggregations before aborting. This limit applies per request and per single leaf query (a leaf query queries one or multiple splits concurrently). It is used to prevent excessive memory usage during the aggregation phase, which can lead to performance degradation or crashes. Since it is per request, concurrent requests can together exceed the limit. | 500M |
| aggregation_bucket_limit | Determines the maximum number of buckets returned to the client. | 65000 |
| fast_field_cache_capacity | Fast field cache capacity on a Searcher. If you filter by dates, run aggregations or range queries, use the search stream API, or use Quickwit for tracing, it might be worth increasing this parameter. The metrics starting with quickwit_cache_fastfields_cache can help you make an informed choice when setting this value. | 1G |
| split_footer_cache_capacity | Split footer cache (essentially the hotcache) capacity on a Searcher. | 500M |
| partial_request_cache_capacity | Partial request cache capacity on a Searcher. Caches intermediate state for a request, possibly making subsequent requests faster. It can be disabled by setting the size to 0. | 64M |
| max_num_concurrent_split_searches | Maximum number of concurrent split search requests running on a Searcher. | 100 |
| max_num_concurrent_split_streams | Maximum number of concurrent split stream requests running on a Searcher. | 100 |

Example:

searcher:
  fast_field_cache_capacity: 1G
  split_footer_cache_capacity: 500M
  partial_request_cache_capacity: 64M
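
The example above covers only the cache capacities. As a complementary sketch, the remaining searcher properties can be set the same way; the values shown are simply the defaults from the table, written out explicitly.

```yaml
searcher:
  # Per request and per leaf query, so concurrent requests can together exceed it.
  aggregation_memory_limit: 500M
  # Upper bound on the number of buckets returned to the client.
  aggregation_bucket_limit: 65000
  # Concurrency caps for split search and split stream requests.
  max_num_concurrent_split_searches: 100
  max_num_concurrent_split_streams: 100
```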

Jaeger configuration

| Property | Description | Default value |
| --- | --- | --- |
| enable_endpoint | If true, enables the gRPC endpoint that allows the Jaeger Query Service to connect and retrieve traces. | false |

Example:

jaeger:
  enable_endpoint: true
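
For a tracing setup, the two endpoint flags described on this page would plausibly be enabled together: one to ingest traces over OTLP, the other to let the Jaeger Query Service read them back. This is a sketch, assuming the Jaeger setting lives under the jaeger section as listed at the top of this page.

```yaml
indexer:
  # Accept logs and traces sent via the OpenTelemetry Protocol (OTLP).
  enable_otlp_endpoint: true

jaeger:
  # Expose the gRPC endpoint that the Jaeger Query Service connects to.
  enable_endpoint: true
```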

Using environment variables in the configuration

You can use environment variable references in the config file to set values that need to be configurable during deployment. To do this, use:

${VAR_NAME}

where VAR_NAME is the name of the environment variable.

Each variable reference is replaced at startup by the value of the environment variable. The replacement is case-sensitive and occurs before the configuration file is parsed. Referencing undefined variables throws an error unless you specify a default value or custom error text.

To specify a default value, use:

${VAR_NAME:-default_value}

where default_value is the value to use if the environment variable is unset.

<config_field>: ${VAR_NAME}
or
<config_field>: ${VAR_NAME:-default_value}

For example:

export QW_LISTEN_ADDRESS=0.0.0.0
# config.yaml
version: 0.6
cluster_id: quickwit-cluster
node_id: my-unique-node-id
listen_address: ${QW_LISTEN_ADDRESS}
rest_listen_port: ${QW_LISTEN_PORT:-1111}

This is interpreted by Quickwit as:

version: 0.6
cluster_id: quickwit-cluster
node_id: my-unique-node-id
listen_address: 0.0.0.0
rest_listen_port: 1111
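
As a further sketch, the fragment below contrasts references that have a default with one that does not. The DEPLOY_* variable names are hypothetical and are not variables that Quickwit defines.

```yaml
version: 0.6
# Fall back to the documented defaults when the variables are unset.
cluster_id: ${DEPLOY_CLUSTER_ID:-quickwit-default-cluster}
data_dir: ${DEPLOY_DATA_DIR:-./qwdata}
# No default here: per the rule above, an unset DEPLOY_METASTORE_URI
# is an undefined reference and should cause an error.
metastore_uri: ${DEPLOY_METASTORE_URI}
```

Leaving out the default for a value that has no safe fallback, such as the metastore location, makes a misconfigured deployment fail with an error instead of silently using a local path.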

Configuring CORS (Cross-origin resource sharing)

CORS (Cross-origin resource sharing) describes which addresses or origins can access the REST API from the browser. By default, sharing resources cross-origin is not allowed.

A wildcard, single origin, or multiple origins can be specified as part of the rest_cors_allow_origins parameter:

version: 0.6

rest_cors_allow_origins: '*' # Allow all origins
# rest_cors_allow_origins: https://my-hdfs-logs.domain.com # Optionally we can specify one domain
# rest_cors_allow_origins: # Or allow multiple origins
#   - https://my-hdfs-logs.domain.com
#   - https://my-hdfs.other-domain.com