Version: main branch

REST API

API version

All the API endpoints start with the api/v1/ prefix. v1 indicates that we are currently using version 1 of the API.

OpenAPI specification

The OpenAPI specification of the REST API is available at /openapi.json and a Swagger UI version is available at /ui/api-playground.

Parameters

Parameters passed in the URL must be properly URL-encoded, using the UTF-8 encoding for non-ASCII characters.

GET [..]/search?query=barack%20obama
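
As a minimal sketch of the encoding rule above, Python's standard `urllib.parse` module can build the query string. The helper name `search_url`, the node address, and the `wikipedia` index id are illustrative assumptions; only the `api/v1/<index id>/search` path and the `query` parameter come from this page.

```python
from urllib.parse import quote, urlencode


def search_url(base_url: str, index_id: str, **params: str) -> str:
    """Build a search URL whose query parameters are percent-encoded as UTF-8.

    `base_url` is a hypothetical node address such as http://localhost:7280.
    """
    # quote_via=quote encodes spaces as %20 (urlencode's default would emit "+").
    query_string = urlencode(params, quote_via=quote)
    return f"{base_url}/api/v1/{index_id}/search?{query_string}"


print(search_url("http://localhost:7280", "wikipedia", query="barack obama"))
```

Non-ASCII characters go through the same path: `query="café"` is sent as `caf%C3%A9`.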

Error handling

Successful requests return a 2xx HTTP status code.

Failed requests return a 4xx HTTP status code. The response body of failed requests holds a JSON object containing a message field that describes the error.

{
"message": "Failed to parse query"
}
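
A client can apply this convention with a small helper such as the sketch below. The function name `error_message` is hypothetical, and the fallback to the raw body for non-JSON responses is an assumption rather than documented behavior.

```python
import json
from typing import Optional


def error_message(status_code: int, body: str) -> Optional[str]:
    """Return the error description of a failed request, or None for a 2xx status."""
    if 200 <= status_code < 300:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Assumption: if the body is not JSON, surface it unchanged.
        return body
    if isinstance(payload, dict) and "message" in payload:
        return str(payload["message"])
    return body


print(error_message(400, '{"message": "Failed to parse query"}'))
```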

Search API

Search in an index

Search for documents matching a query in the given index api/v1/<index id>/search. This endpoint is available as long as you have at least one node running a searcher service in the cluster. The search endpoint accepts GET and POST requests. The parameters are URL parameters for GET requests or JSON key-value pairs for POST requests.

GET api/v1/<index id>/search?query=searchterm
POST api/v1/<index id>/search
{
"query": "searchterm"
}

Path variable

| Variable | Description |
| --- | --- |
| index id | The index id |

Parameters

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| query | String | Query text. See the query language doc | required |
| start_timestamp | i64 | If set, restrict search to documents with a timestamp >= start_timestamp, taking advantage of potential time pruning opportunities. The value must be in seconds. | |
| end_timestamp | i64 | If set, restrict search to documents with a timestamp < end_timestamp, taking advantage of potential time pruning opportunities. The value must be in seconds. | |
| start_offset | Integer | Number of documents to skip | 0 |
| max_hits | Integer | Maximum number of hits to return (by default 20) | 20 |
| search_field | [String] | Fields to search on if no field name is specified in the query. Comma-separated list, e.g. "field1,field2" | index_config.search_settings.default_search_fields |
| snippet_fields | [String] | Fields to extract snippet on. Comma-separated list, e.g. "field1,field2" | |
| sort_by | [String] | Fields to sort the query results on. You can sort by one or two fast fields or by BM25 _score (requires fieldnorms). By default, hits are sorted in reverse order of their document ID (to show recent events first). | |
| format | Enum | The output format. Allowed values are "json" or "pretty_json" | pretty_json |
| aggs | JSON | The aggregations request. See the aggregations doc for supported aggregations. | |

info

The start_timestamp and end_timestamp should be specified in seconds regardless of the timestamp field precision.
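
The sketch below shows one way to produce a POST body whose time bounds are expressed in seconds, as the note above requires. The helper names `to_epoch_seconds` and `search_body` are hypothetical; the JSON keys are the parameters listed in the table.

```python
import json
from datetime import datetime, timezone


def to_epoch_seconds(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole seconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(moment.timestamp())


def search_body(query: str, start: datetime, end: datetime, max_hits: int = 20) -> str:
    """Serialize a JSON body for POST api/v1/<index id>/search."""
    return json.dumps(
        {
            "query": query,
            "start_timestamp": to_epoch_seconds(start),  # inclusive lower bound
            "end_timestamp": to_epoch_seconds(end),  # exclusive upper bound
            "max_hits": max_hits,
        }
    )


print(
    search_body(
        "body:error",
        datetime(2022, 11, 29, tzinfo=timezone.utc),
        datetime(2022, 11, 30, tzinfo=timezone.utc),
    )
)
```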

Response

The response is a JSON object, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| hits | Results of the query | [hit] |
| num_hits | Total number of matches | number |
| elapsed_time_micros | Processing time of the query | number |

Search multiple indices

Search APIs that accept an index id path parameter also support multi-target syntax.

Multi-target syntax

In multi-target syntax, you can run a request on multiple indices by passing a list separated by commas (or by the URL-encoded form '%2C'): test1,test2,test3. You can also use glob-like wildcard ( * ) expressions to target indices that match a pattern: test* or *test or te*t or *test*.

The multi-target expression is subject to the following constraints:

- It must follow the regex `^[a-zA-Z\*][a-zA-Z0-9-_\.\*]{0,254}$`.
- It cannot contain consecutive asterisks (`*`).
- If it contains an asterisk (`*`), the length must be greater than or equal to 3 characters.
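
The three constraints can be checked on the client side before sending a request. The sketch below is illustrative only: the function names are hypothetical, the hyphen in the character class is escaped for readability, and it assumes that each comma-separated element is validated on its own (the regex does not allow commas).

```python
import re

# Same character rules as the regex above, applied with fullmatch().
_INDEX_PATTERN = re.compile(r"[a-zA-Z*][a-zA-Z0-9\-_.*]{0,254}")


def is_valid_index_pattern(pattern: str) -> bool:
    """Check a single index id or wildcard pattern against the three constraints."""
    if not _INDEX_PATTERN.fullmatch(pattern):
        return False
    if "**" in pattern:  # no consecutive asterisks
        return False
    if "*" in pattern and len(pattern) < 3:  # wildcard patterns need 3+ characters
        return False
    return True


def is_valid_multi_target(expression: str) -> bool:
    """Assumption: every comma-separated element must be valid individually."""
    return all(is_valid_index_pattern(part) for part in expression.split(","))


for candidate in ("test1,test2,test3", "stackoverflow*", "te**t", "a*"):
    print(candidate, is_valid_multi_target(candidate))
```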

Examples

POST api/v1/stackoverflow-000001,stackoverflow-000002/search
{
"query": "search AND engine"
}
POST api/v1/stackoverflow*/search
{
"query": "search AND engine"
}

Search stream in an index

GET api/v1/<index id>/search/stream?query=searchterm&fast_field=my_id

Streams field values from ALL documents matching a search query in the target index <index id>, in a specified output format among the following:

  • CSV
  • ClickHouse RowBinary. If partition_by_field is set, Quickwit returns chunks of data for each partition field value. Each chunk starts with a 16-byte header holding the partition value and the content length, followed by the fast_field values in RowBinary format.

fast_field and partition_by_field must be fast fields of type i64 or u64.

This endpoint is available as long as you have at least one node running a searcher service in the cluster.

note

The endpoint will return 10 million values if 10 million documents match the query. This is expected: the endpoint is designed to support queries matching millions of documents and to return field values in a reasonable response time.

Path variable

| Variable | Description |
| --- | --- |
| index id | The index id |

Get parameters

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| query | String | Query text. See the query language doc | required |
| fast_field | String | Name of a field to retrieve from documents. This field must be a fast field of type i64 or u64. | required |
| search_field | [String] | Fields to search on. Comma-separated list, e.g. "field1,field2" | index_config.search_settings.default_search_fields |
| start_timestamp | i64 | If set, restrict search to documents with a timestamp >= start_timestamp. The value must be in seconds. | |
| end_timestamp | i64 | If set, restrict search to documents with a timestamp < end_timestamp. The value must be in seconds. | |
| partition_by_field | String | If set, the endpoint returns chunks of data for each partition field value. This field must be a fast field of type i64 or u64. | |
| output_format | String | Response output format. csv or clickHouseRowBinary | csv |

info

The start_timestamp and end_timestamp should be specified in seconds regardless of the timestamp field precision.

Response

The response is an HTTP stream. Depending on the client's capability, it is an HTTP/1.1 chunked transfer encoded stream or an HTTP/2 stream.

It returns a list of all the field values from documents matching the query. The field must be marked as "fast" in the index config for this to work. The formatting is based on the specified output format.
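
Because the number of returned values can be very large, a client will usually want to consume the stream incrementally rather than buffer it. The sketch below assumes the csv output format carries one integer value per line; that layout is an assumption to verify against a real response, and the function name is hypothetical.

```python
from typing import Iterable, Iterator


def iter_fast_field_values(lines: Iterable[str]) -> Iterator[int]:
    """Lazily parse streamed csv output, assuming one integer value per line."""
    for line in lines:
        text = line.strip()
        if text:  # skip blank lines, e.g. a trailing newline at the end of the stream
            yield int(text)


# Stand-in for the decoded lines of a streamed response.
sample_lines = ["1669738645\n", "1669738646\n", "\n"]
print(list(iter_fast_field_values(sample_lines)))
```

The input can be any iterable of text lines, so the same generator works with whatever line iterator your HTTP client exposes for streamed bodies.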

On error, an "X-Stream-Error" header will be sent via the trailers channel with information about the error, and the stream will be closed via sender.abort(). Depending on the client, the trailer header with error details may not be shown. The error will also be logged in quickwit ("Error when streaming search results").

Ingest API

Ingest data into an index

POST api/v1/<index id>/ingest -d \
'{"url":"https://en.wikipedia.org/wiki?id=1","title":"foo","body":"foo"}
{"url":"https://en.wikipedia.org/wiki?id=2","title":"bar","body":"bar"}
{"url":"https://en.wikipedia.org/wiki?id=3","title":"baz","body":"baz"}'

Ingest a batch of documents to make them searchable in a given <index id>. Currently, NDJSON is the only accepted payload format. This endpoint is only available on a node that is running an indexer service.

Newly added documents will not appear in the search results until they are added to a split and that split is committed. This process is automatic and is controlled by the split_num_docs_target and commit_timeout_secs parameters. By default, the ingest request returns as soon as the records are added to the indexing queue, which means that the new documents will not yet appear in the search results at that moment.

This behavior can be changed by adding the commit=wait_for or commit=force parameter to the query string. With wait_for, the request waits for the documents to be committed according to the standard time or number-of-documents rules. With force, a commit is triggered after all documents in the request are processed, and the request waits for this commit to finish before returning. Note that the force option may have a significant performance cost, especially if it is used on small batches.

POST api/v1/<index id>/ingest?commit=wait_for -d \
'{"url":"https://en.wikipedia.org/wiki?id=1","title":"foo","body":"foo"}
{"url":"https://en.wikipedia.org/wiki?id=2","title":"bar","body":"bar"}
{"url":"https://en.wikipedia.org/wiki?id=3","title":"baz","body":"baz"}'
info

The payload size is limited to 10MB as this endpoint is intended to receive documents in batch.
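
One way to stay under the payload limit is to split documents into several NDJSON batches on the client. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and "10MB" is read conservatively as 10,000,000 bytes, which this page does not specify.

```python
import json
from typing import Iterable, Iterator

MAX_PAYLOAD_BYTES = 10_000_000  # assumption: "10MB" interpreted as 10 * 10^6 bytes


def ndjson_batches(docs: Iterable[dict], max_bytes: int = MAX_PAYLOAD_BYTES) -> Iterator[bytes]:
    """Yield NDJSON payloads (one JSON document per line), each at most max_bytes long."""
    lines = []
    size = 0
    for doc in docs:
        # json.dumps escapes newlines inside strings, so each document stays on one line.
        line = json.dumps(doc, separators=(",", ":")).encode("utf-8") + b"\n"
        if len(line) > max_bytes:
            raise ValueError("a single document exceeds the payload limit")
        if lines and size + len(line) > max_bytes:
            yield b"".join(lines)
            lines, size = [], 0
        lines.append(line)
        size += len(line)
    if lines:
        yield b"".join(lines)


docs = [{"title": "foo", "body": "foo"}, {"title": "bar", "body": "bar"}]
for payload in ndjson_batches(docs):
    print(len(payload), "bytes")
```

Each yielded payload can be sent as the body of one POST api/v1/<index id>/ingest request.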

Path variable

| Variable | Description |
| --- | --- |
| index id | The index id |

Query parameters

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| commit | String | The commit behavior: auto, wait_for or force | auto |

Response

The response is a JSON object, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| num_docs_for_processing | Total number of documents ingested for processing. The documents may not have been processed. The API will not return indexing errors; check the server logs for errors. | number |

Index API

Create an index

POST api/v1/indexes

Create an index by posting an IndexConfig payload. The API accepts JSON with content-type: application/json and YAML with content-type: application/yaml.

POST payload

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| version | String | Config format version, use the same as your Quickwit version. | required |
| index_id | String | Index ID, see its validation rules on identifiers. | required |
| index_uri | String | Defines where the index files are stored. This parameter expects a storage URI. | {default_index_root_uri}/{index_id} |
| doc_mapping | DocMapping | Doc mapping object as specified in the index config docs. | required |
| indexing_settings | IndexingSettings | Indexing settings object as specified in the index config docs. | |
| search_settings | SearchSettings | Search settings object as specified in the index config docs. | |
| retention | Retention | Retention policy object as specified in the index config docs. | |

Payload Example

curl -XPOST http://localhost:7280/api/v1/indexes --data @index_config.json -H "Content-Type: application/json"

index_config.json
{
"version": "0.8",
"index_id": "hdfs-logs",
"doc_mapping": {
"field_mappings": [
{
"name": "tenant_id",
"type": "u64",
"fast": true
},
{
"name": "app_id",
"type": "u64",
"fast": true
},
{
"name": "timestamp",
"type": "datetime",
"input_formats": ["unix_timestamp"],
"fast_precision": "seconds",
"fast": true
},
{
"name": "body",
"type": "text",
"record": "position"
}
],
"partition_key": "tenant_id",
"max_num_partitions": 200,
"tag_fields": ["tenant_id"],
"timestamp_field": "timestamp"
},
"search_settings": {
"default_search_fields": ["body"]
},
"indexing_settings": {
"merge_policy": {
"type": "limit_merge",
"max_merge_ops": 3,
"merge_factor": 10,
"max_merge_factor": 12
}
},
"retention": {
"period": "7 days",
"schedule": "@daily"
}
}
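
Before posting a config, a client can check that the fields marked as required in the POST payload table are present. This is only a client-side convenience sketch with a hypothetical helper name; the server remains the authority on validation.

```python
import json

# Fields marked "required" in the POST payload table.
REQUIRED_FIELDS = ("version", "index_id", "doc_mapping")


def missing_required_fields(config: dict) -> list:
    """Return the required IndexConfig fields that are absent from `config`."""
    return [name for name in REQUIRED_FIELDS if name not in config]


config = {
    "version": "0.8",
    "index_id": "hdfs-logs",
    "doc_mapping": {"field_mappings": [{"name": "body", "type": "text"}]},
}
print(missing_required_fields(config))
payload = json.dumps(config)  # body to send with Content-Type: application/json
```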

Response

The response is the index metadata of the created index, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| version | The current index configuration format version. | string |
| index_uid | The server-generated index UID. | string |
| index_config | The posted index config. | IndexConfig |
| checkpoint | Map of checkpoints by source. | IndexCheckpoint |
| create_timestamp | Index creation timestamp | number |
| sources | List of the index sources configurations. | Array<SourceConfig> |

Update an index

PUT api/v1/indexes/<index id>

Updates the configurations of an index. This endpoint follows PUT semantics, which means that all the fields of the current configuration are replaced by the values specified in this request or the associated defaults. In particular, if the field is optional (e.g. retention_policy), omitting it will delete the associated configuration. If the new configuration file contains updates that cannot be applied, the request fails, and none of the updates are applied. The API accepts JSON with content-type: application/json and YAML with content-type: application/yaml.

  • The retention policy update is automatically picked up by the janitor service on its next state refresh.
  • The search settings update is automatically picked up by searcher nodes when the next query is executed.
  • The indexing settings update is not automatically picked up by the indexer nodes; they need to be manually restarted.
  • The doc mapping update is not automatically picked up by the indexer nodes; they need to be manually restarted.

Updating the doc mapping doesn't reindex existing data. Queries and answers are mapped on a best effort basis when querying older splits. It is also not possible to update the timestamp field, or to modify/remove existing non-default tokenizers (but it is possible to change which tokenizer is used for a field).
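
Because of the PUT semantics, a safe client pattern is to start from the complete current configuration and change only what is needed, so that no optional section is dropped by accident. The sketch below is illustrative: the helper name is hypothetical, and it assumes the `index_config` object returned by the index metadata endpoint can be reused as the basis for the update payload.

```python
import copy


def config_for_update(current_config: dict, **overrides) -> dict:
    """Return a full index config for PUT: the current config with overrides applied.

    Passing None for a field removes it, which under PUT semantics deletes
    the associated optional configuration.
    """
    updated = copy.deepcopy(current_config)  # never mutate the caller's copy
    for key, value in overrides.items():
        if value is None:
            updated.pop(key, None)
        else:
            updated[key] = value
    return updated


current = {
    "version": "0.8",
    "index_id": "hdfs-logs",
    "doc_mapping": {"field_mappings": []},
    "retention": {"period": "7 days", "schedule": "@daily"},
}
print(config_for_update(current, retention={"period": "30 days", "schedule": "@daily"}))
```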

PUT payload

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| version | String | Config format version, use the same as your Quickwit version. | required |
| index_id | String | Index ID, must be the same index as in the request URI. | required |
| index_uri | String | Defines where the index files are stored. Cannot be updated. | {current_index_uri} |
| doc_mapping | DocMapping | Doc mapping object as specified in the index config docs. | required |
| indexing_settings | IndexingSettings | Indexing settings object as specified in the index config docs. | |
| search_settings | SearchSettings | Search settings object as specified in the index config docs. | |
| retention | Retention | Retention policy object as specified in the index config docs. | |

Payload Example

curl -XPUT http://localhost:7280/api/v1/indexes/hdfs-logs --data @updated_index_update.json -H "Content-Type: application/json"

updated_index_update.json
{
"version": "0.8",
"index_id": "hdfs-logs",
"doc_mapping": {
"field_mappings": [
{
"name": "tenant_id",
"type": "u64",
"fast": true
},
{
"name": "app_id",
"type": "u64",
"fast": true
},
{
"name": "timestamp",
"type": "datetime",
"input_formats": ["unix_timestamp"],
"fast_precision": "seconds",
"fast": true
},
{
"name": "body",
"type": "text",
"record": "position"
}
],
"partition_key": "tenant_id",
"max_num_partitions": 200,
"tag_fields": ["tenant_id"],
"timestamp_field": "timestamp"
},
"search_settings": {
"default_search_fields": ["body"]
},
"indexing_settings": {
"merge_policy": {
"type": "limit_merge",
"max_merge_ops": 3,
"merge_factor": 10,
"max_merge_factor": 12
}
},
"retention": {
"period": "30 days",
"schedule": "@daily"
}
}

Response

The response is the index metadata of the updated index, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| version | The current server configuration version. | string |
| index_uid | The server-generated index UID. | string |
| index_config | The posted index config. | IndexConfig |
| checkpoint | Map of checkpoints by source. | IndexCheckpoint |
| create_timestamp | Index creation timestamp | number |
| sources | List of the index sources configurations. | Array<SourceConfig> |

Get index metadata

GET api/v1/indexes/<index id>

Get the metadata of the index with ID <index id>.

Response

The response is the index metadata of the requested index, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| version | The current server configuration version. | string |
| index_uid | The server-generated index UID. | string |
| index_config | The posted index config. | IndexConfig |
| checkpoint | Map of checkpoints by source. | IndexCheckpoint |
| create_timestamp | Index creation timestamp. | number |
| sources | List of the index sources configurations. | Array<SourceConfig> |

Describe an index

GET api/v1/indexes/<index id>/describe

Describe the index with ID <index id>.

Response

The response is the stats about the requested index, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| index_id | Index ID of index. | String |
| index_uri | URI of index | String |
| num_published_splits | Number of published splits. | number |
| size_published_splits | Size of published splits. | number |
| num_published_docs | Number of published documents. | number |
| size_published_docs_uncompressed | Size of the published documents in bytes (uncompressed). | number |
| timestamp_field_name | Name of timestamp field. | String |
| min_timestamp | Starting time of timestamp. | number |
| max_timestamp | Ending time of timestamp. | number |

Get splits

GET api/v1/indexes/<index id>/splits

Get the splits belonging to the index with ID <index id>.

Path variable

| Variable | Description |
| --- | --- |
| index id | The index id |

Get parameters

| Variable | Type | Description |
| --- | --- | --- |
| offset | number | If set, restrict the number of splits to skip |
| limit | number | If set, restrict maximum number of splits to retrieve |
| split_states | usize | If set, specific split state(s) to filter by |
| start_timestamp | number | If set, restrict splits to documents with a `timestamp >= start_timestamp` |
| end_timestamp | number | If set, restrict splits to documents with a `timestamp < end_timestamp` |
| end_create_timestamp | number | If set, restrict splits whose creation dates are before this date |

Response

The response is the list of splits of the requested index, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| offset | Number of splits skipped. | number |
| size | Number of splits returned. | number |
| splits | The splits matching the request. | List |

Examples

GET /api/v1/indexes/stackoverflow/splits?offset=0&limit=10
{
"offset": 0,
"size": 1,
"splits": [
{
"split_state": "Published",
"update_timestamp": 1695642901,
"publish_timestamp": 1695642901,
"version": "0.7",
"split_id": "01HB632HD8W6WHNM7CZFH3KG1X",
"index_uid": "stackoverflow:01HB6321TDT3SP58D4EZP14KSX",
"partition_id": 0,
"source_id": "_ingest-api-source",
"node_id": "jerry",
"num_docs": 10000,
"uncompressed_docs_size_in_bytes": 6674940,
"time_range": {
"start": 1217540572,
"end": 1219335682
},
"create_timestamp": 1695642900,
"maturity": {
"type": "immature",
"maturation_period_millis": 172800000
},
"tags": [],
"footer_offsets": {
"start": 4714989,
"end": 4719999
},
"delete_opstamp": 0,
"num_merge_ops": 0
}
]
}
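
A response like the one above can be summarized on the client side. The sketch below is a hypothetical helper that totals documents and uncompressed bytes over the published splits of a single response page, using only field names that appear in the example.

```python
def summarize_published_splits(response: dict) -> dict:
    """Total documents and uncompressed bytes across the published splits of one page."""
    published = [
        split
        for split in response.get("splits", [])
        if split.get("split_state") == "Published"
    ]
    return {
        "num_splits": len(published),
        "num_docs": sum(split["num_docs"] for split in published),
        "uncompressed_bytes": sum(
            split["uncompressed_docs_size_in_bytes"] for split in published
        ),
    }


# Trimmed copy of the example response above.
page = {
    "offset": 0,
    "size": 1,
    "splits": [
        {
            "split_state": "Published",
            "num_docs": 10000,
            "uncompressed_docs_size_in_bytes": 6674940,
        }
    ],
}
print(summarize_published_splits(page))
```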

Clear an index

PUT api/v1/indexes/<index id>/clear

Clears the index with ID <index id>: all splits will be deleted (metastore + storage) and all source checkpoints will be reset.

It returns an empty body.

Delete an index

DELETE api/v1/indexes/<index id>

Delete the index with ID <index id>.

Response

The response is the list of deleted split files; the content type is application/json; charset=UTF-8.

[
{
"split_id": "01GK1XNAECH7P14850S9VV6P94",
"num_docs": 1337,
"uncompressed_docs_size_bytes": 23933408,
"file_name": "01GK1XNAECH7P14850S9VV6P94.split",
"file_size_bytes": 2991676
}
]

Get all indexes metadata

GET api/v1/indexes

Retrieve the metadata of all indexes present in the metastore.

Response

The response is an array of IndexMetadata, and the content type is application/json; charset=UTF-8.

Create a source

POST api/v1/indexes/<index id>/sources

Create a source by posting a source config JSON payload.

POST payload

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| version | String | Config format version, put your current Quickwit version. | required |
| source_id | String | Source ID. See ID validation rules. | required |
| source_type | String | Source type: kafka, kinesis or pulsar. | required |
| num_pipelines | usize | Number of running indexing pipelines per node for this source. | 1 |
| params | object | Source parameters as defined in source config docs. | required |

Payload Example

curl -XPOST http://localhost:7280/api/v1/indexes/my-index/sources --data @source_config.json -H "Content-Type: application/json"

source_config.json
{
"version": "0.8",
"source_id": "kafka-source",
"source_type": "kafka",
"params": {
"topic": "quickwit-fts-staging",
"client_params": {
"bootstrap.servers": "kafka-quickwit-server:9092"
}
}
}

Response

The response is the created source config, and the content type is application/json; charset=UTF-8.

Toggle source

PUT api/v1/indexes/<index id>/sources/<source id>/toggle

Toggle (enable/disable) the source <source id> of the index with ID <index id>.

It returns an empty body.

PUT payload

| Variable | Type | Description |
| --- | --- | --- |
| enable | bool | If true enable the source, else disable it. |

Reset source checkpoint

PUT api/v1/indexes/<index id>/sources/<source id>/reset-checkpoint

Resets the checkpoints of the source <source id> of the index with ID <index id>.

It returns an empty body.

Delete a source

DELETE api/v1/indexes/<index id>/sources/<source id>

Delete the source with ID <source id>.

Cluster API

This endpoint lets you check the state of the cluster from the point of view of the node handling the request.

GET api/v1/cluster?format=pretty_json

Parameters

| Name | Type | Description | Default value |
| --- | --- | --- | --- |
| format | String | The output format requested for the response: json or pretty_json | pretty_json |

Delete API

The delete API lets you delete documents matching a query.

Create a delete task

POST api/v1/<index id>/delete-tasks

Create a delete task that will delete all documents matching the provided query in the given index <index id>. The endpoint simply appends your delete task to the delete task queue in the metastore. The deletion will eventually be executed.

Path variable

| Variable | Description |
| --- | --- |
| index id | The index id |

POST payload DeleteQuery

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| query | String | Query text. See the query language doc | required |
| search_field | [String] | Fields to search on. Comma-separated list, e.g. "field1,field2" | index_config.search_settings.default_search_fields |
| start_timestamp | i64 | If set, restrict search to documents with a timestamp >= start_timestamp. The value must be in seconds. | |
| end_timestamp | i64 | If set, restrict search to documents with a timestamp < end_timestamp. The value must be in seconds. | |

Example

{
"query": "body:trash",
"start_timestamp": 1669738645,
"end_timestamp": 1669825046
}

Response

The response is the created delete task, a DeleteTask represented in JSON. The content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| create_timestamp | Create timestamp of the delete query in seconds | i64 |
| opstamp | Unique operation stamp associated with the delete task | u64 |
| delete_query | The posted delete query | DeleteQuery |

List delete tasks

GET api/v1/<index id>/delete-tasks

Get the list of delete tasks for a given index_id.

Response

The response is an array of DeleteTask.