Version: 0.4.0

REST API

API version

All the API endpoints start with the api/v1/ prefix. v1 indicates that we are currently using version 1 of the API.

Parameters

Parameters passed in the URL must be properly URL-encoded, using the UTF-8 encoding for non-ASCII characters.

GET [..]/search?query=barack%20obama
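As a minimal sketch of the encoding rule above, the following uses only the Python standard library to percent-encode a query as UTF-8. The helper name `search_path` and the index name `wikipedia` are illustrative assumptions, not part of the API.

```python
from urllib.parse import quote, urlencode

API_PREFIX = "api/v1"  # prefix stated in this document


def search_path(index_id: str, query: str) -> str:
    """Return a relative search URL whose query is percent-encoded as UTF-8.

    Hypothetical helper, for illustration only.
    """
    # quote_via=quote encodes spaces as %20; the default (quote_plus) emits "+".
    params = urlencode({"query": query}, quote_via=quote)
    return f"{API_PREFIX}/{quote(index_id, safe='')}/search?{params}"


print(search_path("wikipedia", "barack obama"))
print(search_path("wikipedia", "café"))
```

Non-ASCII characters such as "é" are emitted as their UTF-8 bytes (`%C3%A9`), which matches the requirement stated above.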

Error handling

Successful requests return a 2xx HTTP status code.

Failed requests return a 4xx HTTP status code. The response body of failed requests holds a JSON object containing an error_message field that describes the error.

{
    "error_message": "Failed to parse query"
}
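A client can turn this convention into a small helper. The sketch below assumes the caller already has the HTTP status code and the raw response body as text; the function name `error_message` is hypothetical, and the fallback to the raw body for non-JSON responses is a design choice of this sketch, not documented behavior.

```python
import json
from typing import Optional


def error_message(status: int, body: str) -> Optional[str]:
    """Return a description of the failure, or None for a 2xx response."""
    if 200 <= status < 300:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return body  # assumption: fall back to the raw body if it is not JSON
    if isinstance(payload, dict):
        return payload.get("error_message", body)
    return body


print(error_message(400, '{"error_message": "Failed to parse query"}'))
```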

Search API

Search in an index

Search for documents matching a query in the given index, using the api/v1/<index id>/search endpoint. This endpoint is available as long as at least one node in the cluster runs a searcher service. The search endpoint accepts GET and POST requests. Parameters are passed as URL parameters for GET requests and as JSON key-value pairs for POST requests.

GET api/v1/<index id>/search?query=searchterm
POST api/v1/<index id>/search
{
    "query": "searchterm"
}
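To illustrate the two request styles, here is a small sketch that renders the same parameters either as a GET URL or as a POST path plus JSON body. The helper names and the sample index `hdfs-logs` are assumptions made for the example; nothing is sent over the network.

```python
import json
from urllib.parse import quote, urlencode


def as_get(index_id: str, params: dict) -> str:
    """Render the parameters as URL parameters (GET style)."""
    return f"api/v1/{index_id}/search?" + urlencode(params, quote_via=quote)


def as_post(index_id: str, params: dict) -> tuple:
    """Render the parameters as a (path, JSON body) pair (POST style)."""
    return f"api/v1/{index_id}/search", json.dumps(params)


params = {"query": "body:error", "max_hits": 5}
print(as_get("hdfs-logs", params))
print(as_post("hdfs-logs", params))
```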

Path variable

| Variable | Description |
| --- | --- |
| index id | The index id |

Parameters

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| query | String | Query text. See the query language doc (mandatory) | |
| start_timestamp | i64 | If set, restrict search to documents with a timestamp >= start_timestamp. The value must be in seconds. | |
| end_timestamp | i64 | If set, restrict search to documents with a timestamp < end_timestamp. The value must be in seconds. | |
| start_offset | Integer | Number of documents to skip | 0 |
| max_hits | Integer | Maximum number of hits to return | 20 |
| search_field | [String] | Fields to search on if no field name is specified in the query. Comma-separated list, e.g. "field1,field2" | index_config.search_settings.default_search_fields |
| snippet_fields | [String] | Fields to extract snippets from. Comma-separated list, e.g. "field1,field2" | |
| sort_by_field | String | Field to sort query results by. You can sort by a field (must be a fast field) and by BM25 _score. By default, hits are sorted by their document ID. | |
| format | Enum | The output format. Allowed values are "json" or "prettyjson" | prettyjson |
| aggs | JSON | The aggregations request. See the aggregations doc for supported aggregations. | |
info

The start_timestamp and end_timestamp should be specified in seconds regardless of the timestamp field precision.
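Because the bounds are always in seconds and the range is half-open (start inclusive, end exclusive), a client typically converts its own time values first. The sketch below is one way to do so with the standard library; both helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def day_bounds_seconds(year: int, month: int, day: int) -> tuple:
    """Return (start_timestamp, end_timestamp) in seconds covering one UTC day.

    The pair matches the half-open range described above:
    start_timestamp <= timestamp < end_timestamp.
    """
    start = datetime(year, month, day, tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    return int(start.timestamp()), int(end.timestamp())


def millis_to_seconds(ts_millis: int) -> int:
    """Convert a millisecond timestamp to whole seconds before sending it."""
    return ts_millis // 1000


start, end = day_bounds_seconds(2022, 11, 29)
print(start, end)
```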

Response

The response is a JSON object, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| hits | Results of the query | [hit] |
| num_hits | Total number of matches | number |
| elapsed_time_micros | Processing time of the query, in microseconds | number |
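Combining num_hits from the response with the start_offset and max_hits parameters allows simple pagination. The following sketch only computes the successive start_offset values; the function name `page_offsets` is an assumption for illustration.

```python
def page_offsets(num_hits: int, max_hits: int = 20) -> list:
    """Return the start_offset value of every page needed to read num_hits hits."""
    if max_hits <= 0:
        raise ValueError("max_hits must be positive")
    return list(range(0, num_hits, max_hits))


# 45 matches with the default page size of 20 need three requests.
print(page_offsets(45))
```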

Search stream in an index

GET api/v1/<index id>/search/stream?query=searchterm

Streams field values from ALL documents matching a search query in the given index <index id>, in a specified output format among the following:

  • CSV
  • ClickHouse RowBinary. If partition_by_field is set, Quickwit returns chunks of data for each partition field value. Each chunk starts with 16 bytes holding the partition value and the content length, followed by the fast_field values in RowBinary format.

fast_field and partition_by_field must be fast fields of type i64 or u64.
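The RowBinary output can be decoded with the standard `struct` module, since ClickHouse RowBinary stores Int64 and UInt64 values as 8-byte little-endian integers. The partitioned layout in the sketch below is an assumption: this document only states that each chunk starts with 16 bytes holding the partition value and the content length, so the split into two 8-byte little-endian integers, with the length counted in bytes, should be verified against a real response. For simplicity the sketch also assumes the partition field and the fast field share the same signedness.

```python
import struct


def decode_values(payload: bytes, signed: bool = True) -> list:
    """Decode consecutive 8-byte little-endian integers (RowBinary Int64/UInt64)."""
    fmt = "<q" if signed else "<Q"
    return [value for (value,) in struct.iter_unpack(fmt, payload)]


def decode_partitions(payload: bytes, signed: bool = True) -> dict:
    """Decode partitioned chunks into {partition value: [fast field values]}.

    ASSUMED header layout (not confirmed by this document):
    8 bytes partition value, then 8 bytes content length in bytes.
    """
    fmt = "<q" if signed else "<Q"
    result = {}
    pos = 0
    while pos < len(payload):
        (partition,) = struct.unpack_from(fmt, payload, pos)
        (length,) = struct.unpack_from("<Q", payload, pos + 8)
        start = pos + 16
        values = decode_values(payload[start:start + length], signed)
        result.setdefault(partition, []).extend(values)
        pos = start + length
    return result
```

`decode_values` alone is enough when partition_by_field is not set.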

This endpoint is available as long as you have at least one node running a searcher service in the cluster.

note

The endpoint will return 10 million values if 10 million documents match the query. This is expected: the endpoint is designed to support queries matching millions of documents and to return field values in a reasonable response time.

Path variable

| Variable | Description |
| --- | --- |
| index id | The index id |

Get parameters

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| query | String | Query text. See the query language doc (mandatory) | |
| fast_field | String | Name of a field to retrieve from documents. This field must be a fast field of type i64 or u64. (mandatory) | |
| search_field | [String] | Fields to search on. Comma-separated list, e.g. "field1,field2" | index_config.search_settings.default_search_fields |
| start_timestamp | i64 | If set, restrict search to documents with a timestamp >= start_timestamp. The value must be in seconds. | |
| end_timestamp | i64 | If set, restrict search to documents with a timestamp < end_timestamp. The value must be in seconds. | |
| partition_by_field | String | If set, the endpoint returns chunks of data for each partition field value. This field must be a fast field of type i64 or u64. | |
| output_format | String | Response output format. csv or clickHouseRowBinary | csv |

info

The start_timestamp and end_timestamp should be specified in seconds regardless of the timestamp field precision.

Response

The response is an HTTP stream. Depending on the client's capability, it is an HTTP/1.1 chunked transfer encoded stream or an HTTP/2 stream.

It returns a list of all the field values from documents matching the query. The field must be marked as "fast" in the index config for this to work. The formatting is based on the specified output format.

On error, an "X-Stream-Error" header is sent via the trailers channel with information about the error, and the stream is closed via sender.abort(). Depending on the client, the trailer header with error details may not be shown. The error is also logged by Quickwit ("Error when streaming search results").

Ingest data into an index

POST api/v1/<index id>/ingest -d \
'{"url":"https://en.wikipedia.org/wiki?id=1","title":"foo","body":"foo"}
{"url":"https://en.wikipedia.org/wiki?id=2","title":"bar","body":"bar"}
{"url":"https://en.wikipedia.org/wiki?id=3","title":"baz","body":"baz"}'

Ingest a batch of documents to make them searchable in a given <index id>. Currently, NDJSON is the only accepted payload format. This endpoint is only available on a node that is running an indexer service.

info

The payload size is limited to 10MB as this endpoint is intended to receive documents in batch.
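A client that ingests more than the limit allows has to split its documents into several NDJSON payloads. The sketch below shows one way to do that; the helper name `ndjson_batches` is hypothetical, and reading "10MB" as 10,000,000 bytes is a conservative assumption, since this document does not say whether the limit is decimal or binary.

```python
import json

MAX_PAYLOAD_BYTES = 10_000_000  # assumption: "10MB" read as 10^7 bytes


def ndjson_batches(docs, max_bytes: int = MAX_PAYLOAD_BYTES):
    """Yield NDJSON payloads (bytes), each no larger than max_bytes."""
    batch, size = [], 0
    for doc in docs:
        # One compact JSON document per line, terminated by a newline.
        line = json.dumps(doc, separators=(",", ":")).encode("utf-8") + b"\n"
        if len(line) > max_bytes:
            raise ValueError("a single document exceeds the payload limit")
        if batch and size + len(line) > max_bytes:
            yield b"".join(batch)
            batch, size = [], 0
        batch.append(line)
        size += len(line)
    if batch:
        yield b"".join(batch)


docs = [{"title": "foo"}, {"title": "bar"}, {"title": "baz"}]
for payload in ndjson_batches(docs, max_bytes=32):
    print(payload)
```

Each yielded payload can be sent as the body of one POST request to the ingest endpoint.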

Path variable

| Variable | Description |
| --- | --- |
| index id | The index id |

Response

The response is a JSON object, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| num_docs_for_processing | Total number of documents ingested for processing. The documents may not have been processed. The API does not return indexing errors; check the server logs for errors. | number |

Ingest data with Elasticsearch compatible API

POST api/v1/_bulk -d \
'{ "create" : { "_index" : "wikipedia", "_id" : "1" } }
{"url":"https://en.wikipedia.org/wiki?id=1","title":"foo","body":"foo"}
{ "create" : { "_index" : "wikipedia", "_id" : "2" } }
{"url":"https://en.wikipedia.org/wiki?id=2","title":"bar","body":"bar"}
{ "create" : { "_index" : "wikipedia", "_id" : "3" } }
{"url":"https://en.wikipedia.org/wiki?id=3","title":"baz","body":"baz"}'

Ingest a batch of documents to make them searchable using the Elasticsearch bulk API. This endpoint provides compatibility with tools or systems that already send data to Elasticsearch for indexing. Currently, only the create action of the bulk API is supported. All other actions, such as delete or update, are ignored.
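The bulk payload alternates one action line and one document line. As a sketch, the following builds such a payload using only the create action; the function name `bulk_payload` is an assumption, and the sequential `_id` values simply mirror the example above.

```python
import json


def bulk_payload(index_id: str, docs: list) -> str:
    """Build an NDJSON bulk body: one create action line before each document."""
    lines = []
    for doc_id, doc in enumerate(docs, start=1):
        action = {"create": {"_index": index_id, "_id": str(doc_id)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The Elasticsearch bulk format ends every line, including the last, with a newline.
    return "\n".join(lines) + "\n"


print(bulk_payload("wikipedia", [{"title": "foo", "body": "foo"}]))
```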

caution

The Quickwit API does not report errors. You need to check the server logs.

In Elasticsearch, the create action has a specific behavior when the ingested documents contain an identifier (the _id field). It only inserts such a document if it was not inserted before. This is extremely handy to achieve At-Most-Once indexing. Quickwit does not have any notion of document id and does not support this feature.

info

The payload size is limited to 10MB as this endpoint is intended to receive documents in batch.

Response

The response is a JSON object, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| num_docs_for_processing | Total number of documents ingested for processing. The documents may not have been processed. The API does not return indexing errors; check the server logs for errors. | number |

Index management API

Create an index

POST api/v1/indexes

Create an index by posting an IndexConfig JSON payload.

POST payload

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| version | String | Config format version, use the same as your Quickwit version. (mandatory) | |
| index_id | String | Index ID, see its validation rules on identifiers. (mandatory) | |
| index_uri | String | Defines where the index files are stored. This parameter expects a storage URI. | {default_index_root_uri}/{index_id} |
| doc_mapping | DocMapping | Doc mapping object as specified in the index config docs (mandatory) | |
| indexing_settings | IndexingSettings | Indexing settings object as specified in the index config docs. | |
| search_settings | SearchSettings | Search settings object as specified in the index config docs. | |
| retention | Retention | Retention policy object as specified in the index config docs. | |

Payload Example

curl -XPOST http://0.0.0.0:8080/api/v1/indexes --data @index_config.json -H "Content-Type: application/json"

index_config.json
{
    "version": "0.4",
    "index_id": "hdfs-logs",
    "doc_mapping": {
        "field_mappings": [
            {
                "name": "tenant_id",
                "type": "u64",
                "fast": true
            },
            {
                "name": "app_id",
                "type": "u64",
                "fast": true
            },
            {
                "name": "timestamp",
                "type": "datetime",
                "input_formats": ["unix_timestamp"],
                "precision": "seconds",
                "fast": true
            },
            {
                "name": "body",
                "type": "text",
                "record": "position"
            }
        ],
        "partition_key": "tenant_id",
        "max_num_partitions": 200,
        "tag_fields": ["tenant_id"],
        "timestamp_field": "timestamp"
    },
    "search_settings": {
        "default_search_fields": ["body"]
    },
    "indexing_settings": {
        "merge_policy": {
            "type": "limit_merge",
            "max_merge_ops": 3,
            "merge_factor": 10,
            "max_merge_factor": 12
        },
        "resources": {
            "max_merge_write_throughput": "80mb"
        }
    },
    "retention": {
        "period": "7 days",
        "schedule": "@daily"
    }
}
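The same request can be prepared from Python. The sketch below only builds the request object and checks the fields this document marks as mandatory; it does not send anything. The base URL is a placeholder assumption for a local node, and `create_index_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # assumption: address of a local Quickwit node
MANDATORY_FIELDS = ("version", "index_id", "doc_mapping")


def create_index_request(index_config: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates an index."""
    missing = [name for name in MANDATORY_FIELDS if name not in index_config]
    if missing:
        raise ValueError(f"missing mandatory fields: {missing}")
    return urllib.request.Request(
        f"{base_url}/api/v1/indexes",
        data=json.dumps(index_config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = create_index_request(
    {"version": "0.4", "index_id": "hdfs-logs", "doc_mapping": {"field_mappings": []}}
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it to a running node.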

Response

The response is the index metadata of the created index, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| index_config | The posted index config. | IndexConfig |
| checkpoint | Map of checkpoints by source. | IndexCheckpoint |
| create_timestamp | Index creation timestamp. | number |
| sources | List of the index sources configurations. | Array<SourceConfig> |

Get index metadata

GET api/v1/indexes/<index id>

Get the metadata of the index with ID <index id>.

Response

The response is the index metadata of the requested index, and the content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| index_config | The posted index config. | IndexConfig |
| checkpoint | Map of checkpoints by source. | IndexCheckpoint |
| create_timestamp | Index creation timestamp. | number |
| sources | List of the index sources configurations. | Array<SourceConfig> |

Delete an index

DELETE api/v1/indexes/<index id>

Delete the index with ID <index id>.

Response

The response is the list of deleted split files, and the content type is application/json; charset=UTF-8.

[
    {
        "file_name": "01GK1XNAECH7P14850S9VV6P94.split",
        "file_size_in_bytes": 2991676
    }
]

Get all indexes metadata

GET api/v1/indexes

Get the metadata of all indexes present in the metastore.

Response

The response is an array of IndexMetadata, and the content type is application/json; charset=UTF-8.

Create a source

POST api/v1/indexes/<index id>/sources

Create a source by posting a source config JSON payload.

POST payload

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| version | String | Config format version, put your current Quickwit version. (mandatory) | |
| source_id | String | Source ID. See ID validation rules (mandatory) | |
| source_type | String | Source type: kafka, kinesis, file. (mandatory) | |
| num_pipelines | usize | Number of running indexing pipelines per node for this source. | 1 |
| params | object | Source parameters as defined in source config docs. (mandatory) | |

Payload Example

curl -XPOST http://0.0.0.0:8080/api/v1/indexes/my-index/sources --data @source_config.json -H "Content-Type: application/json"

source_config.json
{
    "version": "0.4",
    "source_id": "kafka-source",
    "source_type": "kafka",
    "params": {
        "topic": "quickwit-fts-staging",
        "client_params": {
            "bootstrap.servers": "kafka-quickwit-server:9092"
        }
    }
}
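Before posting a source config, a client can check it against the payload table above. The sketch below is a purely client-side check under that assumption: the function name `source_config_problems` is hypothetical, and the server may apply additional validation rules that are not covered here.

```python
MANDATORY_FIELDS = ("version", "source_id", "source_type", "params")
SOURCE_TYPES = {"kafka", "kinesis", "file"}  # types listed in this document


def source_config_problems(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = [f"missing field: {name}" for name in MANDATORY_FIELDS if name not in config]
    source_type = config.get("source_type")
    if source_type is not None and source_type not in SOURCE_TYPES:
        problems.append(f"unknown source_type: {source_type}")
    if "params" in config and not isinstance(config["params"], dict):
        problems.append("params must be a JSON object")
    return problems


config = {
    "version": "0.4",
    "source_id": "kafka-source",
    "source_type": "kafka",
    "params": {"topic": "quickwit-fts-staging"},
}
print(source_config_problems(config))
```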

Response

The response is the created source config, and the content type is application/json; charset=UTF-8.

Delete a source

DELETE api/v1/indexes/<index id>/sources/<source id>

Delete the source with ID <source id>.

Cluster API

This endpoint lets you check the state of the cluster from the point of view of the node handling the request.

GET api/v1/cluster?format=prettyjson

Parameters

| Name | Type | Description | Default value |
| --- | --- | --- | --- |
| format | String | The output format requested for the response: json or prettyjson | prettyjson |

Delete API

The delete API lets you delete documents matching a query.

Create a delete task

POST api/v1/<index id>/delete-tasks

Create a delete task that will delete all documents matching the provided query in the given index <index id>. The endpoint simply appends your delete task to the delete task queue in the metastore. The deletion will eventually be executed.

Path variable

| Variable | Description |
| --- | --- |
| index id | The index id |

POST payload DeleteQuery

| Variable | Type | Description | Default value |
| --- | --- | --- | --- |
| query | String | Query text. See the query language doc (mandatory) | |
| search_field | [String] | Fields to search on. Comma-separated list, e.g. "field1,field2" | index_config.search_settings.default_search_fields |
| start_timestamp | i64 | If set, restrict search to documents with a timestamp >= start_timestamp. The value must be in seconds. | |
| end_timestamp | i64 | If set, restrict search to documents with a timestamp < end_timestamp. The value must be in seconds. | |

Example

{
    "query": "body:trash",
    "start_timestamp": 1669738645,
    "end_timestamp": 1669825046
}
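A payload like this can be generated programmatically. The sketch below serializes the timestamps as JSON integers, in line with the i64 type listed in the payload table; the helper name `delete_query` is hypothetical, and rejecting an empty time range is a client-side choice of this sketch, not documented server behavior.

```python
import json


def delete_query(query: str, start_timestamp=None, end_timestamp=None) -> str:
    """Serialize a DeleteQuery payload with timestamps as integer seconds."""
    payload = {"query": query}
    if start_timestamp is not None:
        payload["start_timestamp"] = int(start_timestamp)
    if end_timestamp is not None:
        payload["end_timestamp"] = int(end_timestamp)
    # The range is half-open (start <= timestamp < end), so start must be below end.
    if start_timestamp is not None and end_timestamp is not None:
        if payload["start_timestamp"] >= payload["end_timestamp"]:
            raise ValueError("start_timestamp must be lower than end_timestamp")
    return json.dumps(payload)


print(delete_query("body:trash", 1669738645, 1669825046))
```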

Response

The response is the created delete task, represented as a DeleteTask JSON object. The content type is application/json; charset=UTF-8.

| Field | Description | Type |
| --- | --- | --- |
| create_timestamp | Create timestamp of the delete query in seconds | i64 |
| opstamp | Unique operation stamp associated with the delete task | u64 |
| delete_query | The posted delete query | DeleteQuery |

Get a delete task

GET api/v1/<index id>/delete-tasks/<opstamp>

Get the delete task of operation stamp opstamp for a given index_id.

Response

The response is a DeleteTask.