Quickstart
In this quick start guide, we will install Quickwit, create an index, add documents and finally execute search queries. All the Quickwit commands used in this guide are documented in the CLI reference documentation.
Install Quickwit using Quickwit installer
The Quickwit installer automatically picks the correct binary archive for your environment and then downloads and unpacks it in your working directory. This method works only for some OS/architectures, and you will also need to install some external dependencies.
curl -L https://install.quickwit.io | sh
cd ./quickwit-v*/
./quickwit --version
You can now move this executable directory wherever makes sense for your environment, and optionally add it to your PATH environment variable.
Use Quickwit's Docker image
You can also pull and run the Quickwit binary in an isolated Docker container.
# First, create the data directory.
mkdir qwdata
docker run --rm quickwit/quickwit --version
If you are using an Apple silicon Mac, you might need to specify the platform. You can also safely ignore jemalloc warnings.
docker run --rm --platform linux/amd64 quickwit/quickwit --version
Start Quickwit server
- CLI
./quickwit run
- Docker
docker run --rm -v $(pwd)/qwdata:/quickwit/qwdata -p 127.0.0.1:7280:7280 quickwit/quickwit run
Check that it's working by browsing the UI at http://localhost:7280 or by sending a simple GET request with cURL:
curl http://localhost:7280/api/v1/version
Create your first index
Before adding documents to Quickwit, you need to create an index configured with a YAML config file. This config file notably lets you define how to map your input documents to your index fields and whether these fields should be stored and indexed. See the index config documentation.
Let's create an index configured to receive Stack Overflow posts (questions and answers).
# First, download the Stack Overflow dataset config from the Quickwit repository.
curl -o stackoverflow-index-config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/v0.6.4/config/tutorials/stackoverflow/index-config.yaml
The index config defines nine fields. Among them are five text fields: user, tags, title, type and body. Two of these, body and title, are indexed and tokenized; they are also the default search fields, which means they are searched when your query does not target a specific field. The tags field is configured to accept multiple text values. The other text fields (user, tags and type) are not tokenized and are configured as fast fields. There are three numeric fields: questionId, answerId and acceptedAnswerId. Finally, the creationDate field serves as the timestamp for each record.
And here is the complete config:
#
# Index config file for stackoverflow dataset.
#
version: 0.6

index_id: stackoverflow

doc_mapping:
  field_mappings:
    - name: user
      type: text
      fast: true
      tokenizer: raw
    - name: tags
      type: array<text>
      fast: true
      tokenizer: raw
    - name: type
      type: text
      fast: true
      tokenizer: raw
    - name: title
      type: text
      tokenizer: default
      record: position
      stored: true
    - name: body
      type: text
      tokenizer: default
      record: position
      stored: true
    - name: questionId
      type: u64
    - name: answerId
      type: u64
    - name: acceptedAnswerId
      type: u64
    - name: creationDate
      type: datetime
      fast: true
      input_formats:
        - rfc3339
      precision: seconds
  timestamp_field: creationDate

search_settings:
  default_search_fields: [title, body]

indexing_settings:
  commit_timeout_secs: 5
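To make the mapping more concrete, here is a minimal Python sketch that compares a document's keys with the field names declared in this config and parses its RFC 3339 timestamp. The sample document is hypothetical and invented for illustration (it is not taken from the dataset), and the helper names unmapped_fields and parse_rfc3339 are assumptions of this sketch, not part of Quickwit.

```python
from datetime import datetime

# Field names copied from the field_mappings section of the index config.
MAPPED_FIELDS = {
    "user", "tags", "type", "title", "body",
    "questionId", "answerId", "acceptedAnswerId", "creationDate",
}


def unmapped_fields(doc: dict) -> set:
    """Return the keys of `doc` that the field mappings do not declare."""
    return set(doc) - MAPPED_FIELDS


def parse_rfc3339(value: str) -> datetime:
    """Parse an RFC 3339 timestamp such as 2008-07-31T21:42:52Z."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Hypothetical document, invented for illustration only.
sample = {
    "type": "question",
    "title": "How do I build a search engine?",
    "tags": ["search", "indexing"],
    "questionId": 1,
    "creationDate": "2008-07-31T21:42:52Z",
}

print(unmapped_fields(sample))                # set()
print(parse_rfc3339(sample["creationDate"]))  # 2008-07-31 21:42:52+00:00
```

The check only mirrors the field list above; it says nothing about how Quickwit itself treats undeclared fields.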
Now we can create the index with the command:
- CLI
./quickwit index create --index-config ./stackoverflow-index-config.yaml
- CURL
curl -XPOST http://127.0.0.1:7280/api/v1/indexes --header "content-type: application/yaml" --data-binary @./stackoverflow-index-config.yaml
Check that the directory ./qwdata/indexes/stackoverflow has been created. Quickwit will write the index files there, along with a metastore.json file that contains the index metadata.
You're now ready to fill the index.
Let's add some documents
Quickwit can index data from many sources. We will use a newline-delimited JSON (NDJSON) dataset as our data source. Let's download a set of 10,000 Stack Overflow posts in NDJSON format and index them.
# Download the first 10,000 Stack Overflow posts.
curl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json
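Because NDJSON means one JSON object per line, it can be useful to sanity-check a file before ingesting it. The following Python sketch counts the lines that parse as JSON objects. The helper name count_ndjson_docs is hypothetical, and the check is an illustration rather than something Quickwit requires.

```python
import json
from typing import Iterable, Tuple


def count_ndjson_docs(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (valid, invalid) counts for newline-delimited JSON lines.

    A line is valid when it parses as a JSON object; blank lines are skipped.
    """
    valid = invalid = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            is_object = isinstance(json.loads(line), dict)
        except json.JSONDecodeError:
            is_object = False
        if is_object:
            valid += 1
        else:
            invalid += 1
    return valid, invalid


# Small inline sample so the sketch runs without the downloaded file.
print(count_ndjson_docs(['{"title": "a"}', '', 'not json', '[1, 2]']))  # (1, 2)
```

To check the downloaded dataset, pass an open file handle instead of a list, for example the result of open("stackoverflow.posts.transformed-10000.json", encoding="utf-8").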
- CLI
# Index our 10k documents.
./quickwit index ingest --index stackoverflow --input-path stackoverflow.posts.transformed-10000.json --force
- CURL
# Index our 10k documents.
curl -XPOST "http://127.0.0.1:7280/api/v1/stackoverflow/ingest?commit=force" --data-binary @stackoverflow.posts.transformed-10000.json
As soon as the ingest command finishes, you can start querying data with the following search command:
- CLI
./quickwit index search --index stackoverflow --query "search AND engine"
- CURL
curl "http://127.0.0.1:7280/api/v1/stackoverflow/search?query=search+AND+engine"
It should return 10 hits. Now you're ready to play with the search API.
Execute search queries
Let's start with a query on the title field, title:search AND engine:
curl "http://127.0.0.1:7280/api/v1/stackoverflow/search?query=title:search+AND+engine"
The same request can be expressed as a JSON query:
curl -XPOST "http://localhost:7280/api/v1/stackoverflow/search" -H 'Content-Type: application/json' -d '{
  "query": "title:search AND engine"
}'
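The same POST request can also be prepared from a script. Below is a minimal Python sketch that uses only the standard library. The helper name build_search_request is hypothetical, and the URL assumes the local server started earlier in this guide. The request is only built here, not sent, so the sketch runs without a server.

```python
import json
import urllib.request

# Assumes the local Quickwit server and index used in this guide.
SEARCH_URL = "http://localhost:7280/api/v1/stackoverflow/search"


def build_search_request(body: dict, url: str = SEARCH_URL) -> urllib.request.Request:
    """Build a POST request that carries `body` as a JSON payload."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request({"query": "title:search AND engine"})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

With the server running, urllib.request.urlopen(request) sends the request, and json.load on the returned response object decodes the result.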
This format is more verbose, but it allows you to use more advanced features such as aggregations. The following query finds the most popular tags used on the questions in this dataset:
curl -XPOST "http://localhost:7280/api/v1/stackoverflow/search" -H 'Content-Type: application/json' -d '{
  "query": "type:question",
  "max_hits": 0,
  "aggs": {
    "foo": {
      "terms": {
        "field": "tags",
        "size": 10
      }
    }
  }
}'
As you experiment with different queries, check the server logs to see what's happening.
Don't forget to URL-encode the query parameters correctly to avoid a bad request error (status 400).
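One way to avoid encoding mistakes is to let a library build the query string. Here is a minimal Python sketch that uses the standard library. The helper name search_url is hypothetical, and the base URL assumes the local server and index used throughout this guide.

```python
from urllib.parse import urlencode

# Assumes the local Quickwit server and index used in this guide.
BASE_URL = "http://127.0.0.1:7280/api/v1/stackoverflow/search"


def search_url(query: str) -> str:
    """Return the search URL with `query` percent-encoded."""
    # urlencode escapes reserved characters such as ':' and '&', and
    # encodes spaces as '+', as in the GET examples in this guide.
    return f"{BASE_URL}?{urlencode({'query': query})}"


print(search_url("title:search AND engine"))
# http://127.0.0.1:7280/api/v1/stackoverflow/search?query=title%3Asearch+AND+engine
```

The resulting string can be passed to cURL in double quotes, like the other GET examples.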
Clean
Let's do some cleanup by deleting the index:
- CLI
./quickwit index delete --index stackoverflow
- REST
curl -XDELETE http://127.0.0.1:7280/api/v1/indexes/stackoverflow
Congrats! You can now level up with the other Quickwit tutorials to discover all of Quickwit's features.
TLDR
Run the following commands from within Quickwit's installation directory.
curl -o stackoverflow-index-config.yaml https://raw.githubusercontent.com/quickwit-oss/quickwit/v0.6.4/config/tutorials/stackoverflow/index-config.yaml
./quickwit index create --index-config ./stackoverflow-index-config.yaml
curl -O https://quickwit-datasets-public.s3.amazonaws.com/stackoverflow.posts.transformed-10000.json
./quickwit index ingest --index stackoverflow --input-path ./stackoverflow.posts.transformed-10000.json --force
./quickwit index search --index stackoverflow --query "search AND engine"
./quickwit index delete --index stackoverflow