English-language Wikipedia:Database last page revisions
This use case generates 1 million docs, with 3 TEXT fields (all sortable), 1 sortable TAG field, and 1 sortable NUMERIC field per document.
We’ve targeted large documents, with an average size of 45KB, and single documents that can reach 300KB.
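For reference, an index matching this layout can be created with a RediSearch FT.CREATE call along the following lines. This is only a sketch: the index name and all field names except the timestamp field (used in the query examples below) are illustrative assumptions, not the exact names emitted by the generator.

```bash
# Sketch of an index matching the schema described above:
# 3 sortable TEXT fields, 1 sortable TAG field, 1 sortable NUMERIC field.
# Index and field names are illustrative assumptions.
redis-cli FT.CREATE enwiki_pages ON HASH PREFIX 1 doc: SCHEMA \
    title TEXT SORTABLE \
    text TEXT SORTABLE \
    comment TEXT SORTABLE \
    contributor_username TAG SORTABLE \
    timestamp NUMERIC SORTABLE
```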
Query type | Description | Example | Status |
---|---|---|---|
simple-1word-query | Simple 1 Word Query | @timestamp:[1604149522.5400035 1604270992.0] Abraham | ✔️ |
2word-union-query | 2 Word Union Query | @timestamp:[1604149522.5400035 1604270992.0] Abraham Lincoln | ✔️ |
2word-intersection-query | 2 Word Intersection Query | @timestamp:[1604149522.5400035 1604270992.0] Abraham\|Lincoln | ✔️ |
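To try one of these queries by hand against such an index, you can issue it directly with FT.SEARCH; the index name below is the same assumption as above.

```bash
# Runs the simple 1-word query from the table above (first 10 results).
# The index name is an assumption; adjust it to match your setup.
redis-cli FT.SEARCH enwiki_pages \
    "@timestamp:[1604149522.5400035 1604270992.0] Abraham" LIMIT 0 10
```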
Using FTSB for benchmarking involves 2 phases: data and query generation, and query execution.
The following steps focus on how to retrieve the data and generate the commands for the enwiki_pages use case.
To generate the required dataset and command files, issue:
cd $GOPATH/src/github.com/RediSearch/ftsb/scripts/datagen_redisearch/enwiki_pages
python3 ftsb_generate_enwiki_pages.py
The use case generates a secondary index with 3 TEXT fields (all sortable), 1 sortable TAG field, and 1 sortable NUMERIC field per document.
Assuming you have redisbench-admin and ftsb_redisearch installed, for the default dataset, run:
redisbench-admin run \
--repetitions 3 \
--benchmark-config-file https://s3.amazonaws.com/benchmarks.redislabs/redisearch/datasets/enwiki_pages-hashes/enwiki_pages-hashes.redisearch.cfg.json
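Once the run finishes, one quick sanity check (beyond the results file) is to confirm that the documents were ingested on the target Redis instance, for example with FT.INFO. The index name below is an assumption, so adjust it to whatever the benchmark created.

```bash
# num_docs reported by FT.INFO should be close to 1 million after ingestion.
# The index name is an assumption; adjust it to match your deployment.
redis-cli FT.INFO enwiki_pages | grep -A 1 num_docs
```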
After running the benchmark you should have a results JSON file containing key information about the benchmark run(s). For this benchmark, the following metrics should be taken into account; they are used to automatically choose the best run and to assess result variance, and are ordered by priority in case of results comparison:
Ingestion metrics:

Metric Family | Metric Name | Unit | Comparison mode |
---|---|---|---|
Throughput | Overall ingestion rate | docs/sec | higher is better |
Latency | Overall ingestion p50 latency | milliseconds | lower is better |

Query metrics:

Metric Family | Metric Name | Unit | Comparison mode |
---|---|---|---|
Throughput | Overall Updates and Aggregates query rate | docs/sec | higher is better |
Latency | Overall Updates and Aggregates query q50 latency | milliseconds | lower is better |
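The metric names above correspond to entries in that results JSON file; a simple way to locate them is to pretty-print the file, as in the example below (results.json is a placeholder for the file name your run actually produced).

```bash
# Pretty-print the run's results file and scan for the ingestion and
# query metrics listed above. "results.json" is a placeholder name.
python3 -m json.tool results.json | less
```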