RethinkDB 1.11: query profiler, new streaming algorithm, devops enhancements
The 1.11 release features more than 70 enhancements, including:
- A new query profiler to analyze the performance of ReQL queries.
- An improved streaming algorithm that reduces query latency.
- DevOps enhancements, including new ReQL commands designed for operations
Upgrading to RethinkDB 1.11? Make sure to migrate your data before upgrading to RethinkDB 1.11.
Prior to RethinkDB 1.11, there was no way to analyze the performance of queries. Optimizing queries was a process of trial and error. As of the 1.11 release, RethinkDB includes a developer preview of a query profiler that will make this process a lot easier.
You can enable the query profiler on a given query by passing the
profile=True option to
run, or by using the Profile tab in the Data
When you run a query with profiling, we return the query’s result, along with a trace of its execution.
"description": "Evaluating sample.",
"description": "Evaluating datum.",
"description": "Evaluating table.",
"description": "Sampling elements.",
The trace includes a breakdown of operations performed on the cluster, the time for each operation, and information about which parts of the query were performed in parallel.
The query profiler is included in this release as a developer preview, so it is limited in scope. Coming releases will add important metrics like memory and disk usage, and will improve readability for complex ReQL commands.
One of the goals of the 1.11 release was to reduce query latency. Much work has gone into identifying, understanding, and removing the sources of slowdowns. To better understand the behavior of the system under load we ran dozens of benchmarks, implemented a new coroutine profiler, and expanded backtraces to span over coroutine boundaries.
We were able to greatly improve the responsiveness of the server in many situations, such as while creating new indexes or during periods of high cluster traffic. Below are some of the important changes we made:
New streaming algorithm
Prior to version 1.11, RethinkDB used a fixed batch size for all types of documents. The batch size was fixed at 1MB for communication between the nodes in the cluster, and 1000 documents between the the server and the client. This implementation skewed the system towards high throughput at a significant cost to realtime latency.
In the 1.11 release, the batching is significantly improved. The new batching algorithm adjusts the size of each batch dependening on the document size and the query latency. In practice, this results in significantly speedups. Latency for queries returning very large documents is often reduced by more than 100x.
Improved write operations
In addition to the new streaming algorithm, RethinkDB 1.11 includes many other changes that improve performance for various workloads.
- In previous versions, every write transaction caused at least three separate disk writes. For most transactions, we reduced the number of writes to two.
- We added a new algorithm that merges certain separate writes (such as index writes) into a single operation. The algorithm is designed to improve the performance of RethinkDB on rotational drives, but it also improves latency and throughput on SSDs.
- At the disk level, we introduced more parallelism that allows RethinkDB to read more data at once from the disk.
One of the immediate goals for the development team is to improve the experience of running RethinkDB on live deployments. Our work toward this goal is guided by two simple principles:
- The administrator should always be able to answer any question about the state of the cluster.
- No single query or set of queries should be able to monopolize cluster resources.
We’ve added the following enhancements to the 1.11 release in pursuit of this goal.
Determining secondary index status
As of the 1.11 release, you no longer have to guess whether a newly created
secondary index is ready for use. We added two new commands for observing the
status of index creation:
indexWait. As the names suggest,
indexStatus allows determining the status of a newly created secondary index,
indexWait allows the client to wait until the secondary index is
successfully created. See the API reference for more details.
Better control over soft durability
Prior to the 1.11 release users who chose to use soft durability (off by
default, of course), had no way to ensure that the data they’d inserted in soft
durability mode had been committed to disk. We’ve now added a new
command for flushing soft durability writes to disk. Calling the
ensures that any data written in soft durability mode on a given table has been
flushed to disk before the command returns. See the API reference for
Getting to an LTS release
We’re still hard at work on our first LTS release, due in early 2014. In pursuit of that, our next few releases will continue to focus on performance and stability.
As previously mentioned, here is how the LTS release will be different from the beta releases we’ve been shipping until now:
- It will go through a longer QA process that will include rigorous automated and manual testing
- All known high impact bugs and performance issues will be solved
- We’ll publish results of our tests on high demand, large scale workloads in a clustered environment
- The LTS release will have an additional margin of safety since it will first be battle tested by our pilot customers
- We will be offering commercial support, training, and consulting options
If you have feedback or questions about this process, we’d love to hear from you!
Help work on the 1.13 release: RethinkDB is hiring.