RethinkDB 1.8: date support, nested object syntax, 8x disk usage improvement

We are happy to announce RethinkDB 1.8 (High Noon). Download it now!

This release includes the following features and improvements:

  • ReQL improvements
    • Dates and times are natively supported, and there are over 20 new commands for creating, manipulating, and querying them.
    • Nested documents are easier to work with in update and merge. (We also extended support for pluck’s nested document syntax to without, group_by, with_fields, and has_fields.)
  • Server improvements
    • Server uses 8x less disk space.
    • Tables support efficient ordering by primary or secondary indexes.
    • JavaScript evaluation is more efficient and reliable.
    • Cross-network cluster operation is much easier.

See the full list of over 90 bug fixes, features, and enhancements.

Michael Lucy (@mlucy), an engineer at RethinkDB, introduces these new features in this two-minute video:

Upgrading to RethinkDB 1.8? Make sure to migrate your data before you upgrade.

Dates and times

Users have been asking for native date-time support since the very first public versions of RethinkDB. It’s been a long time coming, but we’re pretty proud of what we’ve put together.

You can insert native time objects in whatever language you’re using:

r.table('events').insert({'id' => 0, 'time' => Time.now}).run(conn)

You can then ask timezone-aware questions about them, like “what hour did this occur in the event’s time zone?” Since this is written in pure ReQL, even complicated operations involving times can be distributed efficiently over the cluster, or used as the function for a secondary index.

> r.table('events').get(0)['time'].hours().run(conn)
16

> r.table('events').filter{|row| row['time'].hours().eq(16)}.run(conn).to_a
[{"time"=>2013-08-14 16:27:19 -0700, "id"=>0}]
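To see why the time zone matters for a question like "what hour did this occur?", here is a minimal plain-Ruby sketch. It does not use the RethinkDB driver; it only uses Ruby's built-in Time class, and the helper name hours_by_zone is made up for illustration. It assumes the same wall-clock time and -07:00 offset shown in the output above.

```ruby
# Plain Ruby; no RethinkDB connection involved.
# Returns the hour of one instant as seen in its own offset and in UTC.
# (hours_by_zone is a hypothetical helper, not part of any driver.)
def hours_by_zone(time)
  { local: time.hour, utc: time.getutc.hour }
end

# Same wall-clock time and offset as the example output above.
event_time = Time.new(2013, 8, 14, 16, 27, 19, '-07:00')
hours = hours_by_zone(event_time)

puts "hour in the event's zone: #{hours[:local]}"
puts "hour in UTC:              #{hours[:utc]}"
```

The same instant falls in hour 16 in the event's own zone and hour 23 in UTC, which is why a filter on the hour has to know which zone the question is being asked in.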

Note that the time was converted back to a native time object in the client:

> x = r.table('events').get(0).run(conn)
{"time"=>2013-08-14 16:27:19 -0700, "id"=>0}

> x['time'].class
Time

There’s a lot more you can do — and, despite our best efforts, there are also a few gotchas that inevitably crop up around dates and times. Head over to the dates and times documentation to see more cool features, or to learn about the nitty-gritty problems like precision, leap seconds, and language support for time zones.

Nested documents

We’ve heard over and over again from users that update is a pleasure to use, until you need to modify a nested document, at which point doing anything useful becomes a herculean task. As of 1.8, update now has full support for nested documents. Let’s say you have a table nested_users that looks like this:

> r.table('nested_users').run(conn).to_a
[{"id"=>0, "data"=>{"name"=>"Alice", "age"=>18}}]

The new update and merge commands recurse down into sub-objects, so that if you want to update Alice’s age in nested_users, you can now write:

NEW> r.table('nested_users').get(0).update({'data' => {'age' => 19}}).run(conn)
{"unchanged"=>0, "skipped"=>0, "replaced"=>1, "inserted"=>0, "errors"=>0, "deleted"=>0}

NEW> r.table('nested_users').run(conn).to_a
[{"id"=>0, "data"=>{"name"=>"Alice", "age"=>19}}]

Note: If your existing application uses update or merge and expects them to be shallow, this may be a breaking change.
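The difference between the two behaviors can be modeled on ordinary Ruby hashes. This is only a sketch of the semantics described above, not the server's implementation; deep_merge is a hypothetical helper written for this illustration, and Ruby's built-in Hash#merge stands in for the old shallow behavior.

```ruby
# Plain-Ruby model of the two merge behaviors (not the server implementation).
# deep_merge is a hypothetical helper written for this illustration.
def deep_merge(old, new)
  old.merge(new) do |_key, old_val, new_val|
    if old_val.is_a?(Hash) && new_val.is_a?(Hash)
      deep_merge(old_val, new_val)  # recurse into sub-objects
    else
      new_val                       # otherwise the new value wins
    end
  end
end

doc   = { 'id' => 0, 'data' => { 'name' => 'Alice', 'age' => 18 } }
patch = { 'data' => { 'age' => 19 } }

shallow = doc.merge(patch)        # shallow: 'data' is replaced wholesale
deep    = deep_merge(doc, patch)  # recursive: 'name' survives, 'age' changes

p shallow
p deep
```

In the shallow result Alice's name is gone, because the whole data sub-object was replaced. In the recursive result only age changes. Code that relied on the first outcome is the code that needs attention when upgrading.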

If you want the old behavior, we’ve introduced a new term, r.literal, which means “I want literally this object, not the result of merging the old object with this one”:

NEW> r.table('nested_users').get(0).update(
       {'data' => r.literal({'age' => 19})}).run(conn)
{"unchanged"=>0, "skipped"=>0, "replaced"=>1, "inserted"=>0, "errors"=>0, "deleted"=>0}

NEW> r.table('nested_users').run(conn).to_a
[{"id"=>0, "data"=>{"age"=>19}}]
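One way to picture r.literal is as a marker that tells a recursive merge to stop recursing and take the wrapped value as-is. The sketch below models that idea on plain Ruby hashes. The Literal struct and merge_with_literal are hypothetical names invented for this illustration, and the real server logic may differ in its details.

```ruby
# Hypothetical stand-in for r.literal: wraps a value that must be used
# exactly as given instead of being merged into the old value.
Literal = Struct.new(:value)

# Recursive merge that honors Literal markers (an illustrative model only).
def merge_with_literal(old, new)
  return new.value if new.is_a?(Literal)  # literal: replace outright
  return new unless new.is_a?(Hash)       # scalars and arrays: new value wins
  base = old.is_a?(Hash) ? old : {}
  new.each_with_object(base.dup) do |(key, new_val), result|
    result[key] = merge_with_literal(base[key], new_val)
  end
end

doc = { 'id' => 0, 'data' => { 'name' => 'Alice', 'age' => 18 } }

merged   = merge_with_literal(doc, { 'data' => { 'age' => 19 } })
replaced = merge_with_literal(doc, { 'data' => Literal.new({ 'age' => 19 }) })

p merged    # 'data' still contains 'name'
p replaced  # 'data' contains only 'age'
```

Without the marker the sub-object is merged field by field. With it, the sub-object is swapped in whole, which matches the two outputs shown above.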

8x disk space utilization improvement

Prior to this release, the RethinkDB disk serializer supported only a fixed block size. This resulted in the following space utilization behavior:

  • Small documents (<= 250 bytes) were packed tightly in the B-Tree and did not waste disk space.
  • Larger documents (> 250 bytes) were aligned to 4KB blocks, wasting as much as 800% of disk space in common use cases.

As of the 1.8 release, the serializer supports variable-sized blocks, and the large value code takes full advantage of this functionality. If you’re using large documents, you should see up to an 8x improvement in disk space utilization on many insert workloads.
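The arithmetic behind the 8x figure can be sketched with a deliberately simplified model. It assumes a large document is padded up to whole 4KB (4096-byte) blocks under the old scheme and ignores all metadata and B-Tree overhead. The method names are hypothetical and the real serializer layout is more involved than this.

```ruby
BLOCK_SIZE = 4096 # bytes; the 4KB block size mentioned above

# On-disk footprint when a document is padded up to whole fixed-size blocks.
# (Simplified model: ignores metadata and B-Tree overhead.)
def fixed_block_footprint(doc_bytes, block_size = BLOCK_SIZE)
  blocks = (doc_bytes + block_size - 1) / block_size # ceiling division
  blocks * block_size
end

# How many times larger the padded footprint is than the document itself.
def space_blowup(doc_bytes, block_size = BLOCK_SIZE)
  fixed_block_footprint(doc_bytes, block_size).to_f / doc_bytes
end

[512, 1024, 3000, 5000].each do |size|
  puts format('%5d-byte document -> %5d bytes on disk (%.1fx)',
              size, fixed_block_footprint(size), space_blowup(size))
end
```

Under this model a 512-byte document occupies a full 4096-byte block, eight times its own size. The ratio shrinks as documents approach a multiple of the block size, which is consistent with the "up to 8x" wording above.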

Looking forward to the 1.9 release

Now that the 1.8 release is out, the team is focused entirely on getting RethinkDB to a production-ready version. 1.8 is the last release to include major new features prior to the production-ready release (RethinkDB 2.0). We’re now hard at work on the 1.9 release, which will mostly contain performance and scalability improvements, stability fixes, and architectural improvements in preparation for production readiness.

Look out for a blog post about the road to RethinkDB 2.0 in the next few days. In the meantime, if you have comments about the roadmap, we’d love to hear from you!