RethinkDB 1.13: pull data via HTTP, push data via changefeeds
Today, we’re happy to announce RethinkDB 1.13 (My Name is Nobody). Download it now!
The 1.13 release includes over 150 enhancements, including:
- New http command for seamlessly pulling data from external APIs into RethinkDB
- New changes command for subscribing to document changes on tables
- Full promises support in the JavaScript driver
- A high performance JSON driver protocol
- Dozens of performance and stability improvements
Upgrading to RethinkDB 1.13? Make sure to migrate your data before upgrading to RethinkDB 1.13.
Upgrading on Ubuntu? We’ve moved to our own PPA, so please add the RethinkDB PPA to upgrade.
Pull data via HTTP
Since many APIs accept and return JSON, RethinkDB is a convenient platform for
manipulating and analyzing API data. In this release we’ve added a new http
command to make this process even easier (see the API reference and
the tutorial). You can now access external APIs directly from the
database with a clean and seamless experience!
For example, let’s use the GitHub API to get the first ten pages of users who starred the RethinkDB GitHub repository:
r.http('https://api.github.com/repos/rethinkdb/rethinkdb/stargazers',
page='link-next', pageLimit=10)
The http
command returns a JSON stream, just like any other command in ReQL:
# Count the number of values returned by the GitHub API. Pagination is
# off by default, so we're only getting the first page of users.
r.http('https://api.github.com/repos/rethinkdb/rethinkdb/stargazers')
.count()
# Grab the login and user ID, and then sort by ID
r.http('https://api.github.com/repos/rethinkdb/rethinkdb/stargazers')
.pluck('login', 'id').orderBy('id')
# Store the results in a table
r.table('stargazers')
.insert(r.http('https://api.github.com/repos/rethinkdb/rethinkdb/stargazers'))
You can tack on additional ReQL commands just like you would with any other
query, store the results in a table, make additional HTTP API calls to pull in
more data for each document, control API pagination, and much more! See the
API reference and the tutorial for the http
command
for more details and examples.
Push data via changefeeds
Over the last few months we had many requests to make RethinkDB integration
with other systems easier. We’ve now added a new changes
command (see the
API reference and the tutorial). Any time a
document in the table is inserted, updated, or deleted, the client driver can
get notified about the change. Changefeeds offer a convenient way to perform
certain tasks:
- Integrate with other databases or middleware such as ElasticSearch or RabbitMQ.
- Write applications where clients are notified of changes in realtime.
The changes
command returns a stream of changes in a regular cursor, and is
very powerful and easy to use:
feed = r.table('users').changes().run(conn)
for change in feed:
print change
Every time you insert, update, or delete a document in a table, an object
describing the change will be added to relevant changefeeds. For example, if
you insert a user { 'id': 1, 'name': 'Slava', 'age': 31 }
into the users
table, RethinkDB will post the following document into the feeds subscribed to
users
:
{
'old_val': None,
'new_val': { 'id': 1, 'name': 'Slava', 'age': 31 }
}
Here old_val
is the old version of the document, and new_val
is a new
version of the document. Because changes
returns a regular stream, you can
tack on RethinkDB queries to do transformations or filter for specific changes:
# Only get changes where a user's age increases
r.table('users').changes().filter(
lambda change: change['new_val']['age'] > change['old_val']['age']
).run(conn)
See the API reference and the tutorial for the
changes
command for more details and examples.
Support for promises in the JavaScript driver
As of this release the RethinkDB JavaScript driver has full support for promises. If you take advantage of promises, new code that interacts with the database can be much cleaner and more convenient.
Here is an example of old JavaScript code to connect to the database:
r.table('posts').run(connection, function(err, cursor) {
if (err) return console.log(err);
cursor.toArray(function(err, results) {
if (err) return console.log(err);
console.log(results);
})
}
In the new 1.13 release this code will continue to work, but you can also rewrite it to take advantage of promises:
r.table('posts').run(connection).then(function(cursor) {
return cursor.toArray();
}).then(function(results) {
console.log(results);
}).error(console.log);
See the API documentation for connect and next for more details.
JSON driver protocol
Traditionally RethinkDB has used Protocol Buffers to communicate between the drivers and the database server. As of this release, we’ve added a native JSON driver protocol, and migrated the official drivers to the new implementation.
This change has the following advantages:
- Almost every language has a well-supported JSON library, but there are still many languages whose protocol buffer implementations have quality and performance issues.
- RethinkDB drivers can now be written in languages that don’t have a good Protocol Buffers port (e.g. Python 3).
- For deeply nested objects, the new serialization protocol can be more efficient in terms of CPU utilization and network traffic.
- The driver installation process no longer requires special steps for a fast native backend.
The server still has full support for the Protocol Buffer interface, so community drivers will continue to work without interruption.
If you’re a driver developer, check out the new specification for details and hop on the driver developers group with any questions!
Next steps
See the full list of enhancements, and take the new release for a spin!
The team is already hard at work on the upcoming 1.14 release that will likely include support for binary data, geospacial indexing, and cluster administration and monitoring API. As always, if there is something you’d like us to prioritize or have any feedback on the release, please let us know!
Help work on the 1.14 release: RethinkDB is hiring.