Publish and subscribe entirely in RethinkDB

With RethinkDB's changefeeds, it's easy to create a publish-subscribe message exchange without going through a third-party queue. Josh Kuhn (@deontologician) has written a small library, repubsub, that shows you how to build topic exchanges—and he's written it in all three of our officially-supported languages. He's put together a terrific tutorial article demonstrating how to use it. You can simply create a topic and publish messages to it:

topic = exchange.topic('fights.superheroes.batman')
topic.publish({'opponent': 'Joker', 'victory': True})

Then subscribe to just the messages that match your interest.

filter_func = lambda topic: topic.match(r'fights\.superheroes.*')
queue = exchange.queue(filter_func)
for topic, payload in queue.subscription:
    print(topic, payload)

Josh describes how to implement tags, nested topics and more, so check out the publish-subscribe tutorial.
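To make the filtering idea concrete without a database, here is a minimal in-memory sketch. It is not how repubsub is implemented (repubsub builds on changefeeds); the InMemoryExchange class and its subscribe and publish methods are hypothetical names made up for this illustration.

```python
import re


class InMemoryExchange:
    """Toy stand-in for a topic exchange; no database involved."""

    def __init__(self):
        self._subscribers = []  # list of (filter_func, inbox) pairs

    def subscribe(self, filter_func):
        """Register a filter and return the list that will collect matches."""
        inbox = []
        self._subscribers.append((filter_func, inbox))
        return inbox

    def publish(self, topic, payload):
        """Deliver (topic, payload) to every subscriber whose filter accepts the topic."""
        for filter_func, inbox in self._subscribers:
            if filter_func(topic):
                inbox.append((topic, payload))


exchange = InMemoryExchange()
pattern = re.compile(r'fights\.superheroes.*')
superhero_fights = exchange.subscribe(lambda topic: pattern.match(topic) is not None)

exchange.publish('fights.superheroes.batman', {'opponent': 'Joker', 'victory': True})
exchange.publish('fights.villains.joker', {'opponent': 'Batman', 'victory': False})

for topic, payload in superhero_fights:
    print(topic, payload)
```

Only the first message reaches the subscriber, because the second topic does not match the regular expression.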

September Events in San Francisco

Join the RethinkDB team at the following events for September 2014:

Meetup: RethinkDB SF Group at Heavybit Industries

Thursday, September 11th at 6pm, Heavybit Industries, 325 Ninth Street (map)

RethinkDB co-founder, Slava Akhmechet, will be talking about the latest RethinkDB advances and where it's headed; come meet the founders & engineers!
Get architectural advice, improve your code, give the RethinkDB team product feedback, and catch a sneak peek of upcoming features. Talks will start at 7pm. Food and drinks provided.

RSVP here

Office Hours with Slava Akhmechet

Tuesday, September 16th from 11am - 4pm, Workshop Cafe, 180 Montgomery Street #100 (map)

Sign up to get one-on-one RethinkDB support with Slava Akhmechet during our office hours in San Francisco.
Learn how to get up and running with RethinkDB, get individual support on your project, or just enjoy a cup of coffee with us!
We have five (5) 45-minute time slots available (11am, 12pm, 1pm, 2pm, & 3pm).
Please contact christina@rethinkdb.com to reserve your time.

Meetup event here

Meetup: Building realtime apps with RethinkDB

Monday, September 22nd at 6:30pm, PubNub, 725 Folsom St (map)

RethinkDB has teamed up with DevBrill for their September meetup.
Slava will demo RethinkDB, show how to get started with storing and querying JSON data in RethinkDB, and how to scale and parallelize queries across multiple machines.
The talk will also cover highlights of RethinkDB's more distinctive features, such as r.http and changefeeds, an overview of the tradeoffs involved in building a distributed architecture, and a discussion of the future of realtime applications.
Talks will start at 7pm. Food and drinks provided.

RSVP here

If you have questions or would like to speak at any of our events, please contact christina@rethinkdb.com.

RethinkDB 1.14: binary data, seamless migration, and Python 3 support

Today, we're happy to announce RethinkDB 1.14. Download it now!

The 1.14 release includes over 50 enhancements including:

  • Seamless migration (read more below)
  • Simple binary data support
  • Python 3 support
  • Support for returning changes from multiple writes
  • Better documentation
  • New options for handling conflicts on inserts
  • Dozens of stability and performance improvements

Upgrading to 1.14? You no longer need to migrate your data between point releases! Read below for more information.

If you're upgrading from version 1.12 or earlier, you will need to migrate your data one last time.

Upgrading on Ubuntu? If you're upgrading from 1.12 or earlier, first set up the new RethinkDB PPA.

Upgrading from 1.13 with seamless migration

1.14 is the first RethinkDB release that doesn't require you to migrate your data. Just upgrade the package and restart your RethinkDB processes. Your 1.14 cluster will be ready to go immediately after restarting! This is something people have been asking for since our first release, and we're happy to finally be able to provide it. Making upgrades easier is a big step towards production readiness.

If you have secondary indexes on your data, the web UI may show an issue for those indexes after upgrading. This means that there was a bug fix affecting those indexes, and they need to be recreated to get the new behavior. You can learn how to do that on the troubleshooting page.

Binary data support

Support for storing small chunks of binary data has been one of our most-requested features. Starting with 1.14, you can insert binary data directly with r.binary, and retrieve it like any other part of a row.

Binary data works with everything: it can be stored anywhere in your document structure, and you can index on it like any other data type. (That means you can use binary data as the primary key of a row, or as the value of a secondary index.)

> r.table('users').insert({
    'name': 'Sam Lowry',
    'avatar': r.binary(open('sam_lowry.png', 'rb').read()),
    }).run(conn)
{'replaced': 0, 'inserted': 1, 'skipped': 0, 'deleted': 0, 'unchanged': 0, 'errors': 0}
# In Python 3, the 'bytes' type is used to represent binary data
> r.table('users').filter({'name': 'Sam Lowry'}).run(conn)
{'avatar': b'...', 'name': 'Sam Lowry'}

The r.http command has also gained the ability to return binary data. r.http will try to detect whether it is downloading binary data and return the appropriate type. You can also request that it return binary data with the result_format argument:

> r.table('users').insert({
    'name': 'Jill Layton',
    'avatar': r.http('http://example.com/jill_layton.jpg', result_format='binary')
    }).run(conn)

Binary data is stored inline in your rows, so it's well-suited to storing small images and files, but it isn't a good fit for 10GB movies.
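Since large blobs are a poor fit, an application might check sizes before inserting. The sketch below is one way to do that in plain Python. The build_user_doc helper and the 1 MiB MAX_INLINE_BYTES cutoff are assumptions made up for this example, not limits defined by RethinkDB.

```python
MAX_INLINE_BYTES = 1024 * 1024  # hypothetical 1 MiB cutoff; choose your own


def build_user_doc(name, avatar_bytes, max_bytes=MAX_INLINE_BYTES):
    """Return a document dict, refusing avatars larger than max_bytes."""
    if not isinstance(avatar_bytes, (bytes, bytearray)):
        raise TypeError('avatar must be bytes')
    if len(avatar_bytes) > max_bytes:
        raise ValueError('avatar is %d bytes; limit is %d'
                         % (len(avatar_bytes), max_bytes))
    return {'name': name, 'avatar': bytes(avatar_bytes)}


doc = build_user_doc('Sam Lowry', b'\x89PNG\r\n')
print(doc['name'], len(doc['avatar']))
```

Before inserting, you would wrap doc['avatar'] with r.binary as in the example above.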

Python 3 driver

Thanks to contributions from @grandquista and @barosl, the RethinkDB Python driver has added support for Python 3.0 through 3.4.

Now you can use awesome Python 3 features like yield from with RethinkDB:

def query_twice(reql_query, conn):
    yield from reql_query.run(conn)
    yield from reql_query.run(conn)

Previously, Python 3 support was incomplete because there was no official Protocol Buffers implementation for Python 3. The previous release of RethinkDB added a JSON driver protocol, and Python 3 support was made possible by that work.
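If you'd like to try the generator pattern without a running server, here is a small sketch with a stand-in query object. FakeQuery is a hypothetical class invented for this example; the only thing it shares with a real ReQL query is a run(conn) method that returns something iterable. Note that yield from itself requires Python 3.3 or later.

```python
class FakeQuery:
    """Hypothetical stand-in for a ReQL query; run() returns an iterator of rows."""

    def __init__(self, rows):
        self._rows = rows

    def run(self, conn):
        # A real query would talk to the server over conn; this one ignores it.
        return iter(self._rows)


def query_twice(reql_query, conn):
    # Delegate to each result iterator in turn, yielding every row from both runs.
    yield from reql_query.run(conn)
    yield from reql_query.run(conn)


rows = list(query_twice(FakeQuery([{'id': 'Buttle'}, {'id': 'Tuttle'}]), conn=None))
print([row['id'] for row in rows])
```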

Returning changes from queries with multiple writes

This was another much-requested feature. In 1.13 and earlier, we allowed users to return the old and new values of a row when updating a single document. We've changed this interface to be consistent with the changes API and added support for returning changes on any query that does a write.

For example, if you have a table of users where every user has a score:

> r.table('users').run(conn)
[{'id': 'Buttle', 'score': 20},
 {'id': 'Tuttle', 'score': 7},
 ...]

Then you can atomically increment-and-return Buttle and Tuttle's scores like so:

> r.table('users') \
   .get_all('Buttle', 'Tuttle') \
   .update(lambda row: {'score': row['score'] + 1}) \
   .run(conn, return_changes=True)
{'changes':
  [{'new_val': {'id': 'Buttle', 'score': 21},
    'old_val': {'id': 'Buttle', 'score': 20}},
   {'new_val': {'id': 'Tuttle', 'score': 8},
    'old_val': {'id': 'Tuttle', 'score': 7}}],
 'deleted': 0,
 'errors': 0,
 'inserted': 0,
 'replaced': 2,
 'skipped': 0,
 'unchanged': 0}
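Once you have a result like this, pulling out the old and new values is ordinary dictionary work. The sketch below operates on a copy of the result shown above; score_deltas is a hypothetical helper name, not part of the driver.

```python
def score_deltas(result):
    """Map each changed document's id to an (old_score, new_score) pair."""
    deltas = {}
    for change in result.get('changes', []):
        old, new = change['old_val'], change['new_val']
        deltas[new['id']] = (old['score'], new['score'])
    return deltas


# The write result from the example above, copied in as a plain dict.
result = {
    'changes': [
        {'new_val': {'id': 'Buttle', 'score': 21},
         'old_val': {'id': 'Buttle', 'score': 20}},
        {'new_val': {'id': 'Tuttle', 'score': 8},
         'old_val': {'id': 'Tuttle', 'score': 7}},
    ],
    'deleted': 0, 'errors': 0, 'inserted': 0,
    'replaced': 2, 'skipped': 0, 'unchanged': 0,
}

deltas = score_deltas(result)
print(deltas['Buttle'], deltas['Tuttle'])
```

This sketch assumes every change has both an old and a new value, which holds for updates but not for inserts or deletes.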

Improved documentation

@chipotle has been improving our documentation for the last three months. You can see his work in the greatly expanded map-reduce docs, as well as new pages on importing your data, using nested fields, and database limitations.

Overall, there have been hundreds of improvements to the docs since the last release. Excellent docs have always been something we've strived for, and having someone working on them full time ensures they'll always be high quality and up to date.

Handling conflicts on insert

Previously, the insert command supported the upsert optional argument. This allowed you to insert or replace a document. In 1.14 we replaced the upsert argument with the conflict argument, and added the ability to update an existing document, rather than overwrite it completely.

As an example, let's assume you have two web crawlers that get ratings for movies from both IMDB and Rotten Tomatoes. In general, we don't know which crawler will get to a particular movie first. In this case, the IMDB crawler already inserted its document for the movie Brazil:

> r.table('movies').get('Brazil (1985)').run(conn)
{'id': 'Brazil (1985)',
 'imdb_rating': 8.0 }

By default, if the Rotten Tomatoes crawler tries to do an insert with the key "Brazil (1985)", we'll get an error, since the document already exists. But if instead it uses conflict='update', the document will simply be updated:

> r.table('movies').insert({
    'id': 'Brazil (1985)',
    'rt_rating': 98,
    },
    conflict='update').run(conn)
> r.table('movies').get('Brazil (1985)').run(conn)
{'id': 'Brazil (1985)',
 'imdb_rating': 8.0,
 'rt_rating': 98}

You can get the previous upsert behavior with conflict='replace'.
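To compare the three behaviors side by side, here is a rough in-memory model that uses a plain dict as the "table". It is only a sketch of the semantics described above: model_insert is a hypothetical function, and the way it signals a conflict (raising KeyError) is a simplification, not how the driver reports errors.

```python
def model_insert(table, doc, conflict='error'):
    """Insert doc into a dict-backed table, resolving primary key conflicts."""
    key = doc['id']
    if key not in table:
        table[key] = dict(doc)
    elif conflict == 'error':
        raise KeyError('duplicate primary key: %r' % (key,))
    elif conflict == 'replace':
        table[key] = dict(doc)   # overwrite the whole document
    elif conflict == 'update':
        table[key].update(doc)   # merge new fields into the existing document
    else:
        raise ValueError('unknown conflict mode: %r' % (conflict,))
    return table[key]


movies = {}
model_insert(movies, {'id': 'Brazil (1985)', 'imdb_rating': 8.0})
updated = dict(model_insert(movies, {'id': 'Brazil (1985)', 'rt_rating': 98},
                            conflict='update'))
replaced = dict(model_insert(movies, {'id': 'Brazil (1985)', 'rt_rating': 98},
                             conflict='replace'))
print(sorted(updated))
print(sorted(replaced))
```

With 'update' the IMDB rating survives alongside the new field; with 'replace' it is lost, because the whole document is overwritten.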

Next steps

See the full list of enhancements, and take the new release for a spin!

The team is already hard at work on the upcoming 1.15 release, which will likely include geospatial query support. As always, if there is something you'd like us to prioritize, or if you have any feedback on the release, please let us know!

Help work on the 1.15 release: RethinkDB is hiring.

Join us online for a live presentation and Q&A for RethinkDB 1.14

Join us online on Thursday, August 21st at 1:30pm for a live presentation and Q&A session presented by RethinkDB's co-founder Slava Akhmechet. You'll learn about new and upcoming features in RethinkDB 1.14, including:

  • Changefeeds: get realtime push notifications of changes in the database.
  • r.http: don't fetch data from the internet and store it in the database yourself; get the database to fetch it for you!
  • Binary data types: store images, zip files, and arbitrary binary data in a field.
  • Other features in RethinkDB 1.14: Promises in the JavaScript driver, Python 3 support, and seamless migrations.

To attend, register for the webinar, and follow the instructions you receive via email.

If you have any problems signing up or any questions about the event, please contact christina@rethinkdb.com.

Feed RethinkDB changes directly into RabbitMQ

RethinkDB's new changefeeds let your applications subscribe to changes made to a table in real-time. They're a perfect match with a distributed message queue system like RabbitMQ: changes can be sent from RethinkDB to a RabbitMQ topic exchange with only a few extra lines of code. RabbitMQ then queues them to pass on to any client subscribed to that exchange. If you need to send information about those changes to a large number of clients as efficiently as possible, RabbitMQ is the rodent you need. Imagine a changefeed for real-time stock updates being distributed to a thousand terminals on a trading floor.
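One piece of glue such an integration needs is a routing key for each change. The sketch below derives one from the old_val and new_val fields of a changefeed notification, where a missing value (None in Python) indicates an insert or a delete. The "table.create / table.update / table.delete" naming scheme and both function names are assumptions for illustration; you would pass the resulting key to whichever RabbitMQ client you use.

```python
def change_type(change):
    """Classify a changefeed notification as 'create', 'delete', or 'update'."""
    if change['old_val'] is None:
        return 'create'   # no previous version: the document was inserted
    if change['new_val'] is None:
        return 'delete'   # no new version: the document was removed
    return 'update'


def routing_key(table_name, change):
    """Build a topic-exchange routing key such as 'stocks.update'."""
    return '%s.%s' % (table_name, change_type(change))


changes = [
    {'old_val': None, 'new_val': {'id': 'ACME', 'price': 10}},
    {'old_val': {'id': 'ACME', 'price': 10}, 'new_val': {'id': 'ACME', 'price': 11}},
    {'old_val': {'id': 'ACME', 'price': 11}, 'new_val': None},
]
keys = [routing_key('stocks', change) for change in changes]
print(keys)
```

A subscriber interested only in price updates could then bind its queue to a pattern like 'stocks.update'.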

@deontologician has written an integration tutorial on using RethinkDB with RabbitMQ, and he's provided it for all three of the languages we support: JavaScript (using amqplib for Node.js), Python (using pika), and Ruby (using Bunny). Even if you're not using one of those languages, the basic techniques in the article should get you going.

Check out Integrating RethinkDB with RabbitMQ!