The rethinkdb
utility includes an import
command to load existing data into RethinkDB databases. It can read JSON files, organized in one of two formats described below, or comma-separated value (CSV) files (including ones with other delimiters such as tab characters). The utility runs under the admin
user account (see Permissions and user accounts).
When the option is available, you should choose the JSON file format. If you’re exporting from a SQL database this might not be possible, but you might be able to write a separate script to transform CSV output into JSON, or use the mysql2json
script available as part of mysql2xxxx.
The full syntax for the import
command is as follows:
Import from a directory
rethinkdb import -d DIR [-c HOST:PORT] [--force] [-p]
[--password-file FILE] [-i (DB | DB.TABLE)] [--clients NUM]
[--shards NUM_SHARDS] [--replicas NUM_REPLICAS]
Import from a file
rethinkdb import -f FILE --table DB.TABLE [-c HOST:PORT] [--force]
[-p] [--password-file FILE] [--clients NUM] [--format (csv | json)]
[--pkey PRIMARY_KEY] [--shards NUM_SHARDS] [--replicas NUM_REPLICAS]
[--delimiter CHARACTER] [--custom-header FIELD,FIELD... [--no-header]]
Importing from a directory is only supported for directories created by the rethinkdb export
command.
Options for imports include:
-f
: file to import from--table
: table to import to--format
: CSV or JSON (default JSON)-c
: connect to the given IP address/host and port-p
, --password
: prompt for the admin password, if one has been set--password-file
: read the admin password from a plain text file--tls-cert
: specify a path to a TLS certificate to allow encrypted connections to the server (see Securing the cluster)--clients
: the number of client connections to use at once (default 8)--force
: import data even if a table already exists--fields
: only import from the listed fields--no-header
: indicate the first line of a CSV file is not a header row--custom-header
: supply a custom header row for a CSV file(Some of these options only apply to file imports, and there are other options available. Type rethinkdb help import
for a full list.)
To import the file users.json
into the table test.users
, you would use:
rethinkdb import -f users.json --table test.users
If it were a CSV file, you would use:
rethinkdb import -f users.csv --format csv --table test.users
By default, the import command will connect to localhost
port 28015
. You can use the -c
option to specify a server and client port to connect to. (Note this is the driver port clients connect to, not the cluster port.)
rethinkdb import -f crew.json --table discovery.crew -c hal:2001
If the cluster requires authorization, you can prompt for the admin
user account password with -p
, or supply a --password-file
to read the password from. (The password file is just a plain text file, with the password on the first and only line.)
rethinkdb import -f crew.json --table discovery.crew -c hal:2001 -p
A primary key other than id
can be specified with --pkey
:
rethinkdb import -f heroes.json --table marvel.heroes --pkey name
JSON files are preferred to CSV files, as JSON can represent RethinkDB documents fully. If you’re importing from a CSV file, you should include a header row with the field names, or use the --no-header
option with the --custom-header
option to specify the names.
rethinkdb import -f users.csv --format csv --table test.users --no-header \
--custom-header id,username,email,password
The CSV delimiter defaults to the comma, but this can be overridden with the --delimiter
option. Use --delimiter '\t'
for a tab-delimited file.
Values in CSV imports will always be imported as strings. If you want to convert those fields after import to the number
data type, run an update
query that does the conversion. An example runnable in the Data Explorer:
r.table('tablename').update(function(doc) {
return doc.merge({
field1: doc('field1').coerceTo('number'),
field2: doc('field2').coerceTo('number')
})
});
RethinkDB will accept two formats for JSON files:
An array of JSON documents.
[ { field: "value" }, { field: "value"}, ... ]
Whitespace-separated JSON rows.
{ field: "value" }
{ field: "value" }
In both cases, each documents is a JSON object, bracketed with { }
characters. Only the first format is itself a valid JSON document, but RethinkDB will import documents properly either way.
There are more options than what we’ve covered here. Run rethinkdb help import
for a full list of parameters and examples.
While import
has the ability to import a directory full of files, those files are expected to be in the format and directory structure created by the export
command.