Building a world-class team: six mistakes I made early in my career

For the past few weeks my primary job at RethinkDB has been to hire world-class software developers. Recruiting great people is a difficult process with at least seven components: sourcing candidates, reviewing resumes, doing technical phone screens, conducting technical interviews, closing candidates, extending offers, and keeping candidates happy once they’ve joined. Each component seems simple in principle, but is very subtle in practice. A testament to this is that all software companies aspire to hire only the best people, but very few achieve this goal.

Read the full post

High scalability: SQL and computational complexity

Recently there has been a lot of discussion about the fundamental scalability of traditional relational database systems. Many of the blog posts on this topic give a great overview of the immediate issues engineers face while scaling relational databases, but they don’t dissect the problem systematically or in enough depth to get to the core issues. I’d like to dedicate a series of blog posts to the problem of scalability and how it pertains to relational databases.

Read the full post

More on alignment, ext2, and partitioning on SSDs

In our previous post we touched on alignment issues on solid-state drives. Our test read different-sized blocks from various random points on a raw device, aligned to a particular boundary. Today we’d like to expand on that work, and discuss how other factors affect SSD read performance. In addition to testing different block sizes and alignment boundaries, we tested two other factors: how the drive is partitioned, and what filesystem is used.
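As a rough illustration of the kind of test described above, the sketch below picks random read offsets that all fall on a chosen alignment boundary. It is not the benchmark code from the post: the function name `aligned_offsets` and its parameters are hypothetical, and the actual reads (for example with `os.pread` on an opened device) are left out.

```python
import random


def aligned_offsets(device_size, block_size, alignment, count, seed=0):
    """Return `count` random byte offsets, each a multiple of `alignment`.

    Hypothetical helper for illustration. Every offset leaves room for a
    full `block_size` read before the end of the device.
    """
    if block_size <= 0 or alignment <= 0:
        raise ValueError("block_size and alignment must be positive")
    # Index of the last alignment slot that still fits a whole block.
    last_slot = (device_size - block_size) // alignment
    if last_slot < 0:
        raise ValueError("block_size is larger than the device")
    rng = random.Random(seed)  # seeded so a run can be repeated
    return [rng.randint(0, last_slot) * alignment for _ in range(count)]


# Example: 100 offsets for 4 KiB reads on a 1 GiB device, aligned to 4 KiB.
offsets = aligned_offsets(1 << 30, 4096, 4096, 100)
```

Varying `block_size` and `alignment` independently gives the two axes the test sweeps over. Timing the reads themselves depends on the device and on how it is opened.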

Read the full post

Page alignment on SSDs

In our previous post we discussed the optimal block size for B-trees on solid-state drives. A few people mentioned page alignment, an issue that can cause serious performance hits on SSDs if unaccounted for. It’s a complex topic, and we will dedicate two posts to it. In this post we’ll address alignment behavior while reading directly from the block device. In the next post, we’ll talk about partitioning the drive, and the effects of reading from the filesystem instead of reading from the device directly.

Read the full post

Rethinking B-tree block sizes on SSDs

One of the first questions to answer when running databases on SSDs is what B-tree block size to use. There are a number of factors that affect this decision:

  • The type of workload
  • The I/O time to read and write a block of a given size
  • The size of the cache

That’s a lot of variables to consider. For this blog post we assume a fairly common OLTP scenario - a database that’s dominated by random point queries. We will also sidestep some of the more subtle caching effects by treating the caching algorithm as perfectly optimal, and assuming the cost of lookup in RAM is insignificant.
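To see why block size matters for random point queries, here is a minimal sketch that estimates how many levels a B-tree needs for a given number of keys. It assumes fixed-size entries and completely full blocks. Those simplifications are ours, not the post's, and the names `btree_height` and `entry_size` are hypothetical.

```python
def btree_height(num_keys, block_size, entry_size):
    """Estimate the number of B-tree levels needed to index `num_keys`.

    Assumes every block is full and holds block_size // entry_size
    entries (an idealized fanout, for illustration only).
    """
    fanout = block_size // entry_size
    if fanout < 2:
        raise ValueError("a block must hold at least two entries")
    levels, capacity = 0, 1
    # Each extra level multiplies the number of reachable keys by the fanout.
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


# Compare a few block sizes for one billion keys with 16-byte entries.
for block_size in (512, 4096, 65536):
    print(block_size, btree_height(10**9, block_size, 16))
```

With no cache, a point query reads one block per level, so larger blocks mean fewer but bigger reads. Weighing that against the I/O time per block and the size of the cache is the trade-off listed above.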

Read the full post