The Shift to Distributed SQL

garyorenstein
2 min read · Jul 15, 2019

Two years ago, I made light of this possibility: Did Google Send the Big Data Industry on a 10 Year Head Fake?

Long story short, it seemed Hadoop had passed and Spanner was in.

Fast forward to last week, when YugaByte launched the first Distributed SQL Summit. In describing the product, the YugaByte About page reads, “built ground-up with inspiration from Google Spanner.”

That reminded me of another company often mentioned in Spanner discussions, curiously and unforgettably named CockroachDB. According to Wikipedia, CockroachDB was “founded in 2015 by ex-Google employees” and “all three [co-founders] had previously used Bigtable and were acquainted with its successor, Spanner.”

Both YugaByte and CockroachDB are coalescing around distributed SQL.

On the Cockroach Labs product page the primary heading is “CockroachDB: Distributed SQL.”

On the YugaByte home page the first product description reads, “The high-performance distributed SQL database for global, internet-scale apps.”

The distributed SQL talk further triggered memories of yet another company, Crate.io. Crate does not appear to promote as much of a direct Spanner connection. For example, Spanner appears multiple times on the YugaByte and Cockroach Labs websites, but not on the Crate.io website.

However, the top-line messaging on the CrateDB product page describes “The distributed SQL database for machine data.”

And a public Crate.io presentation cites both F1 and Spanner as a second wave of inspiration driving a new approach to analytical SQL databases.

For background on F1, here is the paper title and abstract from 2013:

F1: A Distributed SQL Database That Scales
F1 is a distributed relational database system built at Google to support the AdWords business. F1 is a hybrid database that combines high availability, the scalability of NoSQL systems like Bigtable, and the consistency and usability of traditional SQL databases. F1 is built on Spanner, which provides synchronous cross-datacenter replication and strong consistency.

Seven years after the public emergence of Spanner, and two years after Spanner's second act in 2017 with a paper titled Spanner: Becoming a SQL System, distributed SQL is a thing.

It seems clear that 2020 will be the year database administrators reacquaint themselves with SQL, this time fully distributed. Spanner, CockroachDB, CrateDB, and YugaByte are just a few of the ways that might happen.
