Drizzle: The Future of MySQL
Brian Aker, one of the main engineers on MySQL at Sun, has posted a presentation he gave on the project he’s been working on for the last year and a half: Drizzle. I highly recommend that anyone interested in the state of the art of database technology watch it.
To summarize:
- Scalability: Much of the effort so far has gone into making the system scale to massive numbers of threads (and processes). They’re removing locks wherever possible and aiming for systems with 100+ cores.
- UTF-8: This is a hugely important move. Drizzle talks exclusively in UTF-8 and byte streams. This pushes all the character set insanity out to the client, which is really where it belongs. Unfortunately, this will probably be a stumbling block for some apps whose data can’t be easily converted.
- Protocol Buffers replication streams: Using Google’s Protocol Buffers format to put out replication information makes it really easy to write applications that act on the replication stream. With MySQL binlogs, this was fairly tedious to do and resulted in fragile code.
- Async protocol: This is really useful. A page load should be able to spam the server with a bunch of queries and then fetch results as needed, rather than running them one at a time. This is a big part of taking advantage of higher concurrency and reducing page-load times.
- Built-in sharding: This is also really useful. I’m not entirely sure what their plan is, because this is the first I’ve heard of it being part of the project, but if done right this will be so valuable. Sites that need to shard often wind up implementing it from scratch. I’ve been involved in doing so myself. It certainly isn’t as scary as a lot of people think it is, but the fear is palpable among other devs, and a solid baseline implementation would raise the state of the art a good deal.
- Plugins: Plugins are a big part of Drizzle’s re-architecting. The goal seems to be to rebuild from the ground up, keeping the core as simple as possible (the slides say 350k LOC as opposed to 6.0’s > 1M LOC) and pushing all extra functionality out to plugins. Areas subject to becoming plugins include:
  - Pluggable client protocols: Letting the client talk an HTTP/REST protocol for simplicity, or any other protocol desired.
  - Pluggable logging: Have it log out to syslog, for example, or to an analysis app that does custom slow query logging, etc.
  - Pluggable authentication: Turn off auth altogether, use the system’s user accounts through PAM (yes, please!), LDAP, HTTP AUTH, or just something custom. Apparently this also helps remove locks, which is good for scalability.
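To make the async protocol point from the list above a bit more concrete, here is a minimal sketch of the "send everything, then collect results" pattern. It is written in Python with asyncio purely for illustration and is not Drizzle’s client API: run_query is a hypothetical stand-in that sleeps instead of talking to a server, and the table names and latencies are made up.

```python
import asyncio


async def run_query(sql: str, latency: float) -> str:
    """Hypothetical stand-in for one database round trip (not a Drizzle API)."""
    await asyncio.sleep(latency)  # simulate network and server time
    return f"result of {sql}"


async def load_page() -> list:
    # Made-up queries that a single page load might need.
    queries = [
        ("SELECT * FROM users WHERE id = 1", 0.05),
        ("SELECT * FROM posts WHERE user_id = 1", 0.05),
        ("SELECT COUNT(*) FROM comments WHERE user_id = 1", 0.05),
    ]
    # Fire off every query up front so they are all in flight at once...
    tasks = [asyncio.create_task(run_query(sql, lat)) for sql, lat in queries]
    # ...then fetch each result when the page actually needs it.
    return [await task for task in tasks]


if __name__ == "__main__":
    for row in asyncio.run(load_page()):
        print(row)
```

With the queries overlapping, the total wait is roughly that of the slowest query rather than the sum of all of them, which is where the page-load win would come from.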
I can’t stress enough how this is the real future of MySQL, far more so than any future version of mainline MySQL. As the world of the web moves towards simpler databases like CouchDB, Drizzle is the only way that MySQL will manage to be competitive on the web. Mainline MySQL just keeps getting bigger and heavier, growing towards enterprise use (towards being an Oracle replacement, really), leaving those of us who don’t need or can’t afford those features (not in $, but in response time) out in the cold.
Personally, I think that object/document store databases are the future of databases for the web as a whole, but Drizzle is the future of the particular subset where schemas are still important. And for the time being, it will come out of the gate as the most mature product among next-gen web databases, simply because it inherits MySQL’s architecture.
I’m going to be keeping my eye on Drizzle, and I think other people should too. Brian Aker has a blog and a Twitter account, and Drizzle itself is on Launchpad.



