Erik Bernhardsson

I don't want to learn your garbage query language

This is a bit of a rant but I really don’t like software that invents its own query language. There’s a trillion different ORMs out there. Another trillion databases with their own query language. Another trillion SaaS products where the only way to query is to learn some random query DSL they made up.

I just want my SQL back. It’s a language everyone understands, it’s been around since the seventies, and it’s reasonably standardized. It’s easy to read, and can be used by anyone, from business people to engineers.

Instead, I have to learn a bunch of garbage query languages because everyone keeps trying to reinvent the wheel.

Take ORMs. Their alleged benefit is that they cut down development time. But instead of writing SQL, which everyone knows, I now have to scroll back and forth in some ORM documentation to figure out how to write my queries. On top of that, I have to spend time debugging why the ORM translated my query into some monstrosity that joins 17 tables using a full table scan. Instead of sticking to SQL, where it’s reasonably easy to reason about performance (try to stick to where clauses on indexed columns, don’t go bananas with joins, et cetera), I have to deal with this opaque translation layer that obscures the exact query. And I end up with bloated higher-level data classes rather than easy-to-understand tuples or dicts that contain the data in a dumb simple format that is trivial to introspect.
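To make that concrete, here is a minimal sketch of the plain-SQL alternative, using Python's built-in sqlite3 module. The `users` table, its columns, and the index name are made up for illustration, and SQLite is only a stand-in for whatever database you actually run. The query that executes is exactly the string in the code, the rows come back as plain dicts, and the database itself can be asked how it plans to run the query.

```python
import sqlite3

# In-memory database with a hypothetical "users" table (illustrative only).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows support access by column name
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, country TEXT)")
conn.execute("CREATE INDEX idx_users_country ON users (country)")
conn.executemany(
    "INSERT INTO users (email, country) VALUES (?, ?)",
    [("a@example.com", "SE"), ("b@example.com", "US"), ("c@example.com", "SE")],
)

QUERY = "SELECT id, email FROM users WHERE country = ? ORDER BY id"

# What runs is exactly the string above; there is no translation layer.
rows = [dict(r) for r in conn.execute(QUERY, ("SE",))]

# Ask the database how it intends to execute the query.
# The exact wording of the plan varies between SQLite versions.
plan = [r["detail"] for r in conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("SE",))]

print(rows)
print(plan)
```

If the where clause hits an indexed column, the plan should mention the index rather than a full scan, which is the kind of check that gets much harder once an ORM is generating the SQL for you.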

Not to mention there’s like five thousand ORMs out there, so instead of learning SQL once, I have to learn 34 different ORMs. It’s not like people learn an ORM instead of learning SQL anyway.

And all these SaaS products. Just to pick some tools from my company’s stack:

  • Splunk has SPL
  • Mixpanel has JQL
  • Rollbar has RQL
  • New Relic has NRQL
  • Adwords has AWQL

What’s worse than data silos? Data silos that invent their own query language.

To be fair, some of these are SQL flavors, or at least pretend to be, but they all have their own quirks that force me to unlearn everything I knew about SQL, to the point that it might as well be something completely different.

Then on top of that, every database seems to reinvent query languages. Mongo has its own terrible query language that I never understood. Lucene has its own query language. Etc.

What am I asking for? Not a whole lot. Just that:

  1. Every SaaS product should offer a plug-and-play thing so that I can copy all the data back into my own SQL-based database (in my case, Postgres/Redshift). I don’t want to use their custom-made DSL. Maybe the European Union can mandate this as the next step after their PSD2 open banking directive.
  2. There should be a 30 year moratorium on inventing new query languages.
  3. Let’s dispel the myth that ORMs make code cleaner. Join the embedded-SQL movement and discover a much more readable, much more straightforward way to query databases.
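As a rough sketch of what the first item could look like in practice: assume a vendor hands over an export with one JSON object per line. The export format and the field names below are invented for illustration and do not come from any real product, and SQLite stands in for Postgres/Redshift so the example runs anywhere.

```python
import json
import sqlite3

# Hypothetical export: one JSON object per line, as a vendor might dump events.
# The field names are made up for illustration.
EXPORT = """\
{"event": "signup", "user_id": 1, "ts": "2018-01-02T10:00:00"}
{"event": "login", "user_id": 1, "ts": "2018-01-02T10:05:00"}
{"event": "signup", "user_id": 2, "ts": "2018-01-03T09:30:00"}
"""

def load_events(conn, export_text):
    """Copy exported events into a local SQL table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (event TEXT, user_id INTEGER, ts TEXT)"
    )
    records = [json.loads(line) for line in export_text.splitlines() if line.strip()]
    conn.executemany(
        "INSERT INTO events (event, user_id, ts) VALUES (:event, :user_id, :ts)",
        records,
    )
    conn.commit()
    return len(records)

conn = sqlite3.connect(":memory:")  # SQLite stands in for Postgres/Redshift here
loaded = load_events(conn, EXPORT)

# Once the data is local, any question is just SQL.
signups_per_day = conn.execute(
    "SELECT substr(ts, 1, 10) AS day, COUNT(*) FROM events "
    "WHERE event = 'signup' GROUP BY day ORDER BY day"
).fetchall()

print(loaded, signups_per_day)
```

The loading step is the only product-specific part. Everything after it is ordinary SQL that anyone on the team can read.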

That’s it. I realize I sound like an old crank but that’s a risk I’ll take.

Addendum

This post got a fair amount of traffic so it must have resonated with a bunch of people. See the Hacker News discussion and the Reddit r/programming comments.

Erik Bernhardsson

... is the CTO at Better, which is a startup changing how mortgages are done. I write a lot of code, some of which ends up being open sourced, such as Luigi and Annoy. I also co-organize the NYC Machine Learning meetup. You can follow me on Twitter or see some more facts about me.