04/28/22

Taming SQL and ORMs with sqlc — go get it #001

Achieving something better than the best of both worlds with sqlc.

7 Min Read

Welcome to the very first post in go get it, our series on excellent Go packages and tools that deserve a spotlight. First out of the gate is a tool we're using extensively: sqlc.

At Encore we’re building the backend development engine that enables developers to escape the maze of complexity in building for the cloud. Static analysis — parsing code and deriving insights from that — is core to everything we do. So imagine our delight when we came across sqlc and learned that it’s reinventing the concept of ORMs by using static analysis on the structure of your database!

A graveyard littered with ORMs

ORMs (or "Object-Relational Mappers"), have a special place in my heart, not entirely unlike Brutus and Caesar: a dear friend who betrays you and leaves you to die a slow, painful death.

ORMs tend to have several alluring features:

  • Type safety! You don’t have to write stringly typed SQL: your typical ORM provides objects with methods that you can manipulate in a language-native way, resulting in compile-time errors if you use it incorrectly or pass the wrong type of value for a given column.
  • Auto-completion! Similarly, by providing a language-native API you can easily get auto-completion for the ORM’s query language.
  • Simplicity! ORMs usually provide a much simpler approach to writing queries, usually with several comparisons that show how much simpler queries are, with less boilerplate.

What’s not to like?!

In practice, though, ORMs tend to become “yet another query language”, eventually growing to the same level of complexity as SQL. And while the simple examples are simple to reason about, once you get into complex queries they’re suddenly not very simple.

Since your database will inevitably execute SQL, it’s critical to know what the SQL translation for a particular query looks like, and with complex queries that’s often not at all obvious. So you inevitably end up thinking “I need to write an ORM expression that yields a query that looks like WITH heeelp SELECT * FROM im_trapped RIGHT OUTER JOIN in_the RANK() OVER database” or, in reality, some arcane incantation 100x more complicated.

When you start out thinking in SQL and trying to figure out how to reverse-engineer your ORM query to yield the SQL you want, the game’s over. Who’s the winner, you ask? We’re all losers. The database speaks SQL, so we’re forced to think in SQL, but then we invent a whole different language that kinda resembles SQL except not really so we get compile-time type safety. It’s a shit deal either way you spin it.

Introducing sqlc

sqlc lets you have your bottle of wine (or Kool-Aid if you prefer) and drink it too. It neatly side-steps the ORM dilemma by letting you write SQL but still giving you the all the benefits of the ORM. How? ~Magic~ Static Analysis!

sqlc lets you write queries in a plain .sql file. Alongside each query you give it a name and specify what it returns: no rows, one row, or many rows. For example:

-- name: GetAuthor :one SELECT * FROM authors WHERE id = $1; -- name: ListAuthors :many SELECT * FROM authors ORDER BY id; From this sqlc generates a function for each query (with the same name as the query). Each query parameter (`$1` in the example above) becomes an argument to the function. The above might generate: type Author struct { ID int Bio string BirthYear int } const getAuthor = `-- name: GetAuthor :one SELECT id, bio, birth_year FROM authors WHERE id = $1 ` func (q *Queries) GetAuthor(ctx context.Context, id int) (Author, error) { row := q.db.QueryRowContext(ctx, getAuthor, id) var i Author err := row.Scan(&i.ID, &i.Bio, &i.BirthYear) return i, err }

This is exactly the boilerplate you would write yourself if you were writing SQL by hand!

But if you look a bit closer you might be thinking I made a mistake. The original query said SELECT * FROM authors but the generated code lists SELECT id, bio, birth_year. Where does sqlc get these column names from? That, my friend, is what makes sqlc a package worth getting.

What makes sqlc special is that it understands your database structure, and uses that understanding to validate the SQL you write. So while it may look like regular, stringly-typed SQL, it’s actually being validated against the actual database table. If you have a typo in a column name, sqlc will give you a compile-time error. That’s incredibly valuable , and single-handedly solves one of the big drawbacks of SQL compared to ORMs.

You might think sqlc understands the database structure by connecting to your database. You might also think something else entirely, in which case you’re less likely to be completely wrong. sqlc works by parsing your migration files. When you’re developing a backend application over time you make incremental changes to the database schema, using ALTER TABLE and similar queries (called “DDL” or “Data Definition Language” if you want to wield some 80’s jargon like it’s still cool).

sqlc understands these DML queries and parses them to incrementally update its understanding of your database schema, representing the way your database changes incrementally over time. As a result you can just write a migration file and then immediately re-generate your type-safe database queries with sqlc. It’s an incredibly simple yet effective workflow. We use it internally to build the Encore Platform and it’s been a massive success.

Drawbacks

As with all engineering decisions, nothing is a free dinner (unless you’re athletic, in the mood for some dining and dashing, and have no problem flirting with the left side of the law) and it’s important to be aware of the drawbacks in order to make good decisions. sqlc is no different.

In our time with sqlc so far we’ve come across two main drawbacks to be aware of.

First of all, its static analysis isn’t perfect. Sometimes you have to manually specify the type of an expression or column (using $1::text or similar to cast it). Sometimes you make a mistake and its compile-time type checking doesn’t catch it.

While I’ve mostly written this post with the perspective that SQL is better than ORMs, make no imistake — SQL is a pretty awful language. It’s incredibly complex, inconsistent, and more or less impossible to parse. In fact it’s so difficult to parse that under the hood sqlc uses PostgreSQL’s own parsing library — because writing your own parser is a faster path to insanity than locking yourself in a room with baby pigs masquerading as clowns. Or something.

So it’s no surprise that sqlc doesn’t catch all possible type-checking issues with SQL, but it’s a drawback to keep in mind.

The second drawback is that SQL queries, especially more complex ones, contain a lot of nuanced semantics, that can affect and be affected by the surrounding code. By using sqlc you’re moving the precise SQL from being inlined in the surrounding code to living in a separate file, and leaving in its place a function call with a descriptive name like GetAuthor.

For simple queries this is a no-brainer, but for more complex ones you have to be careful with naming your queries to ensure the semantics of your query are fully described by the name. If they aren’t you can end up thinking a query does something different to what it actually does, which is difficult to catch in code review.

Fortunately sqlc supports documentation comments which helps explain what a more complex query does.

Conclusion

If you’re feeling stuck, like us, in the uncanny valley of despair between SQL — with its stringly-typed nature and large amounts of boilerplate — and ORMs that somehow manage to make things even worse, sqlc can help!

It really is an amazing tool that manages to take a “lose-lose” situation and turns it into something wonderful. SQL is nobody’s favorite (I think? hope?), but at the same time it’s the best we’ve got. It’s better to embrace it and solve its problems, which sqlc does beautifully.

Check back next week for go get it #002!

Updated 2022-05-01: Corrected "Data Manipulation Language" to "Data Definition Language". Thanks Marios Nikolaou for the feedback.

Encore

This blog is presented by Encore, the development platform for building robust type-safe distributed systems with declarative infrastructure.

Like this article?
Get future ones straight to your mailbox.

You can unsubscribe at any time.