Backward compatibility is straightforward. You have full control over new code and you have full knowledge of past data and APIs. Forward compatibility is more challenging. You have full control over new code, but you don't know how data is going to change in the future, and what types of API you're going to have to support.
There are many best practices for maintaining backward and forward compatibility in application code, but it's not very commonly mentioned in relation to SQL. SQL is used to produce critical business information for applications and decision-making, so there's no reason it shouldn't benefit from similar practices.
In this article, I present one simple way to future-proof SQL.
Table of Contents
A Simple Payment System
Say you have a payment system where your customers can charge their customers for products. The table can look like this:
db=# CREATE TABLE payment (
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
method TEXT NOT NULL
CONSTRAINT payment_method_check CHECK (method IN ('credit_card', 'cash')),
amount INT NOT NULL
);
CREATE TABLE
You provide your users with two payment options, cash or credit card:
db=# INSERT INTO payment (method, amount) VALUES
('cash', 10000),
('credit_card', 12000),
('credit_card', 5000);
INSERT 0 3
db=# SELECT * FROM payment;
id │ method │ amount
────┼─────────────┼────────
1 │ cash │ 10000
2 │ credit_card │ 12000
3 │ credit_card │ 5000
(3 rows)
Calculating the commission
To charge for your service, you use the following query to calculate the commission for each payment based on the payment method:
-- calculate_commission.sql
SELECT
COUNT(*) AS payments,
SUM(
CASE method
WHEN 'cash' THEN 100
WHEN 'credit_card' THEN 30 + amount * 0.02
END
) AS commission
FROM
payment;
For cash payments you charge a flat fee of 1$ (100 cents), and for credit card payments you charge a flat fee of 30 cents plus 2% of the charged amount.
This is the commission for the first 3 payment processes:
db=# \i calculate_commission.sql
payments │ commission
───────────┼────────────
3 │ 500.00
(1 row)
Congrats! You just made your first 5$.
Adding a new payment method
Time goes by and your payment system is becoming a real hit! Demand for your service is skyrocketing and your customers ask for more payment methods. You give it some careful thought and decide to introduce a new payment method - a bank transfer:
db=# ALTER TABLE payment DROP CONSTRAINT payment_method_check;
ALTER TABLE
db=# ALTER TABLE payment ADD CONSTRAINT payment_method_check
CHECK (method IN ('credit_card', 'cash', 'bank_transfer'));
ALTER TABLE
A few more months go by and the new payment method is proving to be a real hit:
INSERT INTO payment (method, amount) VALUES
('bank_transfer', 9000),
('bank_transfer', 15000),
('bank_transfer', 30000);
INSERT 0 3
You process more payments than you ever imagined, but something is off:
db=# \i calculate_commission.sql
payments │ commission
──────────┼────────────
6 │ 500.00
(1 row)
You process all these payments but your revenue stays the same, how come?
Future Proofing SQL
When you added the new payment method you did not edit the query that calculates the commission. The query never failed, no exception or warning was raised and you completely forgot about it!
This type of scenario is pretty common. SQL is usually not statically checked, so unless you have automated tests for this specific query, it can easily go unnoticed!
Failing on purpose
Errors have a bad reputation, but in fact, they are pretty useful. If the query threw an error when it encountered an unknown payment method, you could have caught this mistake and fixed it immediately.
Recall the query to calculate the commission:
SELECT
COUNT(*) AS payments,
SUM(
CASE method
WHEN 'cash' THEN 100
WHEN 'credit_card' THEN 30 + amount * 0.02
END
) AS commission
FROM
payment;
The query uses a CASE
expression to calculate the commission for each payment method. The expression is not defining what should happen when the method does not match any of the WHEN
expressions, so the expression implicitly evaluates to NULL
, and the aggregate function ignores it.
What if instead of implicitly evalauting to NULL
we trigger an error?
Assert never in SQL
To trigger an error in PostgreSQL we can write a simple function:
CREATE OR REPLACE FUNCTION assert_never(v anyelement)
RETURNS anyelement
LANGUAGE plpgsql AS
$$
BEGIN
RAISE EXCEPTION 'Unhandled value "%"', v;
END;
$$;
The function accepts an argument of any type, and raises an exception:
db=# SELECT assert_never(1);
ERROR: Unhandled value "1"
CONTEXT: PL/pgSQL function assert_never(anyelement) line 3 at RAISE
To trigger an error when the query encounter an unknown value, we can call it when the expression reaches the ELSE
part:
db=# SELECT
COUNT(*) AS payments,
SUM(
CASE method
WHEN 'cash' THEN 100
WHEN 'credit_card' THEN 30 + amount * 0.02
ELSE assert_never(method)::int
END
) AS commission
FROM
payment;
ERROR: Unhandled value "bank_transfer"
CONTEXT: PL/pgSQL function assert_never(anyelement) line 3 at RAISE
This is great! The query encountered the unhandled payment method bank_transfer
, and failed. To error also includes the values we forgot to handle, which makes it especially useful for debugging.
The error forces the developer to handle the exception in one of the following ways:
- Explicitly exclude the unhandled value:
SELECT
COUNT(*) AS payments,
SUM(
CASE method
WHEN 'cash' THEN 100
WHEN 'credit_card' THEN 30 + amount * 0.02
ELSE assert_never(method)::int
END
) AS commission
FROM
payment
WHERE
method IN ('cash', 'credit_card');
payments │ commission
──────────┼────────────
3 │ 500.00
The developer can decide to exclude this value explicitly. Maybe it's not relevant, maybe it's being handled by a different query. Either way, the value is now excluded explicitly and not simply overlooked.
- Handle the new value:
SELECT
COUNT(*) AS payments,
SUM(
CASE method
WHEN 'cash' THEN 100
WHEN 'credit_card' THEN 30 + amount * 0.02
WHEN 'bank_transfer' THEN 50
ELSE assert_never(method)::int
END
) AS commission
FROM
payment;
payments │ commission
──────────┼────────────
6 │ 650.00
The developer spotted the mistake and added the commission for the unhandled payment method to the query. Mistake averted!
In both cases, the results are now accurate, and the query is safer.
Assert never
Exhaustive checking is a common pattern in many languages to make sure all possible values are handled. I wrote about exhaustive checking in Python in the past, where I demonstrated how to implement a similar function named assert_never
in Python.
Fortunately, since the article has published, the function assert_never
found its way into the built-in typing module in Python 3.11, and it can be used to perform exhaustive checking:
from typing import assert_never, Literal
def calculate_commission(
method: Literal['cash', 'credit_card', 'bank_transfer'],
amount: int,
) -> float:
if method == 'cash':
return 100
elif method == 'credit_card':
return 30 + 0.02 * amount
else:
assert_never(method)
Running this code in Mypy, an optional static type checker for Python, will produce the following error:
error: Argument 1 to "assert_never" has incompatible type "Literal['bank_transfer']";
expected "NoReturn"
Just like the assert_never
function in SQL, the error warns about an unhandled value "bank_transfer". Unlike the function in SQL, this won't fail in run-time, but during static analysis.
Failing Without a Function
If for some reason you can't or don't want to use functions, there are other ways to trigger errors in SQL.
Abusing division-by-zero
The go-to way to trigger errors in any programing language is to divide some number by zero:
SELECT
COUNT(*) AS payments,
SUM(
CASE method
WHEN 'cash' THEN 100
WHEN 'credit_card' THEN 30 + amount * 0.02
ELSE 1/0 -- intentional
END
) AS commission
FROM
payment;
ERROR: division by zero
Instead of returning NULL
when the method is not handled, we divide 1 by 0 to trigger a zero division error. The query failed as we wanted, but this is not working as we might expect.
Consider the following scenario where all the possible payment methods are handled:
SELECT
COUNT(*) AS payments,
SUM(
CASE method
WHEN 'cash' THEN 100
WHEN 'credit_card' THEN 30 + amount * 0.02
WHEN 'bank_transfer' THEN 50
ELSE 1/0 -- fail on purpose
END
) AS commission
FROM
payment;
ERROR: division by zero
This query handled all possible payment methods but it still failed - this is no good. If we look at the documentation for CASE
, it is clear why:
there are various situations in which subexpressions of an expression are evaluated at different times, so that the principle that “CASE evaluates only necessary subexpressions” is not ironclad. For example a constant 1/0 subexpression will usually result in a division-by-zero failure at planning time, even if it's within a CASE arm that would never be entered at run time.
The documentation explains it well. While CASE
normally evaluates only the necessary expressions, there are cases where expressions using only constants, such as 1/0
, are evaluated at planning time. This is why the query failed even though the database did not have to evaluate the expression in the ELSE
clause.
Abusing cast
Another popular genre of errors is casting errors. Let's try to trigger an error by converting a value to an incompatible type:
SELECT
COUNT(*) AS payments,
SUM(
CASE method
WHEN 'cash' THEN 100
WHEN 'credit_card' THEN 30 + amount * 0.02
ELSE method::int
END
) AS commission
FROM
payment;
ERROR: invalid input syntax for type integer: "bank_transfer"
We attempt to cast the text value in the column method
to an integer, and the query failed. As a bonus, the error message provides us with the bad value, "bank_transfer"
, which makes it easy to identify the unhandled value.
Let's also check that the query is not failing when all methods are handled:
SELECT
COUNT(*) AS payments,
SUM(
CASE method
WHEN 'cash' THEN 100
WHEN 'credit_card' THEN 30 + amount * 0.02
WHEN 'bank_transfer' THEN 50
ELSE method::int
END
) AS commission
FROM
payment;
payments │ commission
──────────┼────────────
6 │ 650.00
When the query handles all the possible values for method
, it does not fail!
Abusing cast for none-text types
If you use this technique long enough you'll find that triggering a casting error requires some creativity. Triggering a casting error for textual values like the above is usually easier - just cast to integer and most chances are it will fail.
However, if you have an integer type, what type would you cast it to to trigger an error? This is what I came up with after some time:
SELECT
CASE n
WHEN 1 THEN 'one'
WHEN 2 THEN 'two'
ELSE ('Unhandled value ' || n)::int::text
END as v
FROM (VALUES
(1),
(2),
(3)
) AS t(n);
ERROR: invalid input syntax for type integer: "Unhandled value 3"
It's not as elegant, but it gets the job done. We triggered an error, and we get a useful error message we can act on.