Common table expressions (CTE), also known as the WITH clause, are a very useful feature. They help break down big queries into smaller pieces which makes it easier to read and understand.
PostgreSQL Version
This article is intended for PostgreSQL versions 11 and prior. Starting at version 12, PostgreSQL changed the way it treats CTE to prevent the issues described in this article.
What's So Dangerous?
Let's create a sample table with two columns and an index, and populate it with 1M random rows:
haki=# create table foo (id int, padding text);
CREATE TABLE
haki=# insert into foo (id, padding)
select id, md5(random()::text)
from generate_series(1, 1000000) as id
order by random();
INSERT 0 1000000
haki=# create index foo_id_ix on foo (id);
CREATE INDEX
haki=# analyze foo;
ANALYZE
To illustrate the problem with CTE, let's execute a simple query to fetch a single record from the table. First without using a CTE and than using a CTE:
haki=# select * from foo where id = 500000;
id | padding
------------------------------------------
500000 | b292eb19f3145fb087648d5956dfa44e
Time: 0.619 ms
haki=# with cte as (select * from foo) select * from cte where id = 500000;
id | padding
------------------------------------------
500000 | b292eb19f3145fb087648d5956dfa44e
Time: 227.675 ms
The first query took 0.619 ms while the second one took almost 300 times more, 227 ms. Why is that?
A lesser known fact about CTE in PostgreSQL is that the database will evaluate the query inside the CTE and store the results.
From the docs:
A useful property of WITH queries is that they are evaluated only once per execution of the parent query, even if they are referred to more than once by the parent query or sibling WITH queries. Thus, expensive calculations that are needed in multiple places can be placed within a WITH query to avoid redundant work. Another possible application is to prevent unwanted multiple evaluations of functions with side-effects
This sounds great, using CTE we can "cache" expensive calculations that are reused multiple times in the query, BUT...
However, the other side of this coin is that the optimizer is less able to push restrictions from the parent query down into a WITH query than an ordinary subquery.
Going back to the queries above, let's take a look at the execution plans:
haki=# explain (analyze on, timing on) select * from foo where id = 500000;
QUERY PLAN
------------------------------
Index Scan using foo_id_ix on foo (cost=0.42..8.44 rows=1 width=37)
(actual time=0.026..0.028 rows=1 loops=1)
Index Cond: (id = 500000)
Execution time: 0.060 ms
In the simple query without the CTE PostgreSQL used the index on the ID field to quickly locate the desired record. Simple and fast.
The execution plan using the CTE is a bit different:
haki=# explain (analyze on, timing on)
with cte as (
select * from foo
)
select * from cte where id = 500000;
QUERY PLAN
------------------------------
CTE Scan on cte (cost=18334.00..40834.00 rows=5000 width=36)
(actual time=3.243..269.290 rows=1 loops=1)
Filter: (id = 500000)
Rows Removed by Filter: 999999
CTE cte
-> Seq Scan on foo (cost=0.00..18334.00 rows=1000000 width=37)
(actual time=0.029..77.078 rows=1000000 loops=1)
Execution time: 276.625 ms
PostgreSQL materialized the CTE, meaning, it created a temporary structure with the results of the query defined in the CTE, and only then applied the filter to it. Because the predicate was not applied on the table (but the CTE) PostgreSQL was unable to utilize the index on the ID column.
The overall cost of the second query is significantly higher than the first one. It's essentially equivalent to two full table scans plus extra memory to store the CTE result.
A possible alternative to CTE is a subquery. Let's see how the execution plan looks like when we inline the CTE as a subquery:
haki=# explain (analyze on, timing on)
select * from (select * from foo) as subquery where id = 500000;
QUERY PLAN
------------------------------
Index Scan using foo_id_ix on foo (cost=0.42..8.44 rows=1 width=37)
(actual time=0.028..0.031 rows=1 loops=1)
Index Cond: (id = 500000)
Execution time: 0.066 ms
The execution plan using the subquery is similar to the simple query without the CTE. PostgreSQL was smart enough to apply the predicate id = 500000
in the subquery and utilize the index.
Is it Possible to Prevent PostgreSQL From Materializing a CTE?
The short answer is not that I know of.
To illustrate the difference let's look at how Oracle behaves under similar circumstances (setup can be found here):
> SELECT * FROM foo WHERE id = 500000;
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 21 |
| 1 | TABLE ACCESS BY INDEX ROWID | FOO | 1 | 21 |
| * 2 | INDEX RANGE SCAN | FOO_ID_IX | 37 | 1 |
-----------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("ID"=500000)
> WITH cte AS (
SELECT * FROM foo
)
SELECT * FROM cte WHERE id = 500000;
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Cost |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 21 |
| 1 | TABLE ACCESS BY INDEX ROWID | FOO | 1 | 21 |
| * 2 | INDEX RANGE SCAN | FOO_ID_IX | 37 | 1 |
-----------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("FOO"."ID"=500000)
Unlike PostgreSQL, Oracle is not materializing CTEs by default and the two queries generate the same execution plan.
There is however, an undocumented hint in Oracle that can be used to force it to materialize the CTE the same way PostgreSQL does:
> WITH cte AS (
SELECT /*+ MATERIALIZE */ * FROM foo
)
SELECT * FROM cte WHERE id = 500000;
--------------------------------------------------------------
| Id | Operation | Name | Rows | Cost |
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9308 | 46 |
| 1 | TEMP TABLE TRANSFORMATION | | | |
| 2 | LOAD AS SELECT | SYS_TEMP | | |
| 3 | TABLE ACCESS FULL | FOO | 9308 | 22 |
| * 4 | VIEW | | 9308 | 24 |
| 5 | TABLE ACCESS FULL | SYS_TEMP | 9308 | 24 |
--------------------------------------------------------------
Predicate Information (identified by operation id):
------------------------------------------
* 4 - filter("ID"=500000)
In the execution plan we can see that Oracle created an in-memory temp table to store the results of the CTE. The runtime and the memory usage are significantly higher.
Predicate Push Down and CTE Inlining
The behaviour illustrated above is often referred to as "push predicate", "predicate push down" or "CTE inlining".
Predicate push down means that the query optimizer can move predicates around based on logical rules in order generate better execution plans.
CTE inlining is when the query optimizer decides to inline a CTE as a subquery which, as we've seen above, makes it possible to push the predicate. PostgreSQL is not inlining CTEs.
Discussions about the original decision to not inline CTEs and newer requests to change this behavior can be found in the PostgreSQL mailing lists.
Should I Stop Using CTE?
No! CTE are awesome and very useful. It's just important to be aware, especially in PostgreSQL, that CTEs are materialized. I found many cases where easy performance gains were achieved simply by inlining CTEs.