The Surprising Impact of Medium-Size Texts on PostgreSQL Performance

Why TOAST is the best thing since sliced bread


Any database schema is likely to have plenty of text fields. In this article, I divide text fields into three categories:

  1. Small texts: names, slugs, usernames, emails, etc. These are text fields that usually have some low size limit, maybe even using varchar(n) and not text.

  2. Large texts: blog post content, articles, HTML content, etc. These are large pieces of free, unrestricted text that are stored in the database.

  3. Medium texts: descriptions, comments, product reviews, stack traces, etc. These are text fields that fall between the small and the large. These types of texts would normally be unrestricted in length, but naturally smaller than the large texts.

In this article I demonstrate the surprising impact of medium-size texts on query performance in PostgreSQL.

Sliced bread... it gets better (Photo by Louise Lyshøj, https://unsplash.com/photos/WHJTaLqonkU)


Understanding TOAST

When talking about large chunks of text, or any other field that may contain large amounts of data, we first need to understand how the database handles the data. Intuitively, you might think that the database is storing large pieces of data inline like it does smaller pieces of data, but in fact, it does not:

PostgreSQL uses a fixed page size (commonly 8 kB), and does not allow tuples to span multiple pages. Therefore, it is not possible to store very large field values directly.

As the documentation explains, PostgreSQL can't split a row (tuple) across multiple pages. So how does the database store large chunks of data?

[...] large field values are compressed and/or broken up into multiple physical rows. [...] The technique is affectionately known as TOAST (or “the best thing since sliced bread”).

OK, so how does TOAST work exactly?

If any of the columns of a table are TOAST-able, the table will have an associated TOAST table

So TOAST is a separate table associated with our table. It is used to store large values of TOAST-able columns (the text data type, for example, is TOAST-able).

What constitutes a large value?

The TOAST management code is triggered only when a row value to be stored in a table is wider than TOAST_TUPLE_THRESHOLD bytes (normally 2 kB). The TOAST code will compress and/or move field values out-of-line until the row value is shorter than TOAST_TUPLE_TARGET bytes (also normally 2 kB, adjustable) or no more gains can be had

PostgreSQL will first try to compress the large values in the row, and if the row still can't fit within the limit, the values will be stored out-of-line in the TOAST table.

Finding the TOAST

Now that we have some understanding of what TOAST is, let's see it in action. First, create a table with a text field:

db=# CREATE TABLE toast_test (id SERIAL, value TEXT);
CREATE TABLE

The table contains an id column, and a value field of type TEXT. Notice that we did not change any of the default storage parameters.

The text field we added supports TOAST, or is TOAST-able, so PostgreSQL should create a TOAST table. Let's try to locate the TOAST table associated with the table toast_test in pg_class:

db=# SELECT relname, reltoastrelid FROM pg_class WHERE relname = 'toast_test';
  relname   │ reltoastrelid
────────────┼───────────────
 toast_test │        340488

db=# SELECT relname FROM pg_class WHERE oid = 340488;
     relname
─────────────────
 pg_toast_340484

As promised, PostgreSQL created a TOAST table called pg_toast_340484.
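
As a side note, we can also check which of the table's columns are TOAST-able by looking at their storage strategy in pg_attribute. In the attstorage column, 'p' stands for PLAIN (inline only, used by fixed-size types such as integer) and 'x' stands for EXTENDED, the default for text:

SELECT attname, attstorage
FROM pg_attribute
WHERE attrelid = 'toast_test'::regclass AND attnum > 0;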

TOAST in Action

Let's see what the TOAST table looks like:

db=# \d pg_toast.pg_toast_340484
TOAST table "pg_toast.pg_toast_340484"
   Column   │  Type
────────────┼─────────
 chunk_id   │ oid
 chunk_seq  │ integer
 chunk_data │ bytea

The TOAST table contains three columns:

  • chunk_id: A reference to a toasted value.
  • chunk_seq: The position of the chunk within its value.
  • chunk_data: The actual chunk data.

The TOAST table is subject to the same restrictions on inline values as "regular" tables. To overcome this restriction, large values are split into chunks that fit within the limit.

At this point the table is empty:

db=# SELECT * FROM pg_toast.pg_toast_340484;
 chunk_id │ chunk_seq │ chunk_data
──────────┼───────────┼────────────
(0 rows)

This makes sense because we did not insert any data yet. So next, insert a small value into the table:

db=# INSERT INTO toast_test (value) VALUES ('small value');
INSERT 0 1

db=# SELECT * FROM pg_toast.pg_toast_340484;
 chunk_id │ chunk_seq │ chunk_data
──────────┼───────────┼────────────
(0 rows)

After inserting the small value into the table, the TOAST table remained empty. This means the small value was small enough to be stored inline, and there was no need to move it out-of-line to the TOAST table.

1"small value"idvalue
Small text stored inline

Let's insert a large value and see what happens:

db=# INSERT INTO toast_test (value) VALUES ('n0cfPGZOCwzbHSMRaX8 ... WVIlRkylYishNyXf');
INSERT 0 1

I shortened the value for brevity, but that's a random string with 4096 characters. Let's see what the TOAST table stores now:

db=# SELECT * FROM pg_toast.pg_toast_340484;
 chunk_id │ chunk_seq │ chunk_data
──────────┼───────────┼──────────────────────
   995899 │         0 │ \x30636650475a4f43...
   995899 │         1 │ \x50714c3756303567...
   995899 │         2 │ \x6c78426358574534...
(3 rows)

The large value is stored out-of-line in the TOAST table. Because the value was too large to fit inline in a single row, PostgreSQL split it into three chunks. The \x3063... notation is how psql displays binary data.

1"small value"2213\x.....\x.....\x.....idvalue
Large text stored out-of-line, in the associated TOAST table

Finally, execute the following query to summarize the data in the TOAST table:

db=# SELECT chunk_id, COUNT(*) as chunks, pg_size_pretty(sum(octet_length(chunk_data)::bigint))
FROM pg_toast.pg_toast_340484 GROUP BY 1 ORDER BY 1;
 chunk_id │ chunks │ pg_size_pretty
──────────┼────────┼────────────────
   995899 │      3 │ 4096 bytes
(1 row)

As we've already seen, the text is stored in three chunks.

size of database objects

There are several ways to get the size of database objects in PostgreSQL:

  • pg_table_size: Get the size of the table including TOAST, but excluding indexes
  • pg_relation_size: Get the size of just the table
  • pg_total_relation_size: Get the size of the table, including indexes and TOAST

Another useful function is pg_size_pretty, which displays sizes in a human-friendly format.
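
For example, a quick way to compare the three functions on our toast_test table (the exact numbers depend on what you inserted):

SELECT
    pg_size_pretty(pg_relation_size('toast_test')) AS table_only,
    pg_size_pretty(pg_table_size('toast_test')) AS table_and_toast,
    pg_size_pretty(pg_total_relation_size('toast_test')) AS table_toast_and_indexes;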

TOAST Compression

So far I have refrained from categorizing texts by their exact size. The reason is that the size of the text itself does not matter; what matters is its size after compression.

To create long strings for testing, we'll implement a function that generates random strings of a given length:

CREATE OR REPLACE FUNCTION generate_random_string(
  length INTEGER,
  characters TEXT DEFAULT '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
) RETURNS TEXT AS
$$
DECLARE
  result TEXT := '';
BEGIN
  IF length < 1 THEN
      RAISE EXCEPTION 'Invalid length';
  END IF;
  -- Append one randomly picked character from `characters`, `length` times.
  FOR __ IN 1..length LOOP
    result := result || substr(characters, floor(random() * length(characters))::int + 1, 1);
  END LOOP;
  RETURN result;
END;
$$ LANGUAGE plpgsql;

Generate a string made out of 10 random characters:

db=# SELECT generate_random_string(10);
 generate_random_string
────────────────────────
 o0QsrMYRvp

We can also provide a set of characters to generate the random string from. For example, generate a string made of 10 random digits:

db=# SELECT generate_random_string(10, '1234567890');
 generate_random_string
────────────────────────
 4519991669

PostgreSQL TOAST uses the LZ family of compression techniques. Compression algorithms usually work by identifying and eliminating repetition in the value, so a long string made up of only a few distinct characters should compress much better than a string made up of many different characters.

To illustrate how TOAST uses compression, we'll clean out the toast_test table, and insert a random string made of many possible characters:

db=# TRUNCATE toast_test;
TRUNCATE TABLE

db=# INSERT INTO toast_test (value) VALUES (generate_random_string(1024 * 10));
INSERT 0 1

We inserted a 10kB value made of random characters. Let's check the TOAST table:

db=# SELECT chunk_id, COUNT(*) as chunks, pg_size_pretty(sum(octet_length(chunk_data)::bigint))
FROM pg_toast.pg_toast_340484 GROUP BY 1 ORDER BY 1;

 chunk_id │ chunks │ pg_size_pretty
──────────┼────────┼────────────────
  1495960 │      6 │ 10 kB

The value is stored out-of-line in the TOAST table, and we can see it is not compressed.

Next, insert a value with a similar length, but made out of fewer possible characters:

db=# INSERT INTO toast_test (value) VALUES (generate_random_string(1024 * 10, '123'));
INSERT 0 1

db=# SELECT chunk_id, COUNT(*) as chunks, pg_size_pretty(sum(octet_length(chunk_data)::bigint))
FROM pg_toast.pg_toast_340484 GROUP BY 1 ORDER BY 1;

 chunk_id │ chunks │ pg_size_pretty
──────────┼────────┼────────────────
  1495960 │      6 │ 10 kB
  1495961 │      2 │ 3067 bytes

We inserted a 10kB value, but this time it contained only 3 possible digits: 1, 2 and 3. This text is more likely to contain repeating patterns, and should compress better than the previous value. Looking at the TOAST table, we can see that PostgreSQL compressed the value to ~3kB, roughly a third of the uncompressed size. Not a bad compression ratio!

Finally, insert a 10K long string made of a single digit:

db=# insert into toast_test (value) values (generate_random_string(1024 * 10, '0'));
INSERT 0 1

db=# SELECT chunk_id, COUNT(*) as chunks, pg_size_pretty(sum(octet_length(chunk_data)::bigint))
FROM pg_toast.pg_toast_340484 GROUP BY 1 ORDER BY 1;

 chunk_id │ chunks │ pg_size_pretty
──────────┼────────┼────────────────
  1495960 │      6 │ 10 kB
  1495961 │      2 │ 3067 bytes

The string was compressed so well that the database was able to store it inline.
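
Another way to see the effect of compression on each of the values we inserted is to compare the uncompressed size of the text with the number of bytes it actually takes up in storage, using octet_length and pg_column_size:

-- Uncompressed size vs. the size actually used to store each value,
-- after compression and regardless of whether it was moved to TOAST.
SELECT
    id,
    octet_length(value) AS uncompressed_size,
    pg_column_size(value) AS stored_size
FROM toast_test
ORDER BY id;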

Configuring TOAST

If you are interested in configuring TOAST for a table, you can do that by setting storage parameters at CREATE TABLE or ALTER TABLE, and by setting a per-column strategy with ALTER TABLE ... ALTER COLUMN ... SET STORAGE. The relevant settings are:

  • toast_tuple_target: The minimum tuple length after which PostgreSQL tries to move long values to TOAST.
  • storage: The TOAST strategy. PostgreSQL supports four TOAST strategies: PLAIN, MAIN, EXTERNAL and EXTENDED. The default for TOAST-able types is EXTENDED, which means PostgreSQL will try to compress the value and store it out-of-line.

I personally never had to change the default TOAST storage parameters.
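
For reference, changing these settings looks roughly like this (toast_example is a hypothetical table used only to illustrate the syntax):

-- Lower the threshold for moving values out-of-line on a new table:
CREATE TABLE toast_example (id SERIAL, value TEXT) WITH (toast_tuple_target = 256);

-- Change the TOAST strategy of a single column, for example to EXTERNAL,
-- which allows out-of-line storage but disables compression:
ALTER TABLE toast_example ALTER COLUMN value SET STORAGE EXTERNAL;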


TOAST Performance

To understand the effect of different text sizes and out-of-line storage on performance, we'll create three tables, one for each type of text:

db=# CREATE TABLE toast_test_small (id SERIAL, value TEXT);
CREATE TABLE

db=# CREATE TABLE toast_test_medium (id SERIAL, value TEXT);
CREATE TABLE

db=# CREATE TABLE toast_test_large (id SERIAL, value TEXT);
CREATE TABLE

As in the previous section, PostgreSQL created a TOAST table for each of them:

SELECT
    c1.relname,
    c2.relname AS toast_relname
FROM
    pg_class c1
    JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
WHERE
    c1.relname LIKE 'toast_test%'
    AND c1.relkind = 'r';

      relname      │  toast_relname
───────────────────┼─────────────────
 toast_test_small  │ pg_toast_471571
 toast_test_medium │ pg_toast_471580
 toast_test_large  │ pg_toast_471589

Set Up Test Data

First, let's populate toast_test_small with 500K rows containing a small text that can be stored inline:

db=# INSERT INTO toast_test_small (value)
SELECT 'small value' FROM generate_series(1, 500000);
INSERT 0 500000

Next, populate toast_test_medium with 500K rows containing texts that are at the border of being stored out-of-line, but still small enough to be stored inline:

db=# WITH str AS (SELECT generate_random_string(1800) AS value)
INSERT INTO toast_test_medium (value)
SELECT value
FROM generate_series(1, 500000), str;
INSERT 0 500000

I experimented with different lengths until I got a value that is as large as possible while still being stored inline. The trick is to find a string of roughly 2kB that compresses very poorly.
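
To confirm that the medium values really ended up inline, we can check that the TOAST table associated with toast_test_medium (found above) is still empty:

SELECT count(*) FROM pg_toast.pg_toast_471580;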

Next, insert 500K rows with large texts to toast_test_large:

db=# WITH str AS (SELECT generate_random_string(4096) AS value)
INSERT INTO toast_test_large (value)
SELECT value
FROM generate_series(1, 500000), str;
INSERT 0 500000

We are now ready for the next step.

Comparing Performance

We usually expect queries on large tables to be slower than queries on smaller tables. In this case, it's not unreasonable to expect the query on the small table to run faster than on the medium table, and the query on the medium table to be faster than the same query on the large table.

To compare performance, we are going to execute a simple query to fetch one row from the table. Since we don't have an index, the database is going to perform a full table scan. We'll also disable parallel query execution to get a clean, simple timing, and execute the query multiple times to account for caching.

db=# SET max_parallel_workers_per_gather = 0;
SET

Starting with the small table:

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_small WHERE id = 6000;
                                    QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────
Seq Scan on toast_test_small (cost=0.00..8953.00 rows=1 width=16)
  Filter: (id = 6000)
  Rows Removed by Filter: 499999
Execution Time: 41.513 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_small WHERE id = 6000;
Execution Time: 25.865 ms

I ran the query multiple times and trimmed the output for brevity. As expected the database performed a full table scan, and the timing finally settled on ~25ms.

Next, execute the same query on the medium table:

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_medium WHERE id = 6000;
Execution Time: 321.965 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_medium WHERE id = 6000;
Execution Time: 173.058 ms

Running the exact same query on the medium table took significantly more time, 173ms, which is roughly 6x slower than on the smaller table. This makes sense.

To complete the test, run the query again on the large table:

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_large WHERE id = 6000;
Execution Time: 49.867 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_large WHERE id = 6000;
Execution Time: 37.291 ms

Well, this is surprising! The timing of the query on the large table is similar to that of the small table, and almost 5 times faster than the medium table.

 Table              │ Timing
────────────────────┼────────────
 toast_test_small   │ 31.323 ms
 toast_test_medium  │ 173.058 ms
 toast_test_large   │ 37.291 ms

Large tables are supposed to be slower, so what is going on?

Making Sense of the Results

To make sense of the results, have a look at the size of each table, and the size of its associated TOAST table:

SELECT
    c1.relname,
    pg_size_pretty(pg_relation_size(c1.relname::regclass)) AS size,
    c2.relname AS toast_relname,
    pg_size_pretty(pg_relation_size(('pg_toast.' || c2.relname)::regclass)) AS toast_size
FROM
    pg_class c1
    JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
WHERE
    c1.relname LIKE 'toast_test_%'
    AND c1.relkind = 'r';

      relname       │  size  │  toast_relname  │ toast_size
────────────────────┼────────┼─────────────────┼────────────
 toast_test_small   │ 21 MB  │ pg_toast_471571 │ 0 bytes
 toast_test_medium  │ 977 MB │ pg_toast_471580 │ 0 bytes
 toast_test_large   │ 25 MB  │ pg_toast_471589 │ 1953 MB

Let's break it down:

  • toast_test_small: The size of the table is 21MB, and its TOAST table is empty. This makes sense because the texts we inserted into that table were small enough to be stored inline.
Small texts stored inline

  • toast_test_medium: The table is significantly larger, 977MB. We inserted text values that were just small enough to be stored inline. As a result, the table got very big, and the TOAST table was not used at all.
Medium texts stored inline

  • toast_test_large: The size of the table is roughly similar to the size of the small table. This is because we inserted large texts into the table, and PostgreSQL stored them out-of-line in the TOAST table. This is why the TOAST table is so big for the large table, but the table itself remained small.
Large texts stored out-of-line, in the associated TOAST table

When we executed our query, the database did a full table scan. To scan the small and large tables, the database only had to read 21MB and 25MB respectively, and the queries were pretty fast. However, when we executed the query against the medium table, where all the texts are stored inline, the database had to read 977MB from disk, and the query took a lot longer.
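
If you want to see this IO directly, you can run the same queries with EXPLAIN (ANALYZE, BUFFERS), which also reports how many buffers (pages) each plan had to read:

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM toast_test_medium WHERE id = 6000;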

TAKE AWAY

TOAST is a great way of keeping tables compact by storing large values out-of-line!

Using the Text Values

In the previous comparison we executed a query that only used the ID, not the text value. What will happen when we actually need to access the text value itself?

db=# \timing
Timing is on.

db=# SELECT * FROM toast_test_large WHERE value LIKE 'foo%';
Time: 7509.900 ms (00:07.510)

db=# SELECT * FROM toast_test_large WHERE value LIKE 'foo%';
Time: 7290.925 ms (00:07.291)

db=# SELECT * FROM toast_test_medium WHERE value LIKE 'foo%';
Time: 5869.631 ms (00:05.870)

db=# SELECT * FROM toast_test_medium WHERE value LIKE 'foo%';
Time: 259.970 ms

db=# SELECT * FROM toast_test_small WHERE value LIKE 'foo%';
Time: 78.897 ms

db=# SELECT * FROM toast_test_small WHERE value LIKE 'foo%';
Time: 50.035 ms

We executed a query against all three tables to search for a string within the text value. The query is not expected to return any results, and is forced to scan the entire table. This time, the results are more consistent with what we would expect:

 Table              │ Cold cache  │ Warm cache
────────────────────┼─────────────┼─────────────
 toast_test_small   │ 78.897 ms   │ 50.035 ms
 toast_test_medium  │ 5869.631 ms │ 259.970 ms
 toast_test_large   │ 7509.900 ms │ 7290.925 ms

The larger the table, the longer it took the query to complete. This makes sense because to satisfy the query, the database was forced to read the texts as well. In the case of the large table, this means accessing the TOAST table as well.

What About Indexes?

Indexes help the database minimize the number of pages it needs to fetch to satisfy a query. For example, let's revisit the first test, where we searched for a single row by ID, but this time with an index on the field:

db=# CREATE INDEX toast_test_small_id_ix ON toast_test_small(id);
CREATE INDEX

db=# CREATE INDEX toast_test_medium_id_ix ON toast_test_medium(id);
CREATE INDEX

db=# CREATE INDEX toast_test_large_id_ix ON toast_test_large(id);
CREATE INDEX

Executing the exact same query as before with indexes on the tables:

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_small WHERE id = 6000;
                                QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────────
Index Scan using toast_test_small_id_ix on toast_test_small(cost=0.42..8.44 rows=1 width=16)
  Index Cond: (id = 6000)
Time: 0.772 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_medium WHERE id = 6000;
                                QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────────
Index Scan using toast_test_medium_id_ix on toast_test_medium(cost=0.42..8.44 rows=1 width=1808)
  Index Cond: (id = 6000)
Time: 0.831 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_large WHERE id = 6000;
                                QUERY PLAN
─────────────────────────────────────────────────────────────────────────────────────────────
Index Scan using toast_test_large_id_ix on toast_test_large(cost=0.42..8.44 rows=1 width=22)
  Index Cond: (id = 6000)
Time: 0.618 ms

In all three cases the index was used, and the performance is almost identical.

By now, we know that the trouble begins when the database has to do a lot of IO. So next, let's craft a query for which the database will choose to use the index, but which will still have to read a lot of data:

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_small WHERE id BETWEEN 0 AND 250000;
                                QUERY PLAN
───────────────────────────────────────────────────────────────────────────────────────────────
Index Scan using toast_test_small_id_ix on toast_test_small(cost=0.4..9086 rows=249513 width=16)
  Index Cond: ((id >= 0) AND (id <= 250000))
Time: 60.766 ms
db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_small WHERE id BETWEEN 0 AND 250000;
Time: 59.705 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_medium WHERE id BETWEEN 0 AND 250000;
Time: 3198.539 ms (00:03.199)
db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_medium WHERE id BETWEEN 0 AND 250000;
Time: 284.339 ms

db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_large WHERE id BETWEEN 0 AND 250000;
Time: 85.747 ms
db=# EXPLAIN (ANALYZE, TIMING) SELECT * FROM toast_test_large WHERE id BETWEEN 0 AND 250000;
Time: 70.364 ms

We executed a query that fetches half the rows in the table. This was a small enough portion of the table to make PostgreSQL decide to use the index, but still large enough to require lots of IO.

see also

To understand why using an index is not always the fastest plan, see my tip on avoiding indexes on columns with low selectivity.

We ran each query twice on each table. In all cases the database used the index to access the table. Keep in mind that the index only helps reduce the number of pages the database has to access, but in this case, the database still had to read half the table.

 Table              │ Cold cache  │ Warm cache
────────────────────┼─────────────┼────────────
 toast_test_small   │ 60.766 ms   │ 59.705 ms
 toast_test_medium  │ 3198.539 ms │ 284.339 ms
 toast_test_large   │ 85.747 ms   │ 70.364 ms

The results here are similar to the first test we ran. When the database had to read a large portion of the table, the medium table, where the texts are stored inline, was the slowest.

Possible Solutions

If, after reading this far, you are convinced that medium-size texts are causing your performance issues, there are a few things you can do.

Adjusting toast_tuple_target

toast_tuple_target is a storage parameter that controls the minimum tuple length after which PostgreSQL tries to move long values to TOAST. The default is 2kB, but it can be decreased to a minimum of 128 bytes. The lower the target, the better the chance that a medium-size string will be moved out-of-line to the TOAST table.

To demonstrate, create a table with the default storage params, and another with toast_tuple_target = 128:

db=# CREATE TABLE toast_test_default_threshold (id SERIAL, value TEXT);
CREATE TABLE

db=# CREATE TABLE toast_test_128_threshold (id SERIAL, value TEXT) WITH (toast_tuple_target=128);
CREATE TABLE

db=# SELECT c1.relname, c2.relname AS toast_relname
FROM pg_class c1 JOIN pg_class c2 ON c1.reltoastrelid = c2.oid
WHERE c1.relname LIKE 'toast%threshold' AND c1.relkind = 'r';

           relname            │  toast_relname
──────────────────────────────┼──────────────────
 toast_test_default_threshold │ pg_toast_3250167
 toast_test_128_threshold     │ pg_toast_3250176

Next, generate a value larger than 2kB that compresses to well below 2kB (but still more than 128 bytes), insert it into both tables, and check whether it was stored out-of-line:

db=# INSERT INTO toast_test_default_threshold (value) VALUES (generate_random_string(2100, '123'));
INSERT 0 1

db=# SELECT * FROM pg_toast.pg_toast_3250167;
 chunk_id │ chunk_seq │ chunk_data
──────────┼───────────┼────────────
(0 rows)

db=# INSERT INTO toast_test_128_threshold (value) VALUES (generate_random_string(2100, '123'));
INSERT 0 1

db=# SELECT * FROM pg_toast.pg_toast_3250176;
─[ RECORD 1 ]─────────────
chunk_id   │ 3250185
chunk_seq  │ 0
chunk_data │ \x3408.......

The (roughly) similar medium-size text was stored inline with the default params, and out-of-line with a lower toast_tuple_target.
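
Note that toast_tuple_target can also be adjusted on an existing table. As far as I know, the new setting only affects rows written after the change, so existing rows keep their current storage until they are updated or the table is rewritten:

ALTER TABLE toast_test_default_threshold SET (toast_tuple_target = 128);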

Create a Separate Table

If you have a critical table that stores medium-size text fields, and you notice that most texts are being stored inline and perhaps slowing down queries, you can move the column with the medium text field into its own table:

CREATE TABLE toast_test_value (fk INT, value TEXT);
CREATE TABLE toast_test (id SERIAL, value_id INT);
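
With this layout, queries that only touch the main table no longer have to wade through the text values, and you only pay for a join when the text is actually needed. A sketch of such a query, assuming toast_test_value.fk references toast_test.id:

-- Fetch the text only for the specific row we are interested in.
SELECT t.id, v.value
FROM toast_test t
JOIN toast_test_value v ON v.fk = t.id
WHERE t.id = 6000;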

In my previous article I demonstrated how we use SQL to find anomalies. In one of those use cases, we had a table of errors that contained a Python traceback. The error messages were medium texts, and many of them were stored inline; as a result the table got big very quickly! So big, in fact, that we noticed queries getting slower and slower. Eventually we moved the errors into a separate table, and things got much faster!


Conclusion

The main problem with medium-size texts is that they make the rows very wide. This is a problem because PostgreSQL, like other OLTP-oriented databases, stores data in rows. When we execute a query that needs only a few columns, the values of those columns are most likely spread across many blocks. If the rows are wide, this translates into a lot of IO, which affects query performance and resource usage.

To overcome this challenge, some non-OLTP databases use a different type of storage: columnar storage. With columnar storage, data is stored on disk by columns, not by rows. When the database has to scan a specific column, the values are read from consecutive blocks, which usually translates to less IO. Additionally, values within a single column are more likely to have repeating patterns, so they compress better.

Row vs. column storage

For non-OLTP workloads, such as data warehouse systems, this makes sense. The tables are usually very wide, and queries often use a small subset of the columns while reading a lot of rows. In OLTP workloads, the system usually reads one or very few rows, so storing data in rows makes more sense.

There has been chatter about pluggable storage in PostgreSQL, so this is something to look out for!



