Home Explore Blog CI



postgresql

6th chunk of `doc/src/sgml/storage.sgml`
543281b35c8639159d9ecfaef6feef46cf9df665a5d47ff50000000100000fa5
 is significant compared to short values.
As a special case, if the remaining bits of a single-byte header are all
zero (which would be impossible for a self-inclusive length), the value is
a pointer to out-of-line data, with several possible alternatives as
described below.  The type and size of such a <firstterm>TOAST pointer</firstterm>
are determined by a code stored in the second byte of the datum.
Lastly, when the highest-order or lowest-order bit is clear but the adjacent
bit is set, the content of the datum has been compressed and must be
decompressed before use.  In this case the remaining bits of the four-byte
length word give the total size of the compressed datum, not the
original data.  Note that compression is also possible for out-of-line data
but the varlena header does not tell whether it has occurred &mdash;
the content of the <acronym>TOAST</acronym> pointer tells that, instead.
</para>

<para>
The compression technique used for either in-line or out-of-line compressed
data can be selected for each column by setting
the <literal>COMPRESSION</literal> column option in <command>CREATE
TABLE</command> or <command>ALTER TABLE</command>.  The default for columns
with no explicit setting is to consult the
<xref linkend="guc-default-toast-compression"/> parameter at the time data is
inserted.
</para>

<para>
As mentioned, there are multiple types of <acronym>TOAST</acronym> pointer datums.
The oldest and most common type is a pointer to out-of-line data stored in
a <firstterm><acronym>TOAST</acronym> table</firstterm> that is separate from, but
associated with, the table containing the <acronym>TOAST</acronym> pointer datum
itself.  These <firstterm>on-disk</firstterm> pointer datums are created by the
<acronym>TOAST</acronym> management code (in <filename>access/common/toast_internals.c</filename>)
when a tuple to be stored on disk is too large to be stored as-is.
Further details appear in <xref linkend="storage-toast-ondisk"/>.
Alternatively, a <acronym>TOAST</acronym> pointer datum can contain a pointer to
out-of-line data that appears elsewhere in memory.  Such datums are
necessarily short-lived, and will never appear on-disk, but they are very
useful for avoiding copying and redundant processing of large data values.
Further details appear in <xref linkend="storage-toast-inmemory"/>.
</para>

<sect2 id="storage-toast-ondisk">
 <title>Out-of-Line, On-Disk TOAST Storage</title>

<para>
If any of the columns of a table are <acronym>TOAST</acronym>-able, the table will
have an associated <acronym>TOAST</acronym> table, whose OID is stored in the table's
<structname>pg_class</structname>.<structfield>reltoastrelid</structfield> entry.  On-disk
<acronym>TOAST</acronym>ed values are kept in the <acronym>TOAST</acronym> table, as
described in more detail below.
</para>

<para>
Out-of-line values are divided (after compression if used) into chunks of at
most <symbol>TOAST_MAX_CHUNK_SIZE</symbol> bytes (by default this value is chosen
so that four chunk rows will fit on a page, making it about 2000 bytes).
Each chunk is stored as a separate row in the <acronym>TOAST</acronym> table
belonging to the owning table.  Every
<acronym>TOAST</acronym> table has the columns <structfield>chunk_id</structfield> (an OID
identifying the particular <acronym>TOAST</acronym>ed value),
<structfield>chunk_seq</structfield> (a sequence number for the chunk within its value),
and <structfield>chunk_data</structfield> (the actual data of the chunk).  A unique index
on <structfield>chunk_id</structfield> and <structfield>chunk_seq</structfield> provides fast
retrieval of the values.  A pointer datum representing an out-of-line on-disk
<acronym>TOAST</acronym>ed value therefore needs to store the OID of the
<acronym>TOAST</acronym> table in which to look and the OID of the specific value
(its <structfield>chunk_id</structfield>).  For convenience, pointer datums also store the
logical datum size (original uncompressed data length), physical stored

Title: TOAST Compression, Pointers, and On-Disk Storage
Summary
This section discusses how compression is applied to TOAST data and can be configured at the column level. It details the types of TOAST pointers, including those pointing to out-of-line data in a separate TOAST table for on-disk storage and those pointing to data elsewhere in memory. It then describes how out-of-line, on-disk TOASTed values are stored in chunks within the associated TOAST table, explaining the purpose of the chunk_id, chunk_seq, and chunk_data columns. The TOAST pointer stores the OID of the TOAST table and the specific value's chunk_id.