Home Explore Blog CI



postgresql

3rd chunk of `doc/src/sgml/storage.sgml`
b07922f6c838e1ba5514b9b6d6d048e2696dea352524ecff0000000100000fa0
 <structname>pg_class</structname>.<structfield>relfilenode</structfield>. But
for temporary relations, the file name is of the form
<literal>t<replaceable>BBB</replaceable>_<replaceable>FFF</replaceable></literal>, where <replaceable>BBB</replaceable>
is the process number of the backend which created the file, and <replaceable>FFF</replaceable>
is the filenode number.  In either case, in addition to the main file (a/k/a
main fork), each table and index has a <firstterm>free space map</firstterm> (see <xref
linkend="storage-fsm"/>), which stores information about free space available in
the relation.  The free space map is stored in a file named with the filenode
number plus the suffix <literal>_fsm</literal>.  Tables also have a
<firstterm>visibility map</firstterm>, stored in a fork with the suffix <literal>_vm</literal>,
to track which pages are known to have no dead tuples.  The visibility map is
described further in <xref linkend="storage-vm"/>.  Unlogged tables and indexes
have a third fork, known as the initialization fork, which is stored in a fork
with the suffix <literal>_init</literal> (see <xref linkend="storage-init"/>).
</para>

<caution>
<para>
Note that while a table's filenode often matches its OID, this is
<emphasis>not</emphasis> necessarily the case; some operations, like
<command>TRUNCATE</command>, <command>REINDEX</command>, <command>CLUSTER</command> and some forms
of <command>ALTER TABLE</command>, can change the filenode while preserving the OID.
Avoid assuming that filenode and table OID are the same.
Also, for certain system catalogs including <structname>pg_class</structname> itself,
<structname>pg_class</structname>.<structfield>relfilenode</structfield> contains zero.  The
actual filenode number of these catalogs is stored in a lower-level data
structure, and can be obtained using the <function>pg_relation_filenode()</function>
function.
</para>
</caution>

<para>
When a table or index exceeds 1 GB, it is divided into gigabyte-sized
<firstterm>segments</firstterm>.  The first segment's file name is the same as the
filenode; subsequent segments are named filenode.1, filenode.2, etc.
This arrangement avoids problems on platforms that have file size limitations.
(Actually, 1 GB is just the default segment size.  The segment size can be
adjusted using the configuration option <option>--with-segsize</option>
when building <productname>PostgreSQL</productname>.)
In principle, free space map and visibility map forks could require multiple
segments as well, though this is unlikely to happen in practice.
</para>

<para>
A table that has columns with potentially large entries will have an
associated <firstterm>TOAST</firstterm> table, which is used for out-of-line storage of
field values that are too large to keep in the table rows proper.
<structname>pg_class</structname>.<structfield>reltoastrelid</structfield> links from a table to
its <acronym>TOAST</acronym> table, if any.
See <xref linkend="storage-toast"/> for more information.
</para>

<para>
The contents of tables and indexes are discussed further in
<xref linkend="storage-page-layout"/>.
</para>

<para>
Tablespaces make the scenario more complicated.  Each user-defined tablespace
has a symbolic link inside the <varname>PGDATA</varname><filename>/pg_tblspc</filename>
directory, which points to the physical tablespace directory (i.e., the
location specified in the tablespace's <command>CREATE TABLESPACE</command> command).
This symbolic link is named after
the tablespace's OID.  Inside the physical tablespace directory there is
a subdirectory with a name that depends on the <productname>PostgreSQL</productname>
server version, such as <literal>PG_9.0_201008051</literal>.  (The reason for using
this subdirectory is so that successive versions of the database can use
the same <command>CREATE TABLESPACE</command> location value without conflicts.)
Within the version-specific subdirectory, there is
a subdirectory for each database that has elements in

Title: Filenodes, Free Space Maps, Visibility Maps, and Table Segmentation
Summary
This section delves into the details of how tables and indexes are stored, including the use of filenodes, free space maps (FSM), visibility maps (VM), and initialization forks. It cautions that filenodes and OIDs are not always the same and explains how tables are segmented into gigabyte-sized chunks. It also introduces the concept of TOAST tables for out-of-line storage of large field values. Tablespaces add complexity, with symbolic links in pg_tblspc pointing to the physical tablespace directory.