Expanded TOAST Pointers and Free Space Map (FSM)

as long as the pointer could exist, and there is no infrastructure to help with this.
</para>

<para>
Expanded <acronym>TOAST</acronym> pointers are useful for complex data types whose on-disk representation is not especially suited for computational purposes. As an example, the standard varlena representation of a <productname>PostgreSQL</productname> array includes dimensionality information, a nulls bitmap if there are any null elements, then the values of all the elements in order. When the element type itself is variable-length, the only way to find the <replaceable>N</replaceable>'th element is to scan through all the preceding elements. This representation is appropriate for on-disk storage because of its compactness, but for computations with the array it's much nicer to have an <quote>expanded</quote> or <quote>deconstructed</quote> representation in which all the element starting locations have been identified. The <acronym>TOAST</acronym> pointer mechanism supports this need by allowing a pass-by-reference Datum to point to either a standard varlena value (the on-disk representation) or a <acronym>TOAST</acronym> pointer that points to an expanded representation somewhere in memory. The details of this expanded representation are up to the data type, though it must have a standard header and meet the other API requirements given in <filename>src/include/utils/expandeddatum.h</filename>. C-level functions working with the data type can choose to handle either representation. Functions that do not know about the expanded representation, but simply apply <function>PG_DETOAST_DATUM</function> to their inputs, will automatically receive the traditional varlena representation; so support for an expanded representation can be introduced incrementally, one function at a time.
</para>

<para>
<acronym>TOAST</acronym> pointers to expanded values are further broken down into <firstterm>read-write</firstterm> and <firstterm>read-only</firstterm> pointers.
The pointed-to representation is the same either way, but a function that receives a read-write pointer is allowed to modify the referenced value in-place, whereas one that receives a read-only pointer must not; it must first create a copy if it wants to make a modified version of the value. This distinction and some associated conventions make it possible to avoid unnecessary copying of expanded values during query execution.
</para>

<para>
For all types of in-memory <acronym>TOAST</acronym> pointer, the <acronym>TOAST</acronym> management code ensures that no such pointer datum can accidentally get stored on disk. In-memory <acronym>TOAST</acronym> pointers are automatically expanded to normal in-line varlena values before storage, and are then possibly converted to on-disk <acronym>TOAST</acronym> pointers if the containing tuple would otherwise be too big.
</para>

</sect2>
</sect1>

<sect1 id="storage-fsm">
<title>Free Space Map</title>

<indexterm>
<primary>Free Space Map</primary>
</indexterm>
<indexterm><primary>FSM</primary><see>Free Space Map</see></indexterm>

<para>
Each heap and index relation, except for hash indexes, has a Free Space Map (<acronym>FSM</acronym>) to keep track of available space in the relation. It's stored alongside the main relation data in a separate relation fork, named after the filenode number of the relation, plus a <literal>_fsm</literal> suffix. For example, if the filenode of a relation is 12345, the <acronym>FSM</acronym> is stored in a file called <filename>12345_fsm</filename>, in the same directory as the main relation file.
</para>

<para>
The Free Space Map is organized as a tree of <acronym>FSM</acronym> pages. The bottom-level <acronym>FSM</acronym> pages store the free space available on each heap (or index) page, using one byte to represent each such page. The upper levels aggregate information from the lower levels.
</para>

<para>
Within each <acronym>FSM</acronym> page is a binary tree, stored in an array with

This section covers expanded TOAST pointers, which give complex data types an expanded (deconstructed) in-memory representation that is better suited to computation than the compact on-disk format. It distinguishes read-write from read-only pointers, a distinction that helps avoid unnecessary copying of expanded values during query execution, and explains that in-memory TOAST pointers are automatically expanded to in-line varlena values before storage. The discussion then turns to the Free Space Map (FSM), which tracks available space in every heap and index relation except hash indexes. The FSM is organized as a tree of pages and is stored in a separate relation fork, a file named after the relation's filenode number with an "_fsm" suffix.
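
The tree-in-an-array idea described for FSM pages can be illustrated with a small C sketch. This is a toy model, not PostgreSQL's implementation: the names ToyFsmPage, toy_fsm_init, toy_fsm_set and toy_fsm_search are hypothetical, the page holds only 8 leaf slots, and it assumes each parent node stores the larger of its two children's values (the text above only says that upper levels aggregate information from the lower levels). Each leaf is one byte describing the free space of one heap page.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen small for illustration only. */
#define LEAVES 8                  /* heap pages tracked by this toy FSM page */
#define NODES  (2 * LEAVES - 1)   /* nodes of a complete binary tree */

typedef struct {
    /* node[0] is the root; the last LEAVES entries are the leaves.
     * The children of node i live at 2*i + 1 and 2*i + 2. */
    uint8_t node[NODES];
} ToyFsmPage;

/* Start with every tracked page recorded as having no free space. */
static void toy_fsm_init(ToyFsmPage *p)
{
    memset(p->node, 0, sizeof p->node);
}

/* Record the one-byte free-space value of one heap page, then walk up
 * to the root so each parent again holds the max of its children
 * (the max rule is this sketch's assumption). */
static void toy_fsm_set(ToyFsmPage *p, int slot, uint8_t value)
{
    assert(slot >= 0 && slot < LEAVES);

    int i = (LEAVES - 1) + slot;  /* array index of the leaf */
    p->node[i] = value;

    while (i > 0) {
        i = (i - 1) / 2;          /* move to the parent */
        uint8_t left = p->node[2 * i + 1];
        uint8_t right = p->node[2 * i + 2];
        p->node[i] = (left > right) ? left : right;
    }
}

/* Return a slot whose value is at least `need`, or -1 if none exists.
 * The root answers the "is there any?" question in one comparison;
 * the descent then follows a child that is known to be large enough. */
static int toy_fsm_search(const ToyFsmPage *p, uint8_t need)
{
    if (p->node[0] < need)
        return -1;

    int i = 0;
    while (i < LEAVES - 1) {      /* still at an internal node */
        int left = 2 * i + 1;
        i = (p->node[left] >= need) ? left : left + 1;
    }
    return i - (LEAVES - 1);      /* convert leaf index back to slot */
}
```

With this layout, checking whether any tracked page has enough room costs one comparison at the root, and locating such a page takes a number of steps proportional to the tree height rather than to the number of pages. For example, after toy_fsm_set(&p, 3, 40) and toy_fsm_set(&p, 6, 200) on a freshly initialized page, a search for 100 lands on slot 6 and a search for 201 reports that no slot qualifies.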