TOAST Advantages and In-Memory Storage

to fetch only the required parts of the out-of-line value when it is not compressed. </para> </listitem> <listitem> <para> <literal>MAIN</literal> allows compression but not out-of-line storage. (Actually, out-of-line storage will still be performed for such columns, but only as a last resort when there is no other way to make the row small enough to fit on a page.) </para> </listitem> </itemizedlist> Each <acronym>TOAST</acronym>-able data type specifies a default strategy for columns of that data type, but the strategy for a given table column can be altered with <link linkend="sql-altertable"><command>ALTER TABLE ... SET STORAGE</command></link>. </para> <para> <symbol>TOAST_TUPLE_TARGET</symbol> can be adjusted for each table using <link linkend="sql-altertable"><command>ALTER TABLE ... SET (toast_tuple_target = N)</command></link> </para> <para> This scheme has a number of advantages compared to a more straightforward approach such as allowing row values to span pages. Assuming that queries are usually qualified by comparisons against relatively small key values, most of the work of the executor will be done using the main row entry. The big values of <acronym>TOAST</acronym>ed attributes will only be pulled out (if selected at all) at the time the result set is sent to the client. Thus, the main table is much smaller and more of its rows fit in the shared buffer cache than would be the case without any out-of-line storage. Sort sets shrink also, and sorts will more often be done entirely in memory. A little test showed that a table containing typical HTML pages and their URLs was stored in about half of the raw data size including the <acronym>TOAST</acronym> table, and that the main table contained only about 10% of the entire data (the URLs and some small HTML pages). There was no run time difference compared to an un-<acronym>TOAST</acronym>ed comparison table, in which all the HTML pages were cut down to 7 kB to fit. </para> </sect2> <sect2 id="storage-toast-inmemory"> <title>Out-of-Line, In-Memory TOAST Storage</title> <para> <acronym>TOAST</acronym> pointers can point to data that is not on disk, but is elsewhere in the memory of the current server process. Such pointers obviously cannot be long-lived, but they are nonetheless useful. There are currently two sub-cases: pointers to <firstterm>indirect</firstterm> data and pointers to <firstterm>expanded</firstterm> data. </para> <para> Indirect <acronym>TOAST</acronym> pointers simply point at a non-indirect varlena value stored somewhere in memory. This case was originally created merely as a proof of concept, but it is currently used during logical decoding to avoid possibly having to create physical tuples exceeding 1 GB (as pulling all out-of-line field values into the tuple might do). The case is of limited use since the creator of the pointer datum is entirely responsible that the referenced data survives for as long as the pointer could exist, and there is no infrastructure to help with this. </para> <para> Expanded <acronym>TOAST</acronym> pointers are useful for complex data types whose on-disk representation is not especially suited for computational purposes. As an example, the standard varlena representation of a <productname>PostgreSQL</productname> array includes dimensionality information, a nulls bitmap if there are any null elements, then the values of all the elements in order. When the element type itself is variable-length, the only way to find the <replaceable>N</replaceable>'th element is to scan through all the preceding elements. This representation is appropriate for on-disk storage because of its compactness, but for computations with the array it's much nicer to have an <quote>expanded</quote> or <quote>deconstructed</quote> representation in which all the element starting locations have been identified. The <acronym>TOAST</acronym> pointer mechanism supports this

This section discusses the advantages of the TOAST scheme, such as smaller main tables, more rows fitting in the shared buffer cache, and smaller sort sets. It also describes in-memory TOAST storage, including indirect pointers, which point to non-indirect varlena values in memory (used in logical decoding), and expanded pointers, useful for complex data types like arrays, where an expanded representation facilitates computation.