Optimization and Block Selection in Sample Scans

method; it is either a hash derived from the <literal>REPEATABLE</literal>
value if one was given, or the result of <literal>random()</literal> if not.
</para>

<para>
 This function may adjust the fields <literal>node->use_bulkread</literal>
 and <literal>node->use_pagemode</literal>.
 If <literal>node->use_bulkread</literal> is <literal>true</literal>, which
 it is by default, the scan will use a buffer access strategy that
 encourages recycling buffers after use. It might be reasonable to set this
 to <literal>false</literal> if the scan will visit only a small fraction of
 the table's pages.
 If <literal>node->use_pagemode</literal> is <literal>true</literal>, which
 it is by default, the scan will perform visibility checking in a single
 pass for all tuples on each visited page. It might be reasonable to set
 this to <literal>false</literal> if the scan will select only a small
 fraction of the tuples on each visited page. That will result in fewer
 tuple visibility checks being performed, though each one will be more
 expensive because it will require more locking.
</para>

<para>
 If the sampling method is marked
 <literal>repeatable_across_scans</literal>, it must be able to select the
 same set of tuples during a rescan as it did originally; that is, a fresh
 call of <function>BeginSampleScan</function> must lead to selecting the
 same tuples as before (if the <literal>TABLESAMPLE</literal> parameters
 and seed don't change).
</para>

<para>
<programlisting>
BlockNumber
NextSampleBlock (SampleScanState *node, BlockNumber nblocks);
</programlisting>
 Returns the block number of the next page to be scanned, or
 <literal>InvalidBlockNumber</literal> if no pages remain to be scanned.
</para>

<para>
 This function can be omitted (set the pointer to NULL), in which case
 the core code will perform a sequential scan of the entire relation.
 Such a scan can use synchronized scanning, so that the sampling method
 cannot assume that the relation pages are visited in the same order on

This section discusses further optimization options within `BeginSampleScan`, specifically adjusting `use_bulkread` and `use_pagemode` based on the expected fraction of table pages and tuples to be scanned. It also covers the `repeatable_across_scans` property, ensuring consistent tuple selection across rescans. Finally, it introduces `NextSampleBlock`, a function that allows sampling methods to specify the next block to scan, offering control over the scanning order, or defaulting to a sequential scan if omitted.
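
The flag adjustments and the block-selection callback described above can be sketched in a few lines of C. This is a minimal illustration rather than PostgreSQL code. `SampleScanState` is reduced to a stand-in with only the fields used here so the block compiles on its own. `DemoSampler`, `demo_hash`, `demo_begin_sample_scan` and `demo_next_sample_block` are hypothetical names. The begin function takes plain fractions instead of the real callback's arguments, and the 1% and 25% thresholds are assumptions chosen for illustration. Selecting a block depends only on the block number and the seed, which is one way to get the same result on a rescan.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL definitions, so the sketch compiles alone. */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Reduced stand-in for SampleScanState: only the fields used below. */
typedef struct SampleScanState
{
    bool  use_bulkread;
    bool  use_pagemode;
    void *tsm_state;        /* private state of the sampling method */
} SampleScanState;

/* Hypothetical private state of the sampling method. */
typedef struct DemoSampler
{
    uint32_t    seed;       /* seed handed to the begin function */
    uint64_t    cutoff;     /* a block is selected if its hash < cutoff */
    BlockNumber nextblock;  /* first block not yet examined */
} DemoSampler;

/* Small deterministic integer mixer (an assumption, not PostgreSQL's hash). */
static uint32_t
demo_hash(uint32_t block, uint32_t seed)
{
    uint32_t h = block ^ (seed * 0x9E3779B9u);

    h ^= h >> 16;
    h *= 0x85EBCA6Bu;
    h ^= h >> 13;
    h *= 0xC2B2AE35u;
    h ^= h >> 16;
    return h;
}

/* Simplified begin step: reset the state and tune the two scan flags. */
void
demo_begin_sample_scan(SampleScanState *node, DemoSampler *sampler,
                       double block_fraction, double tuple_fraction,
                       uint32_t seed)
{
    /* Clamp to [0, 1] so the cutoff computation stays in range. */
    if (block_fraction < 0.0)
        block_fraction = 0.0;
    if (block_fraction > 1.0)
        block_fraction = 1.0;

    sampler->seed = seed;
    sampler->cutoff = (uint64_t) (block_fraction * 4294967296.0);
    sampler->nextblock = 0;
    node->tsm_state = sampler;

    /* Few pages visited: skip the buffer-recycling strategy (assumed 1%). */
    node->use_bulkread = (block_fraction >= 0.01);
    /* Few tuples taken per page: check visibility per tuple (assumed 25%). */
    node->use_pagemode = (tuple_fraction >= 0.25);
}

/* Return the next selected block, or InvalidBlockNumber when none remain. */
BlockNumber
demo_next_sample_block(SampleScanState *node, BlockNumber nblocks)
{
    DemoSampler *sampler = (DemoSampler *) node->tsm_state;
    BlockNumber  block;

    for (block = sampler->nextblock; block < nblocks; block++)
    {
        if ((uint64_t) demo_hash(block, sampler->seed) < sampler->cutoff)
        {
            sampler->nextblock = block + 1;
            return block;
        }
    }
    return InvalidBlockNumber;
}
```

Calling `demo_begin_sample_scan` again with the same fractions and seed resets `nextblock`, so a second pass over `demo_next_sample_block` yields the same block numbers in the same order. That is the behavior a method marked `repeatable_across_scans` has to provide.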