Initialization and Startup of Sample Scans

The <structname>SampleScanState</structname> node has already been created, but its <structfield>tsm_state</structfield> field is NULL. The <function>InitSampleScan</function> function can palloc whatever internal state data is needed by the sampling method, and store a pointer to it in <literal>node->tsm_state</literal>. Information about the table to scan is accessible through other fields of the <structname>SampleScanState</structname> node (but note that the <literal>node->ss.ss_currentScanDesc</literal> scan descriptor is not set up yet). <literal>eflags</literal> contains flag bits describing the executor's operating mode for this plan node.
</para>

<para>
When <literal>(eflags & EXEC_FLAG_EXPLAIN_ONLY)</literal> is true, the scan will not actually be performed, so this function should only do the minimum required to make the node state valid for <command>EXPLAIN</command> and <function>EndSampleScan</function>.
</para>

<para>
This function can be omitted (set the pointer to NULL), in which case <function>BeginSampleScan</function> must perform all initialization needed by the sampling method.
</para>

<para>
<programlisting>
void
BeginSampleScan (SampleScanState *node,
                 Datum *params,
                 int nparams,
                 uint32 seed);
</programlisting>
Begin execution of a sampling scan. This is called just before the first attempt to fetch a tuple, and may be called again if the scan needs to be restarted. Information about the table to scan is accessible through fields of the <structname>SampleScanState</structname> node (but note that the <literal>node->ss.ss_currentScanDesc</literal> scan descriptor is not set up yet). The <literal>params</literal> array, of length <literal>nparams</literal>, contains the values of the parameters supplied in the <literal>TABLESAMPLE</literal> clause. These will have the number and types specified in the sampling method's <literal>parameterTypes</literal> list, and have been checked to not be null.
<literal>seed</literal> contains a seed to use for any random numbers generated within the sampling method; it is either a hash derived from the <literal>REPEATABLE</literal> value if one was given, or the result of <literal>random()</literal> if not.
</para>

<para>
This function may adjust the fields <literal>node->use_bulkread</literal> and <literal>node->use_pagemode</literal>. If <literal>node->use_bulkread</literal> is <literal>true</literal>, which it is by default, the scan will use a buffer access strategy that encourages recycling buffers after use. It might be reasonable to set this to <literal>false</literal> if the scan will visit only a small fraction of the table's pages. If <literal>node->use_pagemode</literal> is <literal>true</literal>, which it is by default, the scan will perform visibility checking in a single pass for all tuples on each visited page. It might be reasonable to set this to <literal>false</literal> if the scan will select only a small fraction of the tuples on each visited page. That will

This section covers the `InitSampleScan` and `BeginSampleScan` functions, which set up and start a sampling scan. `InitSampleScan` initializes the scan node, allocating the sampling method's internal state if needed, and may be omitted. `BeginSampleScan` is called just before the first tuple fetch, and again if the scan is restarted, and receives the `TABLESAMPLE` parameters and a random seed. It may also adjust `node->use_bulkread` and `node->use_pagemode` to tune buffer access and visibility checking.
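
The division of labor between the two callbacks can be sketched as below. This is a minimal illustration, not code from any real sampling method. To compile outside the server tree it declares simplified stand-ins for `Datum`, `uint32`, `SampleScanState`, and `EXEC_FLAG_EXPLAIN_ONLY`; a real method would include the PostgreSQL headers and use `palloc0` instead of `calloc`. The `demo_` functions, the `DemoSamplerState` struct, the single integer percentage parameter, and the threshold used for `use_bulkread` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Simplified stand-ins so the sketch compiles on its own. The real
 * definitions live in the PostgreSQL headers, and the real
 * SampleScanState has many more fields than the three modeled here.
 */
typedef uintptr_t Datum;
typedef uint32_t uint32;
#define EXEC_FLAG_EXPLAIN_ONLY 0x0001   /* value assumed for this sketch */

typedef struct SampleScanState
{
    bool        use_bulkread;   /* true by default */
    bool        use_pagemode;   /* true by default */
    void       *tsm_state;      /* NULL until InitSampleScan sets it */
} SampleScanState;

/* Hypothetical private state for an illustrative page-level sampler. */
typedef struct DemoSamplerState
{
    int32_t     percent;        /* share of pages to visit, 0..100 */
    uint32      seed;           /* seed handed to BeginSampleScan */
} DemoSamplerState;

static void
demo_InitSampleScan(SampleScanState *node, int eflags)
{
    /* Allocate zeroed state; server code would use palloc0() here. */
    node->tsm_state = calloc(1, sizeof(DemoSamplerState));
    if (node->tsm_state == NULL)
        abort();

    if (eflags & EXEC_FLAG_EXPLAIN_ONLY)
        return;                 /* no scan will run: stop at the minimum */

    /* Any costlier one-time setup would go here. */
}

static void
demo_BeginSampleScan(SampleScanState *node, Datum *params,
                     int nparams, uint32 seed)
{
    DemoSamplerState *st = (DemoSamplerState *) node->tsm_state;

    assert(st != NULL);
    assert(nparams == 1);       /* the hypothetical method takes one int4 */

    /* This may run again on a restart, so every field is reset here. */
    st->percent = (int32_t) params[0];
    st->seed = seed;

    /* Illustrative threshold: skip bulk-read when few pages are visited. */
    node->use_bulkread = (st->percent >= 1);

    /* Whole pages are returned, so one-pass visibility checks stay on. */
    node->use_pagemode = true;
}
```

Because `demo_BeginSampleScan` overwrites every field of the private state, calling it a second time with a new seed behaves the same as a fresh start, which matches the note that the function may be called again when the scan is restarted.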