the standard normal distribution, with mean <literal>mu</literal>
defined as <literal>(max + min) / 2.0</literal>, with
<literallayout>
f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
(2.0 * PHI(parameter) - 1)
</literallayout>
then value <replaceable>i</replaceable> between <replaceable>min</replaceable> and
<replaceable>max</replaceable> inclusive is drawn with probability:
<literal>f(i + 0.5) - f(i - 0.5)</literal>.
Intuitively, the larger the <replaceable>parameter</replaceable>, the more
frequently values close to the middle of the interval are drawn, and the
less frequently values close to the <replaceable>min</replaceable> and
<replaceable>max</replaceable> bounds. About 68% of values are drawn from the
middle <literal>1.0 / parameter</literal> of the interval, that is, within a relative
<literal>0.5 / parameter</literal> of the mean, and 95% from the middle
<literal>2.0 / parameter</literal>, that is, within a relative
<literal>1.0 / parameter</literal> of the mean. For instance, if
<replaceable>parameter</replaceable> is 4.0, 68% of values are drawn from the
middle quarter (<literal>1.0 / 4.0</literal>) of the interval (i.e., from
<literal>3.0 / 8.0</literal> to <literal>5.0 / 8.0</literal>) and 95% from
the middle half (<literal>2.0 / 4.0</literal>) of the interval (second and third
quartiles). The minimum allowed <replaceable>parameter</replaceable>
value is 2.0.
</para>
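<para>
As a minimal sketch of how this might look in a custom script (the
variable names <literal>naccounts</literal> and <literal>aid</literal>
and the query are illustrative assumptions, not part of the definition
above), the following draws account IDs with
<replaceable>parameter</replaceable> 4.0, so that roughly two thirds of
the draws should fall in the middle quarter of the ID range:
</para>
<programlisting>
-- Illustrative only: assumes the standard pgbench_accounts table
\set naccounts 100000 * :scale
\set aid random_gaussian(1, :naccounts, 4.0)
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
</programlisting>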
</listitem>
<listitem>
<para>
<literal>random_zipfian</literal> generates a bounded Zipfian
distribution.
<replaceable>parameter</replaceable> defines how skewed the distribution
is. The larger the <replaceable>parameter</replaceable>, the more
frequently values closer to the beginning of the interval are drawn.
The distribution is such that, assuming the range starts from 1,
the ratio of the probability of drawing <replaceable>k</replaceable>
versus drawing <replaceable>k+1</replaceable> is
<literal>((<replaceable>k</replaceable>+1)/<replaceable>k</replaceable>)**<replaceable>parameter</replaceable></literal>.
For example, <literal>random_zipfian(1, ..., 2.5)</literal> produces
the value <literal>1</literal> about <literal>(2/1)**2.5 =
5.66</literal> times more frequently than <literal>2</literal>, which
itself is produced <literal>(3/2)**2.5 = 2.76</literal> times more
frequently than <literal>3</literal>, and so on.
</para>
<para>
<application>pgbench</application>'s implementation is based on
"Non-Uniform Random Variate Generation", Luc Devroye, pp. 550-551,
Springer 1986. Due to limitations of that algorithm,
the <replaceable>parameter</replaceable> value is restricted to
the range [1.001, 1000].
</para>
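<para>
For illustration, a script along the following lines would concentrate
accesses on the lowest account IDs. The variable names and the query are
assumptions chosen for the example, and the
<replaceable>parameter</replaceable> value 2.5 simply matches the ratios
worked out above:
</para>
<programlisting>
-- Illustrative only: assumes the standard pgbench_accounts table
\set naccounts 100000 * :scale
-- aid 1 is expected about 5.66 times as often as aid 2
\set aid random_zipfian(1, :naccounts, 2.5)
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
</programlisting>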
</listitem>
</itemizedlist>
<note>
<para>
When designing a benchmark that selects rows non-uniformly, be aware
that the rows chosen may be correlated with other data such as IDs from
a sequence or the physical row ordering, which may skew performance
measurements.
</para>
<para>
To avoid this, you may wish to use the <function>permute</function>
function, or some other additional step with similar effect, to shuffle
the selected rows and remove such correlations.
</para>
</note>
<para>
The hash functions <literal>hash</literal>, <literal>hash_murmur2</literal>, and
<literal>hash_fnv1a</literal> accept an input value and an optional seed parameter.
If the seed is not provided, the value of <literal>:default_seed</literal>
is used, which is initialized randomly unless set by the command-line
<literal>-D</literal> option.
</para>
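<para>
The two calling forms might be sketched as follows. The variable names
and the seed value 5432 are arbitrary choices for the example, and the
last line shows only one possible way to fold a hash value back into the
account ID range:
</para>
<programlisting>
-- Illustrative only: variable names and the seed 5432 are arbitrary
\set r random(1, 100000 * :scale)
-- no seed given, so :default_seed is used
\set h1 hash(:r)
-- explicit seed
\set h2 hash_fnv1a(:r, 5432)
-- fold the hash into the account ID range; hash values may be negative
\set aid 1 + abs(:h1) % (100000 * :scale)
</programlisting>
<para>
Passing, for example, <literal>-D default_seed=12345</literal> on the
command line should make the unseeded call produce the same values for
the same inputs across runs.
</para>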
<para>
<literal>permute</literal> accepts an input value, a size, and an optional
seed parameter. It generates a pseudorandom permutation of integers in
the range <literal>[0, size)</literal>,