


postgresql

27th chunk of `doc/src/sgml/ref/pgbench.sgml`
 <para>
        Computes a Zipfian-distributed random integer in <literal>[lb,
        ub]</literal>, see below.
       </para>
       <para>
        <literal>random_zipfian(1, 10, 1.5)</literal>
        <returnvalue>an integer between 1 and 10</returnvalue>
       </para></entry>
      </row>

      <row>
       <entry role="func_table_entry"><para role="func_signature">
        <function>sqrt</function> ( <replaceable>number</replaceable> )
        <returnvalue>double</returnvalue>
       </para>
       <para>
        Square root
       </para>
       <para>
        <literal>sqrt(2.0)</literal>
        <returnvalue>1.414213562</returnvalue>
       </para></entry>
      </row>
     </tbody>
    </tgroup>
   </table>

   <para>
    The <literal>random</literal> function generates values using a uniform
    distribution, that is, all values in the specified range are drawn
    with equal probability. The <literal>random_exponential</literal>,
    <literal>random_gaussian</literal> and <literal>random_zipfian</literal>
    functions require an additional parameter of type double, which determines
    the precise shape of the distribution.
   </para>
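
   <para>
    As a minimal sketch, the four functions could be called from a custom
    script as follows. The variable names and the bounds
    <literal>1</literal> and <literal>1000</literal> are illustrative
    assumptions, and the parameter values are arbitrary choices rather than
    recommended settings:
<programlisting>
-- variable names, bounds and parameter values are illustrative assumptions
\set u random(1, 1000)
\set e random_exponential(1, 1000, 3.0)
\set g random_gaussian(1, 1000, 2.5)
\set z random_zipfian(1, 1000, 1.5)
</programlisting>
    Each variable receives an integer within the given bounds. Only the last
    three calls take the extra shape parameter.
   </para>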

   <itemizedlist>
    <listitem>
     <para>
      For an exponential distribution, <replaceable>parameter</replaceable>
      controls the distribution by truncating a quickly decreasing
      exponential distribution at <replaceable>parameter</replaceable>, and then
      projecting onto integers between the bounds.
      To be precise, with
<literallayout>
f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1 - exp(-parameter))
</literallayout>
      then value <replaceable>i</replaceable> between <replaceable>min</replaceable> and
      <replaceable>max</replaceable> inclusive is drawn with probability:
      <literal>f(i) - f(i + 1)</literal>.
     </para>

     <para>
      Intuitively, the larger the <replaceable>parameter</replaceable>, the more
      frequently values close to <replaceable>min</replaceable> are accessed, and the
      less frequently values close to <replaceable>max</replaceable> are accessed.
      The closer to 0 <replaceable>parameter</replaceable> is, the flatter (more
      uniform) the access distribution.
      A crude approximation of the distribution is that the most frequent 1%
      of values in the range, close to <replaceable>min</replaceable>, are drawn
      <replaceable>parameter</replaceable>% of the time.
      The <replaceable>parameter</replaceable> value must be strictly positive.
     </para>
    </listitem>

    <listitem>
     <para>
      For a Gaussian distribution, the interval is mapped onto a standard
      normal distribution (the classical bell-shaped Gaussian curve) truncated
      at <literal>-parameter</literal> on the left and <literal>+parameter</literal>
      on the right.
      Values in the middle of the interval are more likely to be drawn.
      To be precise, if <literal>PHI(x)</literal> is the cumulative distribution
      function of the standard normal distribution, with mean <literal>mu</literal>
      defined as <literal>(max + min) / 2.0</literal>, with
<literallayout>
f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
       (2.0 * PHI(parameter) - 1)
</literallayout>
      then value <replaceable>i</replaceable> between <replaceable>min</replaceable> and
      <replaceable>max</replaceable> inclusive is drawn with probability:
      <literal>f(i + 0.5) - f(i - 0.5)</literal>.
      Intuitively, the larger the <replaceable>parameter</replaceable>, the more
      frequently values close to the middle of the interval are drawn, and the
      less frequently values close to the <replaceable>min</replaceable> and
      <replaceable>max</replaceable> bounds. About 67% of values are drawn from the
      middle <literal>1.0 / parameter</literal>, that is, a relative
      <literal>0.5 / parameter</literal> around the mean, and 95% in the middle
      <literal>2.0 / parameter</literal>, that is, a relative
    

Title: pgbench Random Number Generation Functions: Zipfian, Square Root, and Distribution Details
Summary
This section details the `random_zipfian()` and `sqrt()` functions in pgbench, along with a deeper explanation of how the `random_exponential()`, `random_gaussian()` and `random_zipfian()` functions work. It describes uniform distribution and explains how the additional parameter in exponential, Gaussian, and Zipfian distributions affects the shape of the distribution, detailing the mathematical formulas and providing intuition on how to adjust the parameter for desired outcomes.