Home Explore Blog CI



postgresql

2nd chunk of `doc/src/sgml/geqo.sgml`
9f00c4ca34ae9c6176a92eff7ba08ff9bc35f524c0084d560000000100000fa0
 individual in the search space are represented
    by <firstterm>chromosomes</firstterm>, in essence a set of character
    strings. A <firstterm>gene</firstterm> is a
    subsection of a chromosome which encodes the value of a single parameter
    being optimized. Typical encodings for a gene could be <firstterm>binary</firstterm> or
    <firstterm>integer</firstterm>.
   </para>

   <para>
    Through simulation of the evolutionary operations <firstterm>recombination</firstterm>,
    <firstterm>mutation</firstterm>, and
    <firstterm>selection</firstterm> new generations of search points are found
    that show a higher average fitness than their ancestors. <xref linkend="geqo-figure"/>
    illustrates these steps.
   </para>

   <figure id="geqo-figure">
    <title>Structure of a Genetic Algorithm</title>
    <mediaobject>
     <imageobject>
      <imagedata fileref="images/genetic-algorithm.svg" format="SVG" width="100%"/>
     </imageobject>
    </mediaobject>
   </figure>

   <para>
    According to the <systemitem class="resource">comp.ai.genetic</systemitem> <acronym>FAQ</acronym> it cannot be stressed too
    strongly that a <acronym>GA</acronym> is not a pure random search for a solution to a
    problem. A <acronym>GA</acronym> uses stochastic processes, but the result is distinctly
    non-random (better than random).
   </para>

  </sect1>

  <sect1 id="geqo-pg-intro">
   <title>Genetic Query Optimization (<acronym>GEQO</acronym>) in PostgreSQL</title>

   <para>
    The <acronym>GEQO</acronym> module approaches the query
    optimization problem as though it were the well-known traveling salesman
    problem (<acronym>TSP</acronym>).
    Possible query plans are encoded as integer strings. Each string
    represents the join order from one relation of the query to the next.
    For example, the join tree
<literallayout class="monospaced">
   /\
  /\ 2
 /\ 3
4  1
</literallayout>
    is encoded by the integer string '4-1-3-2',
    which means, first join relation '4' and '1', then '3', and
    then '2', where 1, 2, 3, 4 are relation IDs within the
    <productname>PostgreSQL</productname> optimizer.
   </para>

   <para>
    Specific characteristics of the <acronym>GEQO</acronym>
    implementation in <productname>PostgreSQL</productname>
    are:

    <itemizedlist spacing="compact" mark="bullet">
     <listitem>
      <para>
       Usage of a <firstterm>steady state</firstterm> <acronym>GA</acronym> (replacement of the least fit
       individuals in a population, not whole-generational replacement)
       allows fast convergence towards improved query plans. This is
       essential for query handling with reasonable time;
      </para>
     </listitem>

     <listitem>
      <para>
       Usage of <firstterm>edge recombination crossover</firstterm>
       which is especially suited to keep edge losses low for the
       solution of the <acronym>TSP</acronym> by means of a
       <acronym>GA</acronym>;
      </para>
     </listitem>

     <listitem>
      <para>
       Mutation as genetic operator is deprecated so that no repair
       mechanisms are needed to generate legal <acronym>TSP</acronym> tours.
      </para>
     </listitem>
    </itemizedlist>
   </para>

   <para>
    Parts of the <acronym>GEQO</acronym> module are adapted from D. Whitley's
    Genitor algorithm.
   </para>

   <para>
    The <acronym>GEQO</acronym> module allows
    the <productname>PostgreSQL</productname> query optimizer to
    support large join queries effectively through
    non-exhaustive search.
   </para>

  <sect2 id="geqo-pg-intro-gen-possible-plans">
   <title>Generating Possible Plans with <acronym>GEQO</acronym></title>

   <para>
    The <acronym>GEQO</acronym> planning process uses the standard planner
    code to generate plans for scans of individual relations.  Then join
    plans are developed using the genetic approach.  As shown above, each
    candidate join plan is represented by a sequence in which to join

Title: Genetic Query Optimization in PostgreSQL
Summary
The Genetic Query Optimization (GEQO) module in PostgreSQL uses a genetic algorithm to efficiently optimize large join queries by representing possible query plans as integer strings and using evolutionary operations to find improved plans.