individual in the search space are represented
by <firstterm>chromosomes</firstterm>, in essence a set of character
strings. A <firstterm>gene</firstterm> is a
subsection of a chromosome which encodes the value of a single parameter
being optimized. Typical encodings for a gene could be <firstterm>binary</firstterm> or
<firstterm>integer</firstterm>.
</para>
<para>
Through simulation of the evolutionary operations <firstterm>recombination</firstterm>,
<firstterm>mutation</firstterm>, and
<firstterm>selection</firstterm> new generations of search points are found
that show a higher average fitness than their ancestors. <xref linkend="geqo-figure"/>
illustrates these steps.
</para>
<figure id="geqo-figure">
<title>Structure of a Genetic Algorithm</title>
<mediaobject>
<imageobject>
<imagedata fileref="images/genetic-algorithm.svg" format="SVG" width="100%"/>
</imageobject>
</mediaobject>
</figure>
<para>
According to the <systemitem class="resource">comp.ai.genetic</systemitem> <acronym>FAQ</acronym> it cannot be stressed too
strongly that a <acronym>GA</acronym> is not a pure random search for a solution to a
problem. A <acronym>GA</acronym> uses stochastic processes, but the result is distinctly
non-random (better than random).
</para>
</sect1>
<sect1 id="geqo-pg-intro">
<title>Genetic Query Optimization (<acronym>GEQO</acronym>) in PostgreSQL</title>
<para>
The <acronym>GEQO</acronym> module approaches the query
optimization problem as though it were the well-known traveling salesman
problem (<acronym>TSP</acronym>).
Possible query plans are encoded as integer strings. Each string
represents the join order from one relation of the query to the next.
For example, the join tree
<literallayout class="monospaced">
/\
/\ 2
/\ 3
4 1
</literallayout>
is encoded by the integer string '4-1-3-2',
which means, first join relation '4' and '1', then '3', and
then '2', where 1, 2, 3, 4 are relation IDs within the
<productname>PostgreSQL</productname> optimizer.
</para>
<para>
Specific characteristics of the <acronym>GEQO</acronym>
implementation in <productname>PostgreSQL</productname>
are:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>
Usage of a <firstterm>steady state</firstterm> <acronym>GA</acronym> (replacement of the least fit
individuals in a population, not whole-generational replacement)
allows fast convergence towards improved query plans. This is
essential for query handling with reasonable time;
</para>
</listitem>
<listitem>
<para>
Usage of <firstterm>edge recombination crossover</firstterm>
which is especially suited to keep edge losses low for the
solution of the <acronym>TSP</acronym> by means of a
<acronym>GA</acronym>;
</para>
</listitem>
<listitem>
<para>
Mutation as genetic operator is deprecated so that no repair
mechanisms are needed to generate legal <acronym>TSP</acronym> tours.
</para>
</listitem>
</itemizedlist>
</para>
<para>
Parts of the <acronym>GEQO</acronym> module are adapted from D. Whitley's
Genitor algorithm.
</para>
<para>
The <acronym>GEQO</acronym> module allows
the <productname>PostgreSQL</productname> query optimizer to
support large join queries effectively through
non-exhaustive search.
</para>
<sect2 id="geqo-pg-intro-gen-possible-plans">
<title>Generating Possible Plans with <acronym>GEQO</acronym></title>
<para>
The <acronym>GEQO</acronym> planning process uses the standard planner
code to generate plans for scans of individual relations. Then join
plans are developed using the genetic approach. As shown above, each
candidate join plan is represented by a sequence in which to join