of more-fit candidates — that is, by using
randomly-chosen portions of known low-cost join sequences to create
new sequences for consideration. This process is repeated until a
preset number of join sequences have been considered; then the best
one found at any time during the search is used to generate the finished
plan.
</para>
<para>
This process is inherently nondeterministic, because of the randomized
choices made during both the initial population selection and subsequent
<quote>mutation</quote> of the best candidates. To avoid surprising changes
of the selected plan, each run of the GEQO algorithm restarts its
random number generator with the current <xref linkend="guc-geqo-seed"/>
parameter setting. As long as <varname>geqo_seed</varname> and the other
GEQO parameters are kept fixed, the same plan will be generated for a
given query (and other planner inputs such as statistics). To experiment
with different search paths, try changing <varname>geqo_seed</varname>.
</para>
</sect2>
<sect2 id="geqo-future">
<title>Future Implementation Tasks for
<productname>PostgreSQL</productname> <acronym>GEQO</acronym></title>
<para>
Work is still needed to improve the genetic algorithm parameter
settings.
In file <filename>src/backend/optimizer/geqo/geqo_main.c</filename>,
routines
<function>gimme_pool_size</function> and <function>gimme_number_generations</function>,
we have to find a compromise for the parameter settings
to satisfy two competing demands:
<itemizedlist spacing="compact">
<listitem>
<para>
Optimality of the query plan
</para>
</listitem>
<listitem>
<para>
Computing time
</para>
</listitem>
</itemizedlist>
</para>
<para>
In the current implementation, the fitness of each candidate join
sequence is estimated by running the standard planner's join selection
and cost estimation code from scratch. To the extent that different
candidates use similar sub-sequences of joins, a great deal of work
will be repeated. This could be made significantly faster by retaining
cost estimates for sub-joins. The problem is to avoid expending
unreasonable amounts of memory on retaining that state.
</para>
<para>
At a more basic level, it is not clear that solving query optimization
with a GA algorithm designed for TSP is appropriate. In the TSP case,
the cost associated with any substring (partial tour) is independent
of the rest of the tour, but this is certainly not true for query
optimization. Thus it is questionable whether edge recombination
crossover is the most effective mutation procedure.
</para>
</sect2>
</sect1>
<sect1 id="geqo-biblio">
<title>Further Reading</title>
<para>
The following resources contain additional information about
genetic algorithms:
<itemizedlist>
<listitem>
<para>
<ulink url="http://www.faqs.org/faqs/ai-faq/genetic/part1/">
The Hitch-Hiker's Guide to Evolutionary Computation</ulink>, (FAQ for <ulink
url="news://comp.ai.genetic"></ulink>)
</para>
</listitem>
<listitem>
<para>
<ulink url="https://www.red3d.com/cwr/evolve.html">
Evolutionary Computation and its application to art and design</ulink>, by
Craig Reynolds
</para>
</listitem>
<listitem>
<para>
<xref linkend="elma04"/>
</para>
</listitem>
<listitem>
<para>
<xref linkend="fong"/>
</para>
</listitem>
</itemizedlist>
</para>
</sect1>
</chapter>