Home Explore Blog CI



postgresql

3rd chunk of `doc/src/sgml/arch-dev.sgml`
bd09aca18892f138d76faa4d2759ea83d6342069fc0478260000000100000fa5

   </para>
  </sect1>

  <sect1 id="parser-stage">
   <title>The Parser Stage</title>

   <para>
    The <firstterm>parser stage</firstterm> consists of two parts:

    <itemizedlist>
     <listitem>
      <para>
       The <firstterm>parser</firstterm> defined in
       <filename>gram.y</filename> and <filename>scan.l</filename> is
       built using the Unix tools <application>bison</application>
       and <application>flex</application>.
      </para>
     </listitem>
     <listitem>
      <para>
       The <firstterm>transformation process</firstterm> does
       modifications and augmentations to the data structures returned by the parser.
      </para>
     </listitem>
    </itemizedlist>
   </para>

   <sect2 id="parser-stage-parser">
    <title>Parser</title>

    <para>
     The parser has to check the query string (which arrives as plain
     text) for valid syntax. If the syntax is correct a
     <firstterm>parse tree</firstterm> is built up and handed back;
     otherwise an error is returned. The parser and lexer are
     implemented using the well-known Unix tools <application>bison</application>
     and <application>flex</application>.
    </para>

    <para>
     The <firstterm>lexer</firstterm> is defined in the file
     <filename>scan.l</filename> and is responsible
     for recognizing <firstterm>identifiers</firstterm>,
     the <firstterm>SQL key words</firstterm> etc. For
     every key word or identifier that is found, a <firstterm>token</firstterm>
     is generated and handed to the parser.
    </para>

    <para>
     The parser is defined in the file <filename>gram.y</filename> and
     consists of a set of <firstterm>grammar rules</firstterm> and
     <firstterm>actions</firstterm> that are executed whenever a rule
     is fired. The code of the actions (which is actually C code) is
     used to build up the parse tree.
    </para>

    <para>
     The file <filename>scan.l</filename> is transformed to the C
     source file <filename>scan.c</filename> using the program
     <application>flex</application> and <filename>gram.y</filename> is
     transformed to <filename>gram.c</filename> using
     <application>bison</application>.  After these transformations
     have taken place a normal C compiler can be used to create the
     parser. Never make any changes to the generated C files as they
     will be overwritten the next time <application>flex</application>
     or <application>bison</application> is called.

     <note>
      <para>
       The mentioned transformations and compilations are normally done
       automatically using the <firstterm>makefiles</firstterm>
       shipped with the <productname>PostgreSQL</productname>
       source distribution.
      </para>
     </note>
    </para>

    <para>
     A detailed description of <application>bison</application> or
     the grammar rules given in <filename>gram.y</filename> would be
     beyond the scope of this manual. There are many books and
     documents dealing with <application>flex</application> and
     <application>bison</application>. You should be familiar with
     <application>bison</application> before you start to study the
     grammar given in <filename>gram.y</filename> otherwise you won't
     understand what happens there.
    </para>

   </sect2>

   <sect2 id="parser-stage-transformation-process">
     <title>Transformation Process</title>

    <para>
     The parser stage creates a parse tree using only fixed rules about
     the syntactic structure of SQL.  It does not make any lookups in the
     system catalogs, so there is no possibility to understand the detailed
     semantics of the requested operations.  After the parser completes,
     the <firstterm>transformation process</firstterm> takes the tree handed
     back by the parser as input and does the semantic interpretation needed
     to understand which tables, functions, and operators are referenced by
     the query.  The data structure that is built

Title: Detailed Breakdown of the Parser Stage in PostgreSQL
Summary
This section provides a detailed explanation of the parser stage in PostgreSQL, highlighting its two primary components: the parser and the transformation process. The parser, constructed using bison and flex, checks the query string for correct syntax and builds a parse tree if valid. The transformation process then takes this parse tree and performs semantic interpretation, identifying the tables, functions, and operators referenced in the query. The roles of the lexer and grammar rules are also discussed.