Query Processing and Connection Establishment in PostgreSQL

use the index. Next, the cost of executing each path is estimated and the cheapest path is chosen. The cheapest path is expanded into a complete plan that the executor can use.
</para>
</step>
<step>
<para>
The executor recursively steps through the <firstterm>plan tree</firstterm> and retrieves rows in the way represented by the plan. The executor makes use of the <firstterm>storage system</firstterm> while scanning relations, performs <firstterm>sorts</firstterm> and <firstterm>joins</firstterm>, evaluates <firstterm>qualifications</firstterm> and finally hands back the rows derived.
</para>
</step>
</procedure>
<para>
In the following sections we will cover each of the items listed above in more detail to give a better understanding of <productname>PostgreSQL</productname>'s internal control and data structures.
</para>
</sect1>
<sect1 id="connect-estab">
<title>How Connections Are Established</title>
<para>
<productname>PostgreSQL</productname> implements a <quote>process per user</quote> client/server model. In this model, every <glossterm linkend="glossary-client">client process</glossterm> connects to exactly one <glossterm linkend="glossary-backend">backend process</glossterm>. As we do not know ahead of time how many connections will be made, we have to use a <quote>supervisor process</quote> that spawns a new backend process every time a connection is requested. This supervisor process is called <glossterm linkend="glossary-postmaster">postmaster</glossterm> and listens at a specified TCP/IP port for incoming connections. Whenever it detects a request for a connection, it spawns a new backend process. Those backend processes communicate with each other and with other processes of the <glossterm linkend="glossary-instance">instance</glossterm> using <firstterm>semaphores</firstterm> and <glossterm linkend="glossary-shared-memory">shared memory</glossterm> to ensure data integrity throughout concurrent data access.
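The supervisor pattern described above can be illustrated with a minimal sketch. It is not the postmaster's actual code (which is written in C); it is a toy Python model that assumes a Unix-like system where os.fork is available. The names supervise and handle_client are hypothetical and chosen only for this illustration: the parent accepts each connection and forks one child process to serve it.

```python
import os
import socket


def handle_client(conn: socket.socket) -> None:
    """Stand-in for a backend: read one plain-text query, send one reply."""
    with conn:
        query = conn.recv(1024).decode("utf-8")
        reply = f"backend pid={os.getpid()} received: {query}"
        conn.sendall(reply.encode("utf-8"))


def supervise(listener: socket.socket, max_connections: int) -> list:
    """Accept connections and fork one child process per connection.

    Returns the pids of the children once all of them have exited.
    """
    child_pids = []
    for _ in range(max_connections):
        conn, _addr = listener.accept()
        pid = os.fork()
        if pid == 0:
            # Child: it serves exactly one client and never accepts again.
            listener.close()
            try:
                handle_client(conn)
            finally:
                os._exit(0)
        # Parent: the child owns the connection now, so drop our copy.
        conn.close()
        child_pids.append(pid)
    for pid in child_pids:
        os.waitpid(pid, 0)
    return child_pids
```

In this sketch the supervising process never reads the query itself; it only hands the accepted socket to a freshly forked child, which mirrors the one-client-to-one-backend relationship described in this section.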
</para>
<para>
The client process can be any program that understands the <productname>PostgreSQL</productname> protocol described in <xref linkend="protocol"/>. Many clients are based on the C-language library <application>libpq</application>, but several independent implementations of the protocol exist, such as the Java <application>JDBC</application> driver.
</para>
<para>
Once a connection is established, the client process can send a query to the backend process to which it is connected. The query is transmitted as plain text, i.e., no parsing is done in the client. The backend process parses the query, creates an <firstterm>execution plan</firstterm>, executes the plan, and returns the retrieved rows to the client by transmitting them over the established connection.
</para>
</sect1>
<sect1 id="parser-stage">
<title>The Parser Stage</title>
<para>
The <firstterm>parser stage</firstterm> consists of two parts:
<itemizedlist>
<listitem>
<para>
The <firstterm>parser</firstterm>, defined in <filename>gram.y</filename> and <filename>scan.l</filename>, is built using the Unix tools <application>bison</application> and <application>flex</application>.
</para>
</listitem>
<listitem>
<para>
The <firstterm>transformation process</firstterm> modifies and augments the data structures returned by the parser.
</para>
</listitem>
</itemizedlist>
</para>
<sect2 id="parser-stage-parser">
<title>Parser</title>
<para>
The parser has to check the query string (which arrives as plain text) for valid syntax. If the syntax is correct, a <firstterm>parse tree</firstterm> is built up and handed back; otherwise

This section covers the final steps of query processing and connection establishment in PostgreSQL. The executor recursively traverses the plan tree to retrieve rows, using the storage system to scan relations, performing sorts and joins, and evaluating qualifications. The section then explains how clients connect to the server under a 'process per user' model in which the postmaster spawns one backend process per connection. Finally, it introduces the parser stage, which consists of a parser built using bison and flex, followed by a transformation process.
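
The recursive traversal of a plan tree can be sketched as a toy pull-based executor. This is an illustrative model only, not PostgreSQL's executor: the node classes SeqScan, Filter, and Sort and their execute method are hypothetical names, and an in-memory list of dictionaries stands in for the storage system. Each node asks its child for rows, so calling execute on the root walks the whole tree.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterator, List

Row = Dict[str, Any]


@dataclass
class SeqScan:
    """Leaf node: yields every row of an in-memory 'relation'."""
    rows: List[Row]

    def execute(self) -> Iterator[Row]:
        yield from self.rows


@dataclass
class Filter:
    """Evaluates a qualification against each row pulled from the child."""
    child: Any
    predicate: Callable[[Row], bool]

    def execute(self) -> Iterator[Row]:
        for row in self.child.execute():
            if self.predicate(row):
                yield row


@dataclass
class Sort:
    """Consumes all child rows, then yields them ordered by one column."""
    child: Any
    key: str

    def execute(self) -> Iterator[Row]:
        yield from sorted(self.child.execute(), key=lambda r: r[self.key])


# A plan tree for "rows with age >= 30, ordered by name".
table = [
    {"name": "carol", "age": 41},
    {"name": "alice", "age": 30},
    {"name": "bob", "age": 25},
]
plan = Sort(Filter(SeqScan(table), lambda r: r["age"] >= 30), "name")
result = list(plan.execute())
```

Here the root Sort node pulls from Filter, which pulls from SeqScan, so rows flow upward one node at a time until the root hands back the final result.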