right-table match, empty (null)
values are substituted for the right-table columns.
</para>
<formalpara>
<title>Exercise:</title>
<para>
There are also right outer joins and full outer joins. Try to
find out what those do.
</para>
</formalpara>
<indexterm><primary>join</primary><secondary>self</secondary></indexterm>
<indexterm><primary>alias</primary><secondary>for table name in query</secondary></indexterm>
<para>
We can also join a table against itself. This is called a
<firstterm>self join</firstterm>. As an example, suppose we wish
to find all the weather records that are in the temperature range
of other weather records. So we need to compare the
<structfield>temp_lo</structfield> and <structfield>temp_hi</structfield> columns of
each <structname>weather</structname> row to the
<structfield>temp_lo</structfield> and
<structfield>temp_hi</structfield> columns of all other
<structname>weather</structname> rows. We can do this with the
following query:
<programlisting>
SELECT w1.city, w1.temp_lo AS low, w1.temp_hi AS high,
w2.city, w2.temp_lo AS low, w2.temp_hi AS high
FROM weather w1 JOIN weather w2
ON w1.temp_lo < w2.temp_lo AND w1.temp_hi > w2.temp_hi;
</programlisting>
<screen>
city | low | high | city | low | high
---------------+-----+------+---------------+-----+------
San Francisco | 43 | 57 | San Francisco | 46 | 50
Hayward | 37 | 54 | San Francisco | 46 | 50
(2 rows)
</screen>
Here we have relabeled the weather table as <literal>w1</literal> and
<literal>w2</literal> to be able to distinguish the left and right side
of the join. You can also use these kinds of aliases in other
queries to save some typing, e.g.:
<programlisting>
SELECT *
FROM weather w JOIN cities c ON w.city = c.name;
</programlisting>
You will encounter this style of abbreviating quite frequently.
</para>
</sect1>
<sect1 id="tutorial-agg">
<title>Aggregate Functions</title>
<indexterm zone="tutorial-agg">
<primary>aggregate function</primary>
</indexterm>
<para>
Like most other relational database products,
<productname>PostgreSQL</productname> supports
<firstterm>aggregate functions</firstterm>.
An aggregate function computes a single result from multiple input rows.
For example, there are aggregates to compute the
<function>count</function>, <function>sum</function>,
<function>avg</function> (average), <function>max</function> (maximum) and
<function>min</function> (minimum) over a set of rows.
</para>
<para>
As an example, we can find the highest low-temperature reading anywhere
with:
<programlisting>
SELECT max(temp_lo) FROM weather;
</programlisting>
<screen>
max
-----
46
(1 row)
</screen>
</para>
<para>
<indexterm><primary>subquery</primary></indexterm>
If we wanted to know what city (or cities) that reading occurred in,
we might try:
<programlisting>
SELECT city FROM weather WHERE temp_lo = max(temp_lo); <lineannotation>-- WRONG</lineannotation>
</programlisting>
but this will not work since the aggregate
<function>max</function> cannot be used in the
<literal>WHERE</literal> clause. (This restriction exists because
the <literal>WHERE</literal> clause determines which rows will be
included in the aggregate calculation; so obviously it has to be evaluated
before aggregate functions are computed.)
However, as is often the case
the query can be restated to accomplish the desired result, here
by using a <firstterm>subquery</firstterm>:
<programlisting>
SELECT city FROM weather
WHERE temp_lo = (SELECT max(temp_lo) FROM weather);
</programlisting>
<screen>
city
---------------
San Francisco
(1 row)
</screen>
This is OK because the subquery is an independent computation
that computes its own aggregate separately