Collation Derivation and Combination

letters, such as <function>lower</function>, <function>upper</function>, and <function>initcap</function>; by pattern matching operators; and by <function>to_char</function> and related functions. </para> <para> For a function or operator call, the collation that is derived by examining the argument collations is used at run time for performing the specified operation. If the result of the function or operator call is of a collatable data type, the collation is also used at parse time as the defined collation of the function or operator expression, in case there is a surrounding expression that requires knowledge of its collation. </para> <para> The <firstterm>collation derivation</firstterm> of an expression can be implicit or explicit. This distinction affects how collations are combined when multiple different collations appear in an expression. An explicit collation derivation occurs when a <literal>COLLATE</literal> clause is used; all other collation derivations are implicit. When multiple collations need to be combined, for example in a function call, the following rules are used: <orderedlist> <listitem> <para> If any input expression has an explicit collation derivation, then all explicitly derived collations among the input expressions must be the same, otherwise an error is raised. If any explicitly derived collation is present, that is the result of the collation combination. </para> </listitem> <listitem> <para> Otherwise, all input expressions must have the same implicit collation derivation or the default collation. If any non-default collation is present, that is the result of the collation combination. Otherwise, the result is the default collation. </para> </listitem> <listitem> <para> If there are conflicting non-default implicit collations among the input expressions, then the combination is deemed to have indeterminate collation. This is not an error condition unless the particular function being invoked requires knowledge of the collation it should apply. If it does, an error will be raised at run-time. </para> </listitem> </orderedlist> For example, consider this table definition: <programlisting> CREATE TABLE test1 ( a text COLLATE "de_DE", b text COLLATE "es_ES", ... ); </programlisting> Then in <programlisting> SELECT a < 'foo' FROM test1; </programlisting> the <literal><</literal> comparison is performed according to <literal>de_DE</literal> rules, because the expression combines an implicitly derived collation with the default collation. But in <programlisting> SELECT a < ('foo' COLLATE "fr_FR") FROM test1; </programlisting> the comparison is performed using <literal>fr_FR</literal> rules, because the explicit collation derivation overrides the implicit one. Furthermore, given <programlisting> SELECT a < b FROM test1; </programlisting> the parser cannot determine which collation to apply, since the <structfield>a</structfield> and <structfield>b</structfield> columns have conflicting implicit collations. Since the <literal><</literal> operator does need to know which collation to use, this will result in an error. The error can be resolved by attaching an explicit collation specifier to either input expression, thus: <programlisting> SELECT a < b COLLATE "de_DE" FROM test1; </programlisting> or equivalently <programlisting> SELECT a COLLATE "de_DE" < b FROM test1; </programlisting> On the other hand, the structurally similar case <programlisting> SELECT a || b FROM test1; </programlisting> does not result in an error, because the <literal>||</literal> operator does not care about collations: its result is the same regardless of

This section explains how collations are derived and combined in PostgreSQL, including the rules for implicit and explicit collation derivation, and how multiple collations are combined in expressions. It provides examples to illustrate the application of these rules, including cases where explicit collation specifiers are required to resolve conflicts between implicit collations.