Predefined Collations and Locale Support

url="https://www.unicode.org/reports/tr18/#Compatibility_Properties">Compatibility Properties</ulink>.
        Behavior is efficient and stable within a
        <productname>PostgreSQL</productname> major version. It is
        only available for encoding <literal>UTF8</literal>.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><literal>pg_c_utf8</literal></term>
      <listitem>
       <para>
        This collation sorts by Unicode code point values rather than natural
        language order. For the functions <function>lower</function>,
        <function>initcap</function>, and <function>upper</function>, it uses
        Unicode simple case mapping. For pattern matching (including regular
        expressions), it uses the POSIX Compatible variant of Unicode <ulink
        url="https://www.unicode.org/reports/tr18/#Compatibility_Properties">Compatibility Properties</ulink>.
        Behavior is efficient and stable within a
        <productname>PostgreSQL</productname> major version. This collation is
        only available for encoding <literal>UTF8</literal>.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><literal>C</literal> (equivalent to <literal>POSIX</literal>)</term>
      <listitem>
       <para>
        The <literal>C</literal> and <literal>POSIX</literal> collations are
        based on <quote>traditional C</quote> behavior. They sort by byte
        values rather than natural language order, and only the ASCII letters
        <quote><literal>A</literal></quote> through
        <quote><literal>Z</literal></quote> are treated as letters. The
        behavior is efficient and stable across all versions for a given
        database encoding, but behavior may vary between different database
        encodings.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><literal>default</literal></term>
      <listitem>
       <para>
        The <literal>default</literal> collation selects the locale specified
        at database creation time.
       </para>
      </listitem>
     </varlistentry>
    </variablelist>
   </para>

   <para>
    Additional collations may be available depending on operating system
    support.
    The efficiency and stability of these additional collations depend on the
    collation provider, the provider version, and the locale.
   </para>
  </sect3>

  <sect3 id="collation-managing-predefined">
   <title>Predefined Collations</title>

   <para>
    If the operating system provides support for using multiple locales
    within a single program (<function>newlocale</function> and related
    functions), or if support for ICU is configured, then when a database
    cluster is initialized, <command>initdb</command> populates the system
    catalog <literal>pg_collation</literal> with collations based on all the
    locales it finds in the operating system at the time.
   </para>

   <para>
    To inspect the currently available locales, use the query <literal>SELECT
    * FROM pg_collation</literal>, or the command <command>\dOS+</command>
    in <application>psql</application>.
   </para>

   <sect4 id="collation-managing-predefined-libc">
    <title>libc Collations</title>

    <para>
     For example, the operating system might provide a locale named
     <literal>de_DE.utf8</literal>. <command>initdb</command> would then
     create a collation named <literal>de_DE.utf8</literal> for encoding
     <literal>UTF8</literal> that has both <symbol>LC_COLLATE</symbol> and
     <symbol>LC_CTYPE</symbol> set to <literal>de_DE.utf8</literal>. It will
     also create a collation with the <literal>.utf8</literal> tag stripped
     off the name. So you could also use the collation under the name
     <literal>de_DE</literal>, which is less cumbersome to write and makes
     the name less encoding-dependent. Note that, nevertheless, the initial
     set of collation names is platform-dependent.
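     As a minimal sketch, the two catalog entries described here could be
     examined and used with statements such as the following. The sketch
     assumes that the operating system provides a
     <literal>de_DE.utf8</literal> locale and that the database encoding is
     <literal>UTF8</literal>. On a platform without that locale, the first
     query returns no rows and the second fails because the collation does
     not exist. The selected columns and the sample words are illustrative
     choices only.
<programlisting>
-- Show the libc collation under both of its names.
-- collprovider is 'c' for collations provided by libc.
SELECT collname,
       collprovider,
       pg_encoding_to_char(collencoding) AS encoding,
       collcollate,
       collctype
FROM pg_collation
WHERE collname IN ('de_DE', 'de_DE.utf8');

-- Sort a few sample words using the shorter collation name.
SELECT word
FROM (VALUES ('Zebra'), ('Äpfel'), ('apfel')) AS t(word)
ORDER BY word COLLATE "de_DE";
</programlisting>
     Writing <literal>COLLATE "de_DE"</literal> rather than
     <literal>COLLATE "de_DE.utf8"</literal> keeps the statement free of the
     encoding-specific suffix, although neither name is guaranteed to exist
     on every platform.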

This section describes how PostgreSQL populates its system catalog with predefined collations based on the locales found on the operating system during database cluster initialization. It also explains how to inspect available locales and collations, and discusses the creation of libc collations with specific names and encoding dependencies.