13th chunk of `doc/src/sgml/charset.sgml`
 url="https://www.unicode.org/reports/tr18/#Compatibility_Properties">Compatibility
        Properties</ulink>.  Behavior is efficient and stable within a
        <productname>PostgreSQL</productname> major version.  It is only
        available for encoding <literal>UTF8</literal>.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><literal>pg_c_utf8</literal></term>
      <listitem>
       <para>
        This collation sorts by Unicode code point values rather than natural
        language order.  For the functions <function>lower</function>,
        <function>initcap</function>, and <function>upper</function>, it uses
        Unicode simple case mapping.  For pattern matching (including regular
        expressions), it uses the POSIX Compatible variant of Unicode <ulink
        url="https://www.unicode.org/reports/tr18/#Compatibility_Properties">Compatibility
        Properties</ulink>.  Behavior is efficient and stable within a
        <productname>PostgreSQL</productname> major version.  This collation is
        only available for encoding <literal>UTF8</literal>.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><literal>C</literal> (equivalent to <literal>POSIX</literal>)</term>
      <listitem>
       <para>
        The <literal>C</literal> and <literal>POSIX</literal> collations are
        based on <quote>traditional C</quote> behavior.  They sort by byte
        values rather than natural language order, and only the ASCII letters
        <quote><literal>A</literal></quote> through
        <quote><literal>Z</literal></quote> are treated as letters.  The
        behavior is efficient and stable across all versions for a given
        database encoding, but behavior may vary between different database
        encodings.
       </para>
      </listitem>
     </varlistentry>

     <varlistentry>
      <term><literal>default</literal></term>
      <listitem>
       <para>
        The <literal>default</literal> collation selects the locale specified
        at database creation time.
       </para>
      </listitem>
     </varlistentry>
    </variablelist>
   </para>
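
   <para>
    As an illustrative sketch of the difference between byte (or code point)
    order and natural language order, the following query sorts three strings
    under the <literal>C</literal> collation.  The inline
    <literal>VALUES</literal> list and the alias <literal>t(s)</literal> are
    made up for this example:
<programlisting>
-- Sort by byte values instead of natural language order.
SELECT s
FROM (VALUES ('a'), ('B'), ('c')) AS t(s)
ORDER BY s COLLATE "C";
</programlisting>
    Because uppercase ASCII letters have lower byte values than lowercase
    ones, <literal>B</literal> is expected to sort before
    <literal>a</literal> here, whereas a natural language collation would
    typically place it between <literal>a</literal> and <literal>c</literal>.
   </para>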

   <para>
    Additional collations may be available depending on operating system
    support.  The efficiency and stability of these additional collations
    depend on the collation provider, the provider version, and the locale.
   </para>
  </sect3>

  <sect3 id="collation-managing-predefined">
   <title>Predefined Collations</title>

   <para>
    If the operating system provides support for using multiple locales
    within a single program (<function>newlocale</function> and related functions),
    or if support for ICU is configured,
    then when a database cluster is initialized, <command>initdb</command>
    populates the system catalog <literal>pg_collation</literal> with
    collations based on all the locales it finds in the operating
    system at the time.
   </para>

   <para>
    To inspect the currently available collations, use the query <literal>SELECT
    * FROM pg_collation</literal>, or the command <command>\dOS+</command>
    in <application>psql</application>.
   </para>
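
   <para>
    As a minimal sketch, a narrower query can make the catalog contents
    easier to read.  The column selection below is only one possible choice,
    and the exact set of columns in <literal>pg_collation</literal> can differ
    between major versions:
<programlisting>
-- List each collation with its provider, encoding, and libc locale settings.
SELECT collname,
       collprovider,                               -- for example c = libc, i = ICU
       pg_encoding_to_char(collencoding) AS encoding,
       collcollate,
       collctype
FROM pg_collation
ORDER BY collname;
</programlisting>
   </para>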

  <sect4 id="collation-managing-predefined-libc">
   <title>libc Collations</title>

   <para>
    For example, the operating system might
    provide a locale named <literal>de_DE.utf8</literal>.
    <command>initdb</command> would then create a collation named
    <literal>de_DE.utf8</literal> for encoding <literal>UTF8</literal>
    that has both <symbol>LC_COLLATE</symbol> and
    <symbol>LC_CTYPE</symbol> set to <literal>de_DE.utf8</literal>.
    It will also create a collation with the <literal>.utf8</literal>
    tag stripped off the name.  So you could also use the collation
    under the name <literal>de_DE</literal>, which is less cumbersome
    to write and makes the name less encoding-dependent.  Note, however,
    that the initial set of collation names is still
    platform-dependent.

Title: Predefined Collations and Locale Support
Summary
This section describes how PostgreSQL populates its system catalog with predefined collations based on the locales found on the operating system during database cluster initialization. It also explains how to inspect available locales and collations, and discusses the creation of libc collations with specific names and encoding dependencies.