postgresql

23rd chunk of `doc/src/sgml/charset.sgml`
 default is Latin before Greek.)
        </para>
       </listitem>
      </varlistentry>

      <varlistentry id="collation-managing-create-icu-en-u-kf-upper">
       <term><literal>CREATE COLLATION upperfirst (provider = icu, locale = 'en-u-kf-upper');</literal></term>
       <listitem>
        <para>
         Sort upper-case letters before lower-case letters.  (The default is
         lower-case letters first.)
        </para>
       </listitem>
      </varlistentry>

      <varlistentry id="collation-managing-create-icu-en-u-kf-upper-kr-grek-latn">
       <term><literal>CREATE COLLATION special (provider = icu, locale = 'en-u-kf-upper-kr-grek-latn');</literal></term>
       <listitem>
        <para>
         Combines both of the above options.
        </para>
       </listitem>
      </varlistentry>
     </variablelist>
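
     <para>
      As a rough illustration of the <literal>kf</literal> setting, the
      following query sorts a few letters with the
      <literal>upperfirst</literal> collation from the list above.  This is
      only a sketch: it assumes that collation has already been created in
      an ICU-enabled build, and the inline table and its column name
      <literal>c</literal> are made up for the example.  The upper-case
      form of each letter should sort immediately before its lower-case
      form:
<programlisting>
SELECT c
FROM (VALUES ('a'), ('A'), ('b'), ('B')) AS x(c)
ORDER BY c COLLATE upperfirst;
 c
---
 A
 a
 B
 b
</programlisting>
     </para>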
   </sect3>

   <sect3 id="icu-tailoring-rules">
    <title>ICU Tailoring Rules</title>

    <para>
     If the options provided by the collation settings shown above are not
     sufficient, the order of collation elements can be changed with tailoring
     rules, whose syntax is detailed at <ulink
     url="https://unicode-org.github.io/icu/userguide/collation/customization/"></ulink>.
    </para>

    <para>
     This small example creates a collation based on the root locale with a
     tailoring rule:
<programlisting>
<![CDATA[CREATE COLLATION custom (provider = icu, locale = 'und', rules = '&V << w <<< W');]]>
</programlisting>
     With this rule, the letter <quote>W</quote> is sorted after
     <quote>V</quote>, but is treated as a secondary difference similar to an
     accent.  Rules like this are contained in the locale definitions of some
     languages.  (Of course, if a locale definition already contains the
     desired rules, then they don't need to be specified again explicitly.)
    </para>
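
    <para>
     To see what a secondary difference means in practice, the following
     sketch assumes the <literal>custom</literal> collation above exists; the
     two test strings are arbitrary.  Because <quote>V</quote> and
     <quote>w</quote> no longer differ at the primary level, the letters
     that follow them decide the order first:
<programlisting>
SELECT c
FROM (VALUES ('Vb'), ('wa')) AS x(c)
ORDER BY c COLLATE custom;
 c
----
 wa
 Vb
</programlisting>
     Without the tailoring rule, <quote>V</quote> and <quote>w</quote> are
     distinct letters at the primary level, so <literal>Vb</literal> would
     be expected to sort before <literal>wa</literal>.
    </para>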

    <para>
     Here is a more complex example.  The following statement sets up a
     collation named <literal>ebcdic</literal> with rules to sort US-ASCII
     characters in the order of the EBCDIC encoding.

<programlisting>
<![CDATA[CREATE COLLATION ebcdic (provider = icu, locale = 'und',
rules = $$
& ' ' < '.' < '<' < '(' < '+' < \|
< '&' < '!' < '$' < '*' < ')' < ';'
< '-' < '/' < ',' < '%' < '_' < '>' < '?'
< '`' < ':' < '#' < '@' < \' < '=' < '"'
<*a-r < '~' <*s-z < '^' < '[' < ']'
< '{' <*A-I < '}' <*J-R < '\' <*S-Z <*0-9
$$);]]>

SELECT c
FROM (VALUES ('a'), ('b'), ('A'), ('B'), ('1'), ('2'), ('!'), ('^')) AS x(c)
ORDER BY c COLLATE ebcdic;
 c
---
 !
 a
 b
 ^
 A
 B
 1
 2
</programlisting>
    </para>
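
    <para>
     Once created, a collation with tailoring rules can be used like any
     other collation, for example by attaching it to a column so that
     sorting on that column follows the tailored order by default.  The
     table and column names below are hypothetical and serve only as a
     minimal sketch:
<programlisting>
CREATE TABLE legacy_codes (
    code text COLLATE ebcdic
);

-- Sorts in the EBCDIC-style order without an explicit COLLATE clause.
SELECT code FROM legacy_codes ORDER BY code;
</programlisting>
    </para>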
   </sect3>

   <sect3 id="icu-external-references">
    <title>External References for ICU</title>

    <para>
     This section (<xref linkend="icu-custom-collations"/>) is only a brief
     overview of ICU behavior and language tags. Refer to the following
     documents for technical details, additional options, and new behavior:
    </para>

    <itemizedlist>
     <listitem>
      <para>
       <ulink url="https://www.unicode.org/reports/tr35/tr35-collation.html">Unicode Technical Standard #35</ulink>
      </para>
     </listitem>
     <listitem>
      <para>
       <ulink url="https://www.rfc-editor.org/info/bcp47">BCP 47</ulink>
      </para>
     </listitem>
     <listitem>
      <para>
       <ulink url="https://github.com/unicode-org/cldr/blob/master/common/bcp47/collation.xml">CLDR repository</ulink>
      </para>
     </listitem>
     <listitem>
      <para>
       <ulink url="https://unicode-org.github.io/icu/userguide/locale/"></ulink>
      </para>
     </listitem>
     <listitem>
      <para>
       <ulink url="https://unicode-org.github.io/icu/userguide/collation/"></ulink>
      </para>
     </listitem>
    </itemizedlist>
   </sect3>
  </sect2>
 </sect1>

 <sect1 id="multibyte">
  <title>Character Set Support</title>

  <indexterm zone="multibyte"><primary>character set</primary></indexterm>

  <para>
   The character set support in <productname>PostgreSQL</productname>
   allows

Title: ICU Tailoring Rules and External References
Summary
This section explains how ICU tailoring rules change the order of collation elements when the standard collation settings are not sufficient. It gives two examples of custom collations built from such rules and lists external references for further detail on ICU behavior and language tags. It ends with the opening of the section on character set support.