default is Latin before Greek.)
</para>
</listitem>
</varlistentry>
<varlistentry id="collation-managing-create-icu-en-u-kf-upper">
<term><literal>CREATE COLLATION upperfirst (provider = icu, locale = 'en-u-kf-upper');</literal></term>
<listitem>
<para>
Sort upper-case letters before lower-case letters. (The default is
lower-case letters first.)
</para>
</listitem>
</varlistentry>
<varlistentry id="collation-managing-create-icu-en-u-kf-upper-kr-grek-latn">
<term><literal>CREATE COLLATION special (provider = icu, locale = 'en-u-kf-upper-kr-grek-latn');</literal></term>
<listitem>
<para>
Combines both of the above options: Greek letters sort before Latin
letters, and upper-case letters sort before lower-case letters.
</para>
</listitem>
</varlistentry>
</variablelist>
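<para>
As a rough check, a query along the following lines can be used to see the
effect of the <literal>upperfirst</literal> collation defined above. This is
only a sketch: the sample values are arbitrary, and it assumes that the
collation has already been created in the current database.
<programlisting>
-- Assumes the upperfirst collation from the example above already exists.
SELECT c
FROM (VALUES ('a'), ('A'), ('b'), ('B')) AS x(c)
ORDER BY c COLLATE upperfirst;
</programlisting>
With this collation, each upper-case letter should be returned immediately
before its lower-case counterpart.
</para>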
</sect3>
<sect3 id="icu-tailoring-rules">
<title>ICU Tailoring Rules</title>
<para>
If the options provided by the collation settings shown above are not
sufficient, the order of collation elements can be changed with tailoring
rules, whose syntax is detailed at <ulink
url="https://unicode-org.github.io/icu/userguide/collation/customization/"></ulink>.
</para>
<para>
This small example creates a collation based on the root locale with a
tailoring rule:
<programlisting>
<![CDATA[CREATE COLLATION custom (provider = icu, locale = 'und', rules = '&V << w <<< W');]]>
</programlisting>
With this rule, the letter <quote>W</quote> is sorted after
<quote>V</quote>, but the difference between them is treated as a
secondary difference, similar to an accent. Rules like this are contained
in the locale definitions of some languages. (Of course, if a locale
definition already contains the desired rules, then they don't need to be
specified again explicitly.)
</para>
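<para>
The effect of a secondary difference is easiest to see with strings longer
than one character. The following query is a minimal sketch that assumes
the <literal>custom</literal> collation above has been created; the sample
values are chosen only for illustration.
<programlisting>
-- Assumes the custom collation from the example above already exists.
SELECT c
FROM (VALUES ('Vb'), ('wa'), ('Va')) AS x(c)
ORDER BY c COLLATE custom;
</programlisting>
Because <quote>w</quote> differs from <quote>V</quote> only at the
secondary level, the second character is compared before that difference
is considered, so <literal>wa</literal> would be expected to sort between
<literal>Va</literal> and <literal>Vb</literal> rather than after both.
</para>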
<para>
Here is a more complex example. The following statement sets up a
collation named <literal>ebcdic</literal> with rules to sort US-ASCII
characters in the order of the EBCDIC encoding.
<programlisting>
<![CDATA[CREATE COLLATION ebcdic (provider = icu, locale = 'und',
rules = $$
& ' ' < '.' < '<' < '(' < '+' < \|
< '&' < '!' < '$' < '*' < ')' < ';'
< '-' < '/' < ',' < '%' < '_' < '>' < '?'
< '`' < ':' < '#' < '@' < \' < '=' < '"'
<*a-r < '~' <*s-z < '^' < '[' < ']'
< '{' <*A-I < '}' <*J-R < '\' <*S-Z <*0-9
$$);]]>

SELECT c
FROM (VALUES ('a'), ('b'), ('A'), ('B'), ('1'), ('2'), ('!'), ('^')) AS x(c)
ORDER BY c COLLATE ebcdic;
 c
---
 !
 a
 b
 ^
 A
 B
 1
 2
</programlisting>
</para>
</sect3>
<sect3 id="icu-external-references">
<title>External References for ICU</title>
<para>
This section (<xref linkend="icu-custom-collations"/>) is only a brief
overview of ICU behavior and language tags. Refer to the following
documents for technical details, additional options, and new behavior:
</para>
<itemizedlist>
<listitem>
<para>
<ulink url="https://www.unicode.org/reports/tr35/tr35-collation.html">Unicode Technical Standard #35</ulink>
</para>
</listitem>
<listitem>
<para>
<ulink url="https://www.rfc-editor.org/info/bcp47">BCP 47</ulink>
</para>
</listitem>
<listitem>
<para>
<ulink url="https://github.com/unicode-org/cldr/blob/master/common/bcp47/collation.xml">CLDR repository</ulink>
</para>
</listitem>
<listitem>
<para>
<ulink url="https://unicode-org.github.io/icu/userguide/locale/"></ulink>
</para>
</listitem>
<listitem>
<para>
<ulink url="https://unicode-org.github.io/icu/userguide/collation/"></ulink>
</para>
</listitem>
</itemizedlist>
</sect3>
</sect2>
</sect1>
<sect1 id="multibyte">
<title>Character Set Support</title>
<indexterm zone="multibyte"><primary>character set</primary></indexterm>
<para>
The character set support in <productname>PostgreSQL</productname>
allows