<!-- doc/src/sgml/charset.sgml -->
<chapter id="charset">
<title>Localization</title>
<para>
This chapter describes the available localization features from the
point of view of the administrator.
<productname>PostgreSQL</productname> supports two localization
facilities:
<itemizedlist>
<listitem>
<para>
Using the locale features of the operating system to provide
locale-specific collation order, number formatting, translated
messages, and other aspects.
This is covered in <xref linkend="locale"/> and
<xref linkend="collation"/>.
</para>
</listitem>
<listitem>
<para>
Providing a number of different character sets to support storing text
in all kinds of languages, and providing character set translation
between client and server.
This is covered in <xref linkend="multibyte"/>.
</para>
</listitem>
</itemizedlist>
</para>
<sect1 id="locale">
<title>Locale Support</title>
<indexterm zone="locale"><primary>locale</primary></indexterm>
<para>
<firstterm>Locale</firstterm> support refers to an application respecting
cultural preferences regarding alphabets, sorting, number
formatting, etc. <productname>PostgreSQL</productname> uses the standard ISO
C and <acronym>POSIX</acronym> locale facilities provided by the server operating
system. For additional information, refer to the documentation of your
operating system.
</para>
<sect2 id="locale-overview">
<title>Overview</title>
<para>
Locale support is automatically initialized when a database
cluster is created using <command>initdb</command>.
<command>initdb</command> will initialize the database cluster
with the locale setting of its execution environment by default,
so if your system is already set to use the locale that you want
in your database cluster, there is nothing else you need to
do. If you want to use a different locale (or you are not sure
which locale your system is set to), you can tell
<command>initdb</command> exactly which locale to use by
specifying the <option>--locale</option> option. For example:
<screen>
initdb --locale=sv_SE
</screen>
</para>
<para>
This example for Unix systems sets the locale to Swedish
(<literal>sv</literal>) as spoken
in Sweden (<literal>SE</literal>). Other possibilities might include
<literal>en_US</literal> (U.S. English) and <literal>fr_CA</literal> (French
Canadian). If more than one character set can be used for a
locale, then the specification can take the form
<replaceable>language_territory.codeset</replaceable>. For example,
<literal>fr_BE.UTF-8</literal> represents the French language (<literal>fr</literal>) as
spoken in Belgium (<literal>BE</literal>), with a <acronym>UTF-8</acronym> character set
encoding.
</para>
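<para>
As a rough illustration of the
<replaceable>language_territory.codeset</replaceable> form, the
following shell sketch splits a locale name into its parts using
POSIX parameter expansion. The variable names are hypothetical and
chosen only for this example, and the sketch assumes that the name
contains both a territory and a codeset.
</para>
<programlisting>
```shell
# Example locale name in the form language_territory.codeset
name="fr_BE.UTF-8"

# Everything before the first underscore is the language code.
language=${name%%_*}

# Strip the language and underscore, then cut at the first dot
# to obtain the territory code.
rest=${name#*_}
territory=${rest%%.*}

# Everything after the first dot is the codeset.
codeset=${name#*.}

echo "language=$language territory=$territory codeset=$codeset"
```
</programlisting>
<para>
A name without a codeset part, such as <literal>sv_SE</literal>,
would need an extra check before the last expansion, because there
is no dot to split on.
</para>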
<para>
Which locales are available on your
system, and under which names, depends on what the operating
system vendor provided and what was installed. On most Unix systems, the command
<literal>locale -a</literal> will provide a list of available locales.
Windows uses more verbose locale names, such as <literal>German_Germany</literal>
or <literal>Swedish_Sweden.1252</literal>, but the principles are the same.
</para>
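<para>
On a Unix-like system, one way to check whether a particular locale
is installed before running <command>initdb</command> is to filter
the output of <literal>locale -a</literal>. The following is only a
sketch for a POSIX shell: the variable names are illustrative, and
the spelling of the codeset suffix in the output can vary between
systems, so the match is done on the name prefix only.
</para>
<programlisting>
```shell
# Locale to look for; sv_SE is taken from the initdb example.
wanted="sv_SE"

# Collect the installed locale names, if the locale command exists.
if command -v locale >/dev/null; then
    available=$(locale -a || true)
else
    available=""
fi

# Case-insensitive prefix match, so that sv_SE also matches
# variants of the same locale that carry a codeset suffix.
if printf '%s\n' "$available" | grep -qi "^${wanted}"; then
    status="available"
else
    status="not installed"
fi

echo "${wanted}: ${status}"
```
</programlisting>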
<para>
Occasionally it is useful to mix rules from several locales, e.g.,
use English collation rules but Spanish messages. To support that,
there is a set of locale subcategories, each of which controls only
certain aspects of the localization rules:
<informaltable>
<tgroup cols="2">
<colspec colname="col1" colwidth="1*"/>
<colspec colname="col2" colwidth="3*"/>
<tbody>
<row>
<entry><envar>LC_COLLATE</envar></entry>
<entry>String sort order</entry>
</row>
<row>
<entry><envar>LC_CTYPE</envar></entry>
<entry>Character classification (What is a letter? Its upper-case equivalent?)</entry>