PostgreSQL Localization Features

<chapter id="charset"> <title>Localization</title> <para> This chapter describes the available localization features from the point of view of the administrator. <productname>PostgreSQL</productname> supports two localization facilities: <itemizedlist> <listitem> <para> Using the locale features of the operating system to provide locale-specific collation order, number formatting, translated messages, and other aspects. This is covered in <xref linkend="locale"/> and <xref linkend="collation"/>. </para> </listitem> <listitem> <para> Providing a number of different character sets to support storing text in all kinds of languages, and providing character set translation between client and server. This is covered in <xref linkend="multibyte"/>. </para> </listitem> </itemizedlist> </para> <sect1 id="locale"> <title>Locale Support</title> <indexterm zone="locale"><primary>locale</primary></indexterm> <para> <firstterm>Locale</firstterm> support refers to an application respecting cultural preferences regarding alphabets, sorting, number formatting, etc. <productname>PostgreSQL</productname> uses the standard ISO C and <acronym>POSIX</acronym> locale facilities provided by the server operating system. For additional information, refer to the documentation of your system. </para> <sect2 id="locale-overview"> <title>Overview</title> <para> Locale support is automatically initialized when a database cluster is created using <command>initdb</command>. By default, <command>initdb</command> initializes the database cluster with the locale setting of its execution environment, so if your system is already set to use the locale that you want in your database cluster, there is nothing else you need to do. If you want to use a different locale (or you are not sure which locale your system is set to), you can tell <command>initdb</command> exactly which locale to use by specifying the <option>--locale</option> option. 
For example: <screen> initdb --locale=sv_SE </screen> </para> <para> This example for Unix systems sets the locale to Swedish (<literal>sv</literal>) as spoken in Sweden (<literal>SE</literal>). Other possibilities might include <literal>en_US</literal> (U.S. English) and <literal>fr_CA</literal> (French Canadian). If more than one character set can be used for a locale, then the specification can take the form <replaceable>language_territory.codeset</replaceable>. For example, <literal>fr_BE.UTF-8</literal> represents the French language (<literal>fr</literal>) as spoken in Belgium (<literal>BE</literal>), with a <acronym>UTF-8</acronym> character set encoding. </para> <para> Which locales are available on your system, and under what names, depends on what was provided by the operating system vendor and what was installed. On most Unix systems, the command <literal>locale -a</literal> will provide a list of available locales. Windows uses more verbose locale names, such as <literal>German_Germany</literal> or <literal>Swedish_Sweden.1252</literal>, but the principles are the same. </para> <para> Occasionally it is useful to mix rules from several locales, e.g., use English collation rules but Spanish messages. To support that, a set of locale subcategories exists, each controlling only certain aspects of the localization rules: <informaltable> <tgroup cols="2"> <colspec colname="col1" colwidth="1*"/> <colspec colname="col2" colwidth="3*"/> <tbody> <row> <entry><envar>LC_COLLATE</envar></entry> <entry>String sort order</entry> </row> <row> <entry><envar>LC_CTYPE</envar></entry> <entry>Character classification (What is a letter? Its upper-case equivalent?)</entry>

This chapter describes PostgreSQL's localization features, including locale support and character set translation, allowing administrators to configure the database to respect cultural preferences for languages, sorting, and formatting.