citext Limitations and Usage Considerations

database, users of one language may find their query results are not as expected if the collation is for another language. </para> </listitem> <listitem> <para> As of <productname>PostgreSQL</productname> 9.1, you can attach a <literal>COLLATE</literal> specification to <type>citext</type> columns or data values. Currently, <type>citext</type> operators will honor a non-default <literal>COLLATE</literal> specification while comparing case-folded strings, but the initial folding to lower case is always done according to the database's <literal>LC_CTYPE</literal> setting (that is, as though <literal>COLLATE "default"</literal> were given). This may be changed in a future release so that both steps follow the input <literal>COLLATE</literal> specification. </para> </listitem> <listitem> <para> <type>citext</type> is not as efficient as <type>text</type> because the operator functions and the B-tree comparison functions must make copies of the data and convert it to lower case for comparisons. Also, only <type>text</type> can support B-Tree deduplication. However, <type>citext</type> is slightly more efficient than using <function>lower</function> to get case-insensitive matching. </para> </listitem> <listitem> <para> <type>citext</type> doesn't help much if you need data to compare case-sensitively in some contexts and case-insensitively in other contexts. The standard answer is to use the <type>text</type> type and manually use the <function>lower</function> function when you need to compare case-insensitively; this works all right if case-insensitive comparison is needed only infrequently. If you need case-insensitive behavior most of the time and case-sensitive infrequently, consider storing the data as <type>citext</type> and explicitly casting the column to <type>text</type> when you want case-sensitive comparison. In either situation, you will need two indexes if you want both types of searches to be fast. </para> </listitem> <listitem> <para> The schema containing the <type>citext</type> operators must be in the current <varname>search_path</varname> (typically <literal>public</literal>); if it is not, the normal case-sensitive <type>text</type> operators will be invoked instead. </para> </listitem> <listitem> <para> The approach of lower-casing strings for comparison does not handle some Unicode special cases correctly, for example when one upper-case letter has two lower-case letter equivalents. Unicode distinguishes between <firstterm>case mapping</firstterm> and <firstterm>case folding</firstterm> for this reason. Use nondeterministic collations instead of <type>citext</type> to handle that correctly. </para> </listitem> </itemizedlist> </sect2> <sect2 id="citext-author"> <title>Author</title> <para> David E. Wheeler <email>david@kineticode.com</email> </para> <para> Inspired by the original <type>citext</type> module by Donald Fraser. </para> </sect2> </sect1>

The citext data type has several limitations, including efficiency issues, lack of support for Unicode special cases, and the need for careful consideration of case-sensitive and case-insensitive comparisons, and its usage may require additional indexes and manual casting to achieve desired behavior.