Home Explore Blog CI



postgresql

4th chunk of `doc/src/sgml/citext.sgml`
58b5b9fa351fb7b5a02b7eda0f5e9daf5600ff502f76896f0000000100000cda
 database, users
      of one language may find their query results are not as expected if the
      collation is for another language.
     </para>
    </listitem>

    <listitem>
     <para>
      As of <productname>PostgreSQL</productname> 9.1, you can attach a
      <literal>COLLATE</literal> specification to <type>citext</type> columns or data
      values.  Currently, <type>citext</type> operators will honor a non-default
      <literal>COLLATE</literal> specification while comparing case-folded strings,
      but the initial folding to lower case is always done according to the
      database's <literal>LC_CTYPE</literal> setting (that is, as though
      <literal>COLLATE "default"</literal> were given).  This may be changed in a
      future release so that both steps follow the input <literal>COLLATE</literal>
      specification.
     </para>
    </listitem>

    <listitem>
     <para>
       <type>citext</type> is not as efficient as <type>text</type> because the
       operator functions and the B-tree comparison functions must make copies
       of the data and convert it to lower case for comparisons.  Also, only
       <type>text</type> can support B-Tree deduplication.  However,
       <type>citext</type> is slightly more efficient than using
       <function>lower</function> to get case-insensitive matching.
     </para>
    </listitem>

    <listitem>
     <para>
      <type>citext</type> doesn't help much if you need data to compare
      case-sensitively in some contexts and case-insensitively in other
      contexts.  The standard answer is to use the <type>text</type> type and
      manually use the <function>lower</function> function when you need to compare
      case-insensitively; this works all right if case-insensitive comparison
      is needed only infrequently.  If you need case-insensitive behavior most
      of the time and case-sensitive infrequently, consider storing the data
      as <type>citext</type> and explicitly casting the column to <type>text</type>
      when you want case-sensitive comparison.  In either situation, you will
      need two indexes if you want both types of searches to be fast.
    </para>
    </listitem>

    <listitem>
     <para>
      The schema containing the <type>citext</type> operators must be
      in the current <varname>search_path</varname> (typically <literal>public</literal>);
      if it is not, the normal case-sensitive <type>text</type> operators
      will be invoked instead.
    </para>
    </listitem>

    <listitem>
     <para>
      The approach of lower-casing strings for comparison does not handle some
      Unicode special cases correctly, for example when one upper-case letter
      has two lower-case letter equivalents.  Unicode distinguishes between
      <firstterm>case mapping</firstterm> and <firstterm>case
      folding</firstterm> for this reason.  Use nondeterministic collations
      instead of <type>citext</type> to handle that correctly.
     </para>
    </listitem>
   </itemizedlist>
 </sect2>

 <sect2 id="citext-author">
  <title>Author</title>

  <para>
   David E. Wheeler <email>david@kineticode.com</email>
  </para>

  <para>
    Inspired by the original <type>citext</type> module by Donald Fraser.
  </para>

 </sect2>

</sect1>

Title: citext Limitations and Usage Considerations
Summary
The citext data type has several limitations, including efficiency issues, lack of support for Unicode special cases, and the need for careful consideration of case-sensitive and case-insensitive comparisons, and its usage may require additional indexes and manual casting to achieve desired behavior.