citext: A Case-Insensitive Character String Type

<sect1 id="citext" xreflabel="citext"> <title>citext — a case-insensitive character string type</title> <indexterm zone="citext"> <primary>citext</primary> </indexterm> <para> The <filename>citext</filename> module provides a case-insensitive character string type, <type>citext</type>. Essentially, it internally calls <function>lower</function> when comparing values. Otherwise, it behaves almost exactly like <type>text</type>. </para> <tip> <para> Consider using <firstterm>nondeterministic collations</firstterm> (see <xref linkend="collation-nondeterministic"/>) instead of this module. They can be used for case-insensitive comparisons, accent-insensitive comparisons, and other combinations, and they handle more Unicode special cases correctly. </para> </tip> <para> This module is considered <quote>trusted</quote>, that is, it can be installed by non-superusers who have <literal>CREATE</literal> privilege on the current database. </para> <sect2 id="citext-rationale"> <title>Rationale</title> <para> The standard approach to doing case-insensitive matches in <productname>PostgreSQL</productname> has been to use the <function>lower</function> function when comparing values, for example <programlisting> SELECT * FROM tab WHERE lower(col) = LOWER(?); </programlisting> </para> <para> This works reasonably well, but has a number of drawbacks: </para> <itemizedlist> <listitem> <para> It makes your SQL statements verbose, and you always have to remember to use <function>lower</function> on both the column and the query value. </para> </listitem> <listitem> <para> It won't use an index, unless you create a functional index using <function>lower</function>. </para> </listitem> <listitem> <para> If you declare a column as <literal>UNIQUE</literal> or <literal>PRIMARY KEY</literal>, the implicitly generated index is case-sensitive. So it's useless for case-insensitive searches, and it won't enforce uniqueness case-insensitively. </para> </listitem> </itemizedlist> <para> The <type>citext</type> data type allows you to eliminate calls to <function>lower</function> in SQL queries, and allows a primary key to be case-insensitive. <type>citext</type> is locale-aware, just like <type>text</type>, which means that the matching of upper case and lower case characters is dependent on the rules of the database's <literal>LC_CTYPE</literal> setting. Again, this behavior is identical to the use of <function>lower</function> in queries. But because it's done transparently by the data type, you don't have to remember to do anything special in your queries. </para> </sect2> <sect2 id="citext-how-to-use-it"> <title>How to Use It</title> <para> Here's a simple example of usage: <programlisting> CREATE TABLE users ( nick CITEXT PRIMARY KEY, pass TEXT NOT NULL ); INSERT INTO users VALUES ( 'larry', sha256(random()::text::bytea) ); INSERT INTO users VALUES ( 'Tom', sha256(random()::text::bytea) ); INSERT INTO users VALUES ( 'Damian', sha256(random()::text::bytea) ); INSERT INTO users VALUES ( 'NEAL', sha256(random()::text::bytea) ); INSERT INTO users VALUES ( 'Bjørn', sha256(random()::text::bytea) ); SELECT * FROM users WHERE nick = 'Larry'; </programlisting> The <command>SELECT</command> statement will return one tuple, even though the <structfield>nick</structfield> column was set to <literal>larry</literal> and the query was for <literal>Larry</literal>. </para> </sect2> <sect2 id="citext-string-comparison-behavior"> <title>String Comparison Behavior</title> <para> <type>citext</type> performs comparisons by converting each string to lower case (as though <function>lower</function> were called) and then comparing the results

The citext module provides a case-insensitive character string type for PostgreSQL, allowing for efficient and transparent case-insensitive comparisons, and is considered a trusted module that can be installed by non-superusers with CREATE privilege.