<sect2 id="textsearch-manipulate-tsvector">
<title>Manipulating Documents</title>
<para>
<xref linkend="textsearch-parsing-documents"/> showed how raw textual
documents can be converted into <type>tsvector</type> values.
<productname>PostgreSQL</productname> also provides functions and
operators that can be used to manipulate documents that are already
in <type>tsvector</type> form.
</para>
<variablelist>
<varlistentry>
<term>
<indexterm>
<primary>tsvector concatenation</primary>
</indexterm>
<literal><type>tsvector</type> || <type>tsvector</type></literal>
</term>
<listitem>
<para>
The <type>tsvector</type> concatenation operator
returns a vector that combines the lexemes and positional information
of the two vectors given as arguments. Positional information and weight
labels are retained during the concatenation.
Positions appearing in the right-hand vector are offset by the largest
position mentioned in the left-hand vector, so that the result is
nearly equivalent to the result of performing <function>to_tsvector</function>
on the concatenation of the two original document strings. (The
equivalence is not exact, because any stop words removed from the
end of the left-hand argument will not affect the result, whereas
they would have affected the positions of the lexemes in the
right-hand argument if textual concatenation were used.)
</para>
<para>
One advantage of using concatenation in the vector form, rather than
concatenating text before applying <function>to_tsvector</function>, is that
you can use different configurations to parse different sections
of the document. Also, because the <function>setweight</function> function
marks all lexemes of the given vector the same way, you must parse
each part of the text and apply <function>setweight</function> to it
before concatenating if you want to label different parts of the
document with different weights.
</para>
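<para>
As a small illustration of the position offset, consider two hand-written
vectors (the literal values are made up for this example). The largest
position in the left-hand vector is 2, so each position in the right-hand
vector is shifted by 2:
</para>
<screen>
SELECT 'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector;
         ?column?
---------------------------
 'a':1 'b':2,5 'c':3 'd':4
</screen>
<para>
The following sketch shows how per-section configurations and weights
might be combined. It assumes a hypothetical table
<literal>messages</literal> with text columns <literal>title</literal>
and <literal>body</literal>; the choice of configurations and weights
is only illustrative:
</para>
<programlisting>
-- "messages", "title", and "body" are hypothetical names used for illustration.
SELECT setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
       setweight(to_tsvector('simple', coalesce(body, '')), 'D')
FROM messages;
</programlisting>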
</listitem>
</varlistentry>
<varlistentry>
<term>
<indexterm>
<primary>setweight</primary>
</indexterm>
<literal>setweight(<replaceable class="parameter">vector</replaceable> <type>tsvector</type>, <replaceable class="parameter">weight</replaceable> <type>"char"</type>) returns <type>tsvector</type></literal>
</term>
<listitem>
<para>
<function>setweight</function> returns a copy of the input vector in which every
position has been labeled with the given <replaceable>weight</replaceable>, one of
<literal>A</literal>, <literal>B</literal>, <literal>C</literal>, or
<literal>D</literal>. (<literal>D</literal> is the default for new
vectors and as such is not displayed on output.) These labels are
retained when vectors are concatenated, allowing words from different
parts of a document to be weighted differently by ranking functions.
</para>
<para>
Note that weight labels apply to <emphasis>positions</emphasis>, not
<emphasis>lexemes</emphasis>. If the input vector has been stripped of
positions, then <function>setweight</function> has no effect, because
there are no positions to label.
</para>
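<para>
For example, using hand-written vectors chosen only for illustration,
every position receives the new label, replacing any label it had before:
</para>
<screen>
SELECT setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A');
           setweight
-------------------------------
 'cat':3A 'fat':2A,4A 'rat':5A
</screen>
<para>
A vector without positions comes back unchanged:
</para>
<screen>
SELECT setweight('fat cat'::tsvector, 'A');
  setweight
-------------
 'cat' 'fat'
</screen>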
</listitem>
</varlistentry>
<varlistentry>
<term>
<indexterm>
<primary>length(tsvector)</primary>
</indexterm>
<literal>length(<replaceable class="parameter">vector</replaceable> <type>tsvector</type>) returns <type>integer</type></literal>
</term>
<listitem>
<para>
Returns the number of lexemes stored in the vector. Each distinct
lexeme is counted once, regardless of how many positions it has.
</para>
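<para>
For instance, with an illustrative hand-written vector in which
<literal>fat</literal> occurs at two positions, the lexeme is still
counted only once:
</para>
<screen>
SELECT length('fat:2,4 cat:3 rat:5A'::tsvector);
 length
--------
      3
</screen>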
</listitem>
</varlistentry>
<varlistentry>
<term>
<indexterm>
<primary>strip</primary>
</indexterm>
<literal>strip(<replaceable class="parameter">vector</replaceable> <type>tsvector</type>) returns <type>tsvector</type></literal>