postgresql

19th chunk of `doc/src/sgml/textsearch.sgml`
 provide an effective defense against attacks such as cross-site
      scripting (XSS) when working with untrusted input.  To guard
      against such attacks, all HTML markup should be removed from the input
      document, or an HTML sanitizer should be used on the output.
     </para>
    </warning>

    These option names are recognized case-insensitively.
    You must double-quote string values if they contain spaces or commas.
   </para>
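
   <para>
    As a minimal sketch of these two rules, the options string below spells
    the option names in mixed case and double-quotes the two selector values
    because they contain spaces.  The marker strings <literal>[[ </literal>
    and <literal> ]]</literal> and the sample sentence are arbitrary choices
    made for illustration:

<programlisting>
SELECT ts_headline('english',
  'Search terms may occur many times in a document',
  to_tsquery('english', 'search &amp; term'),
  'startsel="[[ ", STOPSEL=" ]]", MaxWords=10, MinWords=5');
</programlisting>
   </para>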

   <para>
    In non-fragment-based headline
    generation, <function>ts_headline</function> locates matches for the
    given <replaceable class="parameter">query</replaceable> and chooses a
    single one to display, preferring matches that have more query words
    within the allowed headline length.
    In fragment-based headline generation, <function>ts_headline</function>
    locates the query matches and splits each match
    into <quote>fragments</quote> of no more than <literal>MaxWords</literal>
    words each, preferring fragments with more query words, and when
    possible <quote>stretching</quote> fragments to include surrounding
    words.  The fragment-based mode is thus more useful when the query
    matches span large sections of the document, or when it's desirable to
    display multiple matches.
    In either mode, if no query matches can be identified, then a single
    fragment of the first <literal>MinWords</literal> words in the document
    will be displayed.
   </para>

   <para>
    For example:

<screen>
SELECT ts_headline('english',
  'The most common type of search
is to find all documents containing given query terms
and return them in order of their similarity to the
query.',
  to_tsquery('english', 'query &amp; similarity'));
                        ts_headline
------------------------------------------------------------
 containing given &lt;b&gt;query&lt;/b&gt; terms                       +
 and return them in order of their &lt;b&gt;similarity&lt;/b&gt; to the+
 &lt;b&gt;query&lt;/b&gt;.

SELECT ts_headline('english',
  'Search terms may occur
many times in a document,
requiring ranking of the search matches to decide which
occurrences to display in the result.',
  to_tsquery('english', 'search &amp; term'),
  'MaxFragments=10, MaxWords=7, MinWords=3, StartSel=&lt;&lt;, StopSel=&gt;&gt;');
                        ts_headline
------------------------------------------------------------
 &lt;&lt;Search&gt;&gt; &lt;&lt;terms&gt;&gt; may occur                            +
 many times ... ranking of the &lt;&lt;search&gt;&gt; matches to decide
</screen>
   </para>

   <para>
    <function>ts_headline</function> uses the original document, not a
    <type>tsvector</type> summary, so it can be slow and should be used with
    care.
   </para>
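
   <para>
    One way to limit this cost, sketched here under stated assumptions, is
    to rank and trim the result set in a subquery first, so that
    <function>ts_headline</function> is computed only for the rows that are
    actually displayed.  The table <literal>docs</literal> and its columns
    <literal>id</literal>, <literal>body</literal> (the original text) and
    <literal>body_tsv</literal> (a stored <type>tsvector</type>) are
    hypothetical names used only for illustration:

<programlisting>
-- Rank and limit first; ts_headline then runs on at most ten rows.
SELECT id, ts_headline('english', body, q) AS headline, rank
FROM (SELECT id, body, q, ts_rank_cd(body_tsv, q) AS rank
      FROM docs, to_tsquery('english', 'search &amp; term') AS q
      WHERE body_tsv @@ q
      ORDER BY rank DESC
      LIMIT 10) AS top_matches
ORDER BY rank DESC;
</programlisting>
   </para>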

  </sect2>

 </sect1>

 <sect1 id="textsearch-features">
  <title>Additional Features</title>

  <para>
   This section describes additional functions and operators that are
   useful in connection with text search.
  </para>

  <sect2 id="textsearch-manipulate-tsvector">
   <title>Manipulating Documents</title>

   <para>
    <xref linkend="textsearch-parsing-documents"/> showed how raw textual
    documents can be converted into <type>tsvector</type> values.
    <productname>PostgreSQL</productname> also provides functions and
    operators that can be used to manipulate documents that are already
    in <type>tsvector</type> form.
   </para>

   <variablelist>

    <varlistentry>

     <term>
     <indexterm>
      <primary>tsvector concatenation</primary>
     </indexterm>

      <literal><type>tsvector</type> || <type>tsvector</type></literal>
     </term>

     <listitem>
      <para>
       The <type>tsvector</type> concatenation operator
       returns a vector which combines the lexemes and positional information
       of the two vectors given as arguments.  Positions and weight labels
       are retained during the concatenation.
       Positions appearing in the right-hand vector are offset by the

Title: ts_headline Generation and Additional Text Search Features
Summary
The ts_headline function generates headlines in either non-fragment-based or fragment-based mode, in both cases preferring matches with more query words. The section includes example usages of ts_headline and notes that the function can be slow because it works on the original document rather than a tsvector summary. PostgreSQL also offers functions and operators for manipulating documents already in tsvector form, including tsvector concatenation, which combines the lexemes and positional information of two vectors.