Using the xpath_table Function

</table> <para> These parameters (except the XPath strings) are just substituted into a plain SQL SELECT statement, so you have some flexibility — the statement is </para> <para> <literal> SELECT <key>, <document> FROM <relation> WHERE <criteria> </literal> </para> <para> so those parameters can be <emphasis>anything</emphasis> valid in those particular locations. The result from this SELECT needs to return exactly two columns (which it will unless you try to list multiple fields for key or document). Beware that this simplistic approach requires that you validate any user-supplied values to avoid SQL injection attacks. </para> <para> The function has to be used in a <literal>FROM</literal> expression, with an <literal>AS</literal> clause to specify the output columns; for example <programlisting> SELECT * FROM xpath_table('article_id', 'article_xml', 'articles', '/article/author|/article/pages|/article/title', 'date_entered > ''2003-01-01'' ') AS t(article_id integer, author text, page_count integer, title text); </programlisting> The <literal>AS</literal> clause defines the names and types of the columns in the output table. The first is the <quote>key</quote> field and the rest correspond to the XPath queries. If there are more XPath queries than result columns, the extra queries will be ignored. If there are more result columns than XPath queries, the extra columns will be NULL. </para> <para> Notice that this example defines the <structname>page_count</structname> result column as an integer. The function deals internally with string representations, so when you say you want an integer in the output, it will take the string representation of the XPath result and use PostgreSQL input functions to transform it into an integer (or whatever type the <type>AS</type> clause requests). An error will result if it can't do this — for example if the result is empty — so you may wish to just stick to <type>text</type> as the column type if you think your data has any problems. </para> <para> The calling <command>SELECT</command> statement doesn't necessarily have to be just <literal>SELECT *</literal> — it can reference the output columns by name or join them to other tables. The function produces a virtual table with which you can perform any operation you wish (e.g., aggregation, joining, sorting etc.). So we could also have: <programlisting> SELECT t.title, p.fullname, p.email FROM xpath_table('article_id', 'article_xml', 'articles', '/article/title|/article/author/@id', 'xpath_string(article_xml,''/article/@date'') > ''2003-03-20'' ') AS t(article_id integer, title text, author_id integer), tblPeopleInfo AS p WHERE t.author_id = p.person_id; </programlisting> as a more complicated example. Of course, you could wrap all of this in a view for convenience. </para> <sect3 id="xml2-xpath-table-multivalued-results"> <title>Multivalued Results</title> <para> The <function>xpath_table</function> function assumes that the results of each XPath query might be multivalued, so the number of rows returned by the function may not be the same as the number of input documents. The first row returned contains the first result from each query, the second row the second result from each query. If one of the queries has fewer values than the others,

The xpath_table function is used in a FROM expression with an AS clause to specify output columns, and it can handle multivalued results from XPath queries, allowing for flexible querying of XML documents in a relational database, with support for various data types and operations such as aggregation, joining, and sorting.