function
<function>xmlparse</function>:<indexterm><primary>xmlparse</primary></indexterm>
<synopsis>
XMLPARSE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable>)
</synopsis>
Examples:
<programlisting><![CDATA[
XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter></book>')
XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')
]]></programlisting>
While this is the only way to convert character strings into XML
values according to the SQL standard, the PostgreSQL-specific
syntaxes:
<programlisting><![CDATA[
xml '<foo>bar</foo>'
'<foo>bar</foo>'::xml
]]></programlisting>
can also be used.
</para>
<para>
The <type>xml</type> type does not validate input values
against a document type declaration
(DTD),<indexterm><primary>DTD</primary></indexterm>
even when the input value specifies a DTD.
There is also currently no built-in support for validating against
other XML schema languages such as XML Schema.
</para>
<para>
The inverse operation, producing a character string value from
<type>xml</type>, uses the function
<function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm>
<synopsis>
XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> [ [ NO ] INDENT ] )
</synopsis>
<replaceable>type</replaceable> can be
<type>character</type>, <type>character varying</type>, or
<type>text</type> (or an alias for one of those). Again, according
to the SQL standard, this is the only way to convert between type
<type>xml</type> and character types, but PostgreSQL also allows
you to simply cast the value.
</para>
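<para>
As a brief illustration (the literal value below is arbitrary and
chosen only for this sketch), the following two queries should
produce the same character string, the first using the standard
syntax and the second using a PostgreSQL cast:
<programlisting><![CDATA[
-- SQL-standard serialization of an xml value to text
SELECT XMLSERIALIZE(CONTENT XMLPARSE(CONTENT 'abc<foo>bar</foo>') AS text);

-- PostgreSQL-specific alternative: a plain cast to a character type
SELECT CAST(XMLPARSE(CONTENT 'abc<foo>bar</foo>') AS text);
]]></programlisting>
</para>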
<para>
The <literal>INDENT</literal> option causes the result to be
pretty-printed, while <literal>NO INDENT</literal> (which is the
default) just emits the original input string. Casting to a character
type likewise produces the original string.
</para>
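<para>
For instance, assuming a server version that accepts the
<literal>INDENT</literal> option, the two forms can be compared
side by side. The sample document is made up for this sketch:
<programlisting><![CDATA[
-- Pretty-printed result, with nested elements on separate lines
SELECT XMLSERIALIZE(DOCUMENT XMLPARSE(DOCUMENT '<book><title>Manual</title></book>') AS text INDENT);

-- Default behavior, spelled out explicitly: the string is emitted as given
SELECT XMLSERIALIZE(DOCUMENT XMLPARSE(DOCUMENT '<book><title>Manual</title></book>') AS text NO INDENT);
]]></programlisting>
</para>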
<para>
When a character string value is cast to or from type
<type>xml</type> without going through <function>XMLPARSE</function> or
<function>XMLSERIALIZE</function>, respectively, the choice of
<literal>DOCUMENT</literal> versus <literal>CONTENT</literal> is
determined by the <quote>XML option</quote>
<indexterm><primary>XML option</primary></indexterm>
session configuration parameter, which can be set using the
standard command:
<synopsis>
SET XML OPTION { DOCUMENT | CONTENT };
</synopsis>
or the more PostgreSQL-like syntax
<synopsis>
SET xmloption TO { DOCUMENT | CONTENT };
</synopsis>
The default is <literal>CONTENT</literal>, so all forms of XML
data are allowed.
</para>
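<para>
The effect of this setting on casts can be sketched as follows.
The sample strings are arbitrary, and the comments describe the
behavior expected from the rules above rather than verbatim
server output:
<programlisting><![CDATA[
SET xmloption TO DOCUMENT;
SELECT 'abc<foo>bar</foo>'::xml;   -- should be rejected: not a well-formed document
SELECT '<foo>bar</foo>'::xml;      -- accepted: a single root element

RESET xmloption;                   -- back to the default, normally CONTENT
SELECT 'abc<foo>bar</foo>'::xml;   -- accepted as a content fragment
]]></programlisting>
</para>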
</sect2>
<sect2 id="datatype-xml-encoding-handling">
<title>Encoding Handling</title>
<para>
Care must be taken when dealing with multiple character encodings
on the client, on the server, and in the XML data passed through them.
When using the text mode to pass queries to the server and query
results to the client (which is the normal mode), PostgreSQL
converts all character data passed between the client and the
server, in either direction, to the character encoding of the
receiving end; see <xref linkend="multibyte"/>. This includes string
representations of XML values, such as in the above examples.
This would ordinarily mean that encoding declarations contained in
XML data can become invalid as the character data is converted
to other encodings while traveling between client and server,
because the embedded encoding declaration is not changed. To cope
with this behavior, encoding declarations contained in
character strings presented for input to the <type>xml</type> type
are <emphasis>ignored</emphasis>, and content is assumed
to be in the current server encoding. Consequently, for correct
processing, character strings of XML data must be sent
from the client