Home Explore Blog CI



postgresql

9th chunk of `doc/src/sgml/pgstatstatements.sgml`
0dbff137a67ef1d079d39d3185517becf488af379fdc61070000000100000fa3
 <structfield>userid</structfield>) as a more stable
   and reliable identifier for each entry than its query text.
   However, it is important to understand that there are only limited
   guarantees around the stability of the <structfield>queryid</structfield> hash
   value.  Since the identifier is derived from the
   post-parse-analysis tree, its value is a function of, among other
   things, the internal object identifiers appearing in this representation.
   This has some counterintuitive implications.  For example,
   <filename>pg_stat_statements</filename> will consider two apparently-identical
   queries to be distinct, if they reference for example a function that was
   dropped and recreated between the executions of the two queries.
   Conversely, if a table is dropped and recreated between the
   executions of queries, two apparently-identical queries may be
   considered the same. However, if the alias for a table is different
   for otherwise-similar queries, these queries will be considered
   distinct.
   The hashing process is also sensitive to differences in
   machine architecture and other facets of the platform.
   Furthermore, it is not safe to assume that <structfield>queryid</structfield>
   will be stable across major versions of <productname>PostgreSQL</productname>.
  </para>

  <para>
   Two servers participating in replication based on physical WAL replay can
   be expected to have identical <structfield>queryid</structfield> values for
   the same query.  However, logical replication schemes do not promise to
   keep replicas identical in all relevant details, so
   <structfield>queryid</structfield> will not be a useful identifier for
   accumulating costs across a set of logical replicas.
   If in doubt, direct testing is recommended.
  </para>

  <para>
   Generally, it can be assumed that <structfield>queryid</structfield> values
   are stable between minor version releases of <productname>PostgreSQL</productname>,
   providing that instances are running on the same machine architecture and
   the catalog metadata details match.  Compatibility will only be broken
   between minor versions as a last resort.
  </para>

  <para>
   The parameter symbols used to replace constants in
   representative query texts start from the next number after the
   highest <literal>$</literal><replaceable>n</replaceable> parameter in the original query
   text, or <literal>$1</literal> if there was none.  It's worth noting that in
   some cases there may be hidden parameter symbols that affect this
   numbering.  For example, <application>PL/pgSQL</application> uses hidden parameter
   symbols to insert values of function local variables into queries, so that
   a <application>PL/pgSQL</application> statement like <literal>SELECT i + 1 INTO j</literal>
   would have representative text like <literal>SELECT i + $2</literal>.
  </para>

  <para>
   The representative query texts are kept in an external disk file, and do
   not consume shared memory.  Therefore, even very lengthy query texts can
   be stored successfully.  However, if many long query texts are
   accumulated, the external file might grow unmanageably large.  As a
   recovery method if that happens, <filename>pg_stat_statements</filename> may
   choose to discard the query texts, whereupon all existing entries in
   the <structname>pg_stat_statements</structname> view will show
   null <structfield>query</structfield> fields, though the statistics associated with
   each <structfield>queryid</structfield> are preserved.  If this happens, consider
   reducing <varname>pg_stat_statements.max</varname> to prevent
   recurrences.
  </para>

  <para>
   <structfield>plans</structfield> and <structfield>calls</structfield> aren't
   always expected to match because planning and execution statistics are
   updated at their respective end phase, and only for successful operations.
   For example, if a statement is successfully planned but fails during
   the

Title: pg_stat_statements Query Identification and Stability
Summary
The queryid hash value in pg_stat_statements has limited stability guarantees due to factors like internal object identifiers, machine architecture, and PostgreSQL version changes, and its use as an identifier has implications for replication, minor version releases, and query text representation, with considerations for parameter symbol replacement, disk storage, and statistic updating.