pg_stat_statements Query Identification and Stability

<structfield>userid</structfield>) as a more stable and reliable identifier for each entry than its query text. However, it is important to understand that there are only limited guarantees around the stability of the <structfield>queryid</structfield> hash value. Since the identifier is derived from the post-parse-analysis tree, its value is a function of, among other things, the internal object identifiers appearing in this representation. This has some counterintuitive implications. For example, <filename>pg_stat_statements</filename> will consider two apparently-identical queries to be distinct, if they reference for example a function that was dropped and recreated between the executions of the two queries. Conversely, if a table is dropped and recreated between the executions of queries, two apparently-identical queries may be considered the same. However, if the alias for a table is different for otherwise-similar queries, these queries will be considered distinct. The hashing process is also sensitive to differences in machine architecture and other facets of the platform. Furthermore, it is not safe to assume that <structfield>queryid</structfield> will be stable across major versions of <productname>PostgreSQL</productname>. </para> <para> Two servers participating in replication based on physical WAL replay can be expected to have identical <structfield>queryid</structfield> values for the same query. However, logical replication schemes do not promise to keep replicas identical in all relevant details, so <structfield>queryid</structfield> will not be a useful identifier for accumulating costs across a set of logical replicas. If in doubt, direct testing is recommended. </para> <para> Generally, it can be assumed that <structfield>queryid</structfield> values are stable between minor version releases of <productname>PostgreSQL</productname>, providing that instances are running on the same machine architecture and the catalog metadata details match. Compatibility will only be broken between minor versions as a last resort. </para> <para> The parameter symbols used to replace constants in representative query texts start from the next number after the highest <literal>$</literal><replaceable>n</replaceable> parameter in the original query text, or <literal>$1</literal> if there was none. It's worth noting that in some cases there may be hidden parameter symbols that affect this numbering. For example, <application>PL/pgSQL</application> uses hidden parameter symbols to insert values of function local variables into queries, so that a <application>PL/pgSQL</application> statement like <literal>SELECT i + 1 INTO j</literal> would have representative text like <literal>SELECT i + $2</literal>. </para> <para> The representative query texts are kept in an external disk file, and do not consume shared memory. Therefore, even very lengthy query texts can be stored successfully. However, if many long query texts are accumulated, the external file might grow unmanageably large. As a recovery method if that happens, <filename>pg_stat_statements</filename> may choose to discard the query texts, whereupon all existing entries in the <structname>pg_stat_statements</structname> view will show null <structfield>query</structfield> fields, though the statistics associated with each <structfield>queryid</structfield> are preserved. If this happens, consider reducing <varname>pg_stat_statements.max</varname> to prevent recurrences. </para> <para> <structfield>plans</structfield> and <structfield>calls</structfield> aren't always expected to match because planning and execution statistics are updated at their respective end phase, and only for successful operations. For example, if a statement is successfully planned but fails during the

The queryid hash value in pg_stat_statements has limited stability guarantees due to factors like internal object identifiers, machine architecture, and PostgreSQL version changes, and its use as an identifier has implications for replication, minor version releases, and query text representation, with considerations for parameter symbol replacement, disk storage, and statistic updating.