</para>
<para>
Note that different units are used for the per loop time than for the
histogram. The loop can have resolution within a few nanoseconds (ns),
while the individual timing calls can only resolve down to one microsecond
(us).
</para>
</refsect2>
<refsect2>
<title>Measuring Executor Timing Overhead</title>
<para>
When the query executor is running a statement using
<command>EXPLAIN ANALYZE</command>, individual operations are timed in
addition to the summary being shown. The timing overhead on your system
can be checked by counting rows with the <application>psql</application>
program:
<screen>
CREATE TABLE t AS SELECT * FROM generate_series(1,100000);
\timing
SELECT COUNT(*) FROM t;
EXPLAIN ANALYZE SELECT COUNT(*) FROM t;
</screen>
</para>
<para>
The i7-860 system measured above runs the count query in 9.8 ms while
the <command>EXPLAIN ANALYZE</command> version takes 16.6 ms, each
processing just over 100,000 rows. That 6.8 ms difference means the timing
overhead per row is 68 ns, about twice what pg_test_timing estimated it
would be. Even that relatively small amount of overhead makes the fully
timed count statement take almost 70% longer. On more substantial queries,
the timing overhead would be less problematic.
</para>
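<para>
As a quick sketch of the arithmetic behind that figure, the per-row
overhead can be computed directly in <application>psql</application>
(the 16.6 ms, 9.8 ms and 100,000 row values are the ones measured on this
particular system, so substitute your own results):
<screen>
-- (timed run - plain run) in ms, scaled to ns and divided by the row count
SELECT round((16.6 - 9.8) * 1000000 / 100000) AS overhead_ns_per_row;
</screen>
</para>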
</refsect2>
<refsect2>
<title>Changing Time Sources</title>
<para>
On some newer Linux systems, it's possible to change the clock source used
to collect timing data at any time. A second example shows the slowdown
possible from switching to the slower acpi_pm time source, on the same
system used for the fast results above:
<screen><![CDATA[
# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm
# echo acpi_pm > /sys/devices/system/clocksource/clocksource0/current_clocksource
# pg_test_timing
Per loop time including overhead: 722.92 ns
Histogram of timing durations:
    < us   % of total      count
       1      27.84870   1155682
       2      72.05956   2990371
       4       0.07810      3241
       8       0.01357       563
      16       0.00007         3
]]></screen>
</para>
<para>
In this configuration, the sample <command>EXPLAIN ANALYZE</command> above
takes 115.9 ms. That's 1061 ns of timing overhead per row, again a small
multiple of what's measured directly by this utility. That much timing
overhead means the actual query itself takes only a tiny fraction of the
accounted-for time; most of it is consumed by overhead instead. With a
clock source this slow, any <command>EXPLAIN ANALYZE</command> totals
involving many timed operations would be inflated significantly by timing
overhead.
</para>
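<para>
If you experiment with clock sources this way, remember to switch back
afterwards; a minimal sketch, assuming <literal>tsc</literal> was the
original (and fastest) source as in the listing above:
<screen><![CDATA[
# echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
]]></screen>
</para>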
<para>
FreeBSD also allows changing the time source on the fly, and it logs
information about the timer selected during boot:
<screen>
# dmesg | grep "Timecounter"
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
Timecounter "i8254" frequency 1193182 Hz quality 0
Timecounters tick every 10.000 msec
Timecounter "TSC" frequency 2531787134 Hz quality 800
# sysctl kern.timecounter.hardware=TSC
kern.timecounter.hardware: ACPI-fast -> TSC
</screen>
</para>
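<para>
Instead of searching the boot log, the set of available timecounters and
the one currently in use can also be queried at run time; a sketch,
assuming the standard <literal>kern.timecounter</literal> sysctl tree is
present:
<screen>
# sysctl kern.timecounter.choice
# sysctl kern.timecounter.hardware
</screen>
</para>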
<para>
Other systems may only allow setting the time source on boot. On older
Linux systems the "clock" kernel setting is the only way to make this sort
of change. And even on some more recent ones, the only option you'll see
for a clock source is "jiffies". Jiffies are the older Linux software clock
implementation, which can have good resolution when it's backed by fast
enough timing hardware, as in this example:
<screen><![CDATA[
$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
jiffies
$ dmesg | grep time.c
time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
time.c: Detected