BUG #15707: Incorrect timing information in EXPLAIN ANALYZE output
От | PG Bug reporting form |
---|---|
Тема | BUG #15707: Incorrect timing information in EXPLAIN ANALYZE output |
Дата | |
Msg-id | 15707-caa4da87391c00ab@postgresql.org обсуждение исходный текст |
Список | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 15707 Logged by: David Geier Email address: david@swarm64.com PostgreSQL version: 11.2 Operating system: Ubuntu 18.04 Description: Hi all, While looking carefully at the EXPLAIN ANALYZE output we came across numbers for "actual total time" (the second number of the "actual time" range) which we believe don't add up. The instrumentation code starts measuring execution time of a plan node before the corresponding ExecProcNode() function pointer is invoked and stops after it has completed. This happens recursively on the query plan while pulling up tuples Volcano-style. Accordingly, the "actual total time" of any plan node can never be smaller than the sum of "actual total time"s of all child plan nodes. If you will, the "actual total time" of a plan node is its inclusive execution time. The difference with its child plan nodes is its exclusive execution time. In the following plan you can see that this doesn't hold even for a simple query like EXPLAIN ANALYZE SELECT COUNT(*) FROM orders WHERE o_orderkey < 10; Finalize Aggregate (cost=184906.47..184906.48 rows=1 width=8) (actual time=283.924..283.925 rows=1 loops=1) <-- actual total time = 283.925ms -> Gather (cost=184906.05..184906.46 rows=4 width=8) (actual time=156.571..308.595 rows=4 loops=1) <-- actual total time = 308.595ms Workers Planned: 4 Workers Launched: 4 -> Partial Aggregate (cost=184406.05..184406.06 rows=1 width=8) (actual time=181.446..181.446 rows=1 loops=4) -> Parallel Foreign Scan on orders (cost=500.00..184406.00 rows=19 width=0) (actual time=147.988..181.438 rows=2 loops=4) Rows in table: 15000000 Rows in snapshot: 15000000 Cluster sets in snapshot: 4458 Rows returned: 7 Planning Time: 2.565 ms Execution Time: 308.833 ms Looking through the code we came across ExecShutdownNode(), which uses the instrumentation functions to include shutdown time into the total execution time. Though the big difference to the use in ExecProcNode() is that ExecShutdownNode() only invokes InstrStartNode() AFTER it has already recursed into its children whereas ExecProcNode() calls it before. This results in the measured shutdown time for a plan node being exclusive instead of inclusive; but its still being added to the timing information obtained from executing ExecProcNode(). We believe a fix to that problem is to move if (node->instrument && node->instrument->running) InstrStartNode(node->instrument); above the recursion planstate_tree_walker(node, ExecShutdownNode, NULL); Regards, David
В списке pgsql-bugs по дате отправления: