Обсуждение: V18 change on EXPLAIN ANALYZE
Robert Hass committed ddb17e387aa28d61521227377b00f997756b8a27, which changed how EXPLAIN indicates fractional rows. That's cool but I have to see that change on sources because it's not explained on DOCs.
One can understand that cost=4.36, that actual time=0.009, but what means rows=10.43 ? Costs and Time having decimals are fine but what is a row and a half ?
So, I don't know exactly how to explain this change but I think it would be good to have a new paragraph on [1] and explain that now when nloops > 1, we now display two digits after the decimal point, rather than none.
regards
Marcos
On Fri, Sep 26, 2025 at 1:34 PM Marcos Pegoraro <marcos@f10.com.br> wrote: > Robert Hass committed ddb17e387aa28d61521227377b00f997756b8a27, which changed how EXPLAIN indicates fractional rows. That'scool but I have to see that change on sources because it's not explained on DOCs. > > One can understand that cost=4.36, that actual time=0.009, but what means rows=10.43 ? Costs and Time having decimals arefine but what is a row and a half ? > > So, I don't know exactly how to explain this change but I think it would be good to have a new paragraph on [1] and explainthat now when nloops > 1, we now display two digits after the decimal point, rather than none. > > [1] - https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN-ANALYZE The page you link says In some query plans, it is possible for a subplan node to be executed more than once. For example, the inner index scan will be executed once per outer row in the above nested-loop plan. In such cases, the loops value reports the total number of executions of the node, and the actual time and rows values shown are averages per-execution. This is done to make the numbers comparable with the way that the cost estimates are shown. Multiply by the loops value to get the total time actually spent in the node. In the above example, we spent a total of 0.030 milliseconds executing the index scans on tenk2. in the second paragraph after the example in this section. Do you think that's not sufficiently clear? Thanks, Maciek
Maciek Sakrejda <m.sakrejda@gmail.com> writes: > The page you link says > In some query plans, it is possible for a subplan node to be > executed more than once. For example, the inner index scan will be > executed once per outer row in the above nested-loop plan. In such > cases, the loops value reports the total number of executions of the > node, and the actual time and rows values shown are averages > per-execution. This is done to make the numbers comparable with the > way that the cost estimates are shown. Multiply by the loops value to > get the total time actually spent in the node. In the above example, > we spent a total of 0.030 milliseconds executing the index scans on > tenk2. > in the second paragraph after the example in this section. Do you > think that's not sufficiently clear? It's not wrong, but it feels a little incomplete now. Maybe change the last two sentences to Multiply by the loops value to get the total time actually spent in the node and the total number of rows processed by the node across all executions. In the above example, we spent a total of 0.030 milliseconds executing the index scans on tenk2, and they handled a total of 10 rows. A bigger gap in perform.sgml is that it doesn't address parallel query cases at all AFAICS. I think that was one of the main drivers of this change, so it feels a little sad that it's not covered here. regards, tom lane
On Fri, Sep 26, 2025 at 2:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Maciek Sakrejda <m.sakrejda@gmail.com> writes: > > The page you link says > > > In some query plans, it is possible for a subplan node to be > > executed more than once. For example, the inner index scan will be > > executed once per outer row in the above nested-loop plan. In such > > cases, the loops value reports the total number of executions of the > > node, and the actual time and rows values shown are averages > > per-execution. This is done to make the numbers comparable with the > > way that the cost estimates are shown. Multiply by the loops value to > > get the total time actually spent in the node. In the above example, > > we spent a total of 0.030 milliseconds executing the index scans on > > tenk2. > > > in the second paragraph after the example in this section. Do you > > think that's not sufficiently clear? > > It's not wrong, but it feels a little incomplete now. Maybe change > the last two sentences to > > Multiply by the loops value to get the total time actually spent in > the node and the total number of rows processed by the node across all > executions. In the above example, we spent a total of 0.030 > milliseconds executing the index scans on tenk2, and they handled a > total of 10 rows. > > A bigger gap in perform.sgml is that it doesn't address parallel > query cases at all AFAICS. I think that was one of the main drivers > of this change, so it feels a little sad that it's not covered here. Fair point. I included your proposed change and took a stab at briefly covering parallelism in the attached (admittedly, my understanding of how that works is a little shaky, so apologies if I'm way off on some of this). However, to get a parallel query in the regression database (I chose EXPLAIN ANALYZE SELECT * FROM tenk2), I had to change some settings: SET min_parallel_table_scan_size = 0; SET parallel_tuple_cost = 0; SET parallel_setup_cost = 0; Should I mention that in the example? Or should I generate a bigger table so using these is not necessary? If we say nothing and use the example, I think it may be confusing if someone wants to use the example as a starting point for their own exploration of how this works. Or is there a better query that works out of the box and does not need changes to the settings? It also seems like the EXPLAIN ANALYZE section is getting a little unwieldy. Should we subdivide it, or is this still okay? Thanks, Maciek