Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .
Дата
Msg-id CAA4eK1L4fTEebm5dvVw3G04Ksv-ya+QinnuuC7+97Q6xa7ASFQ@mail.gmail.com
обсуждение исходный текст
Ответ на Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .  (Masahiko Sawada <sawada.mshk@gmail.com>)
Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .  (Satoshi Nagayasu <snaga@uptime.jp>)
Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Список pgsql-hackers
On Mon, Jun 20, 2016 at 11:48 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi all,
>
> My colleague noticed that the output of EXPLAIN ANALYZE doesn't work
> fine for parallel seq scan.
>
> postgres(1)=# explain analyze verbose select count(*) from pgbench_accounts ;
>                                                     QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>  Finalize Aggregate  (cost=217018.55..217018.56 rows=1 width=8)
> (actual time=2640.015..2640.015 rows=1 loops=1)
>    Output: count(*)
>    ->  Gather  (cost=217018.33..217018.54 rows=2 width=8) (actual
> time=2639.064..2640.002 rows=3 loops=1)
>          Output: (PARTIAL count(*))
>          Workers Planned: 2
>          Workers Launched: 2
>          ->  Partial Aggregate  (cost=216018.33..216018.34 rows=1
> width=8) (actual time=2632.714..2632.715 rows=1 loops=3)
>                Output: PARTIAL count(*)
>                Worker 0: actual time=2632.583..2632.584 rows=1 loops=1
>                Worker 1: actual time=2627.517..2627.517 rows=1 loops=1
>                ->  Parallel Seq Scan on public.pgbench_accounts
> (cost=0.00..205601.67 rows=4166667 width=0) (actual
> time=0.042..1685.542 rows=3333333 loops=3)
>                      Worker 0: actual time=0.033..1657.486 rows=3457968 loops=1
>                      Worker 1: actual time=0.039..1702.979 rows=3741069 loops=1
>  Planning time: 1.026 ms
>  Execution time: 2640.225 ms
> (15 rows)
>
> For example, the above result shows,
> Parallel Seq Scan : actual rows = 3333333
> worker 0               : actual rows = 3457968
> worker 1               : actual rows = 3741069
> Summation of these is 10532370, but actual total rows is 10000000.
> I think that Parallel Seq Scan should show actual rows =
> 10000000(total rows) or actual rows = 2800963(rows collected by
> itself). (10000000 maybe better)
>

You have to read the rows at Parallel Seq Scan nodes as total count of rows, but you have to consider the loops parameter as well. 
 
>
> After spent time to investigate this behaviour, ISTM that the problem
> is nloops of Parallel Seq Scan.
> Parallel Seq Scan is done only once, but nloops is incremented to 3.
>

nloops here indicates, that it is done for 2 workers and a master backend.

> So its "actual rows" is calculated 3333333(10000000 / 3) at explain.c:L1223.
>

Thats how it should be considered.  You might want to compare the behaviour with other cases where value of nloops is used.

> Is it a bug?
>

I don't think so.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Re: Actuall row count of Parallel Seq Scan in EXPLAIN ANALYZE .