Re: [PATCH] Add extra statistics to explain for Nested Loop

From Yugo NAGATA
Subject Re: [PATCH] Add extra statistics to explain for Nested Loop
Msg-id 20210201221315.06393d58e8205bc18bd0b84b@sraoss.co.jp
In reply to Re: [PATCH] Add extra statistics to explain for Nested Loop  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-hackers
On Mon, 1 Feb 2021 13:28:45 +0800
Julien Rouhaud <rjuju123@gmail.com> wrote:

> On Thu, Jan 28, 2021 at 8:38 PM Yugo NAGATA <nagata@sraoss.co.jp> wrote:
> >
> > postgres=# explain (analyze, verbose) select * from a,b where a.i=b.j;
> >                                                                                 QUERY PLAN
> > ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >  Nested Loop  (cost=0.00..2752.00 rows=991 width=8) (actual time=0.021..17.651 rows=991 loops=1)
> >    Output: a.i, b.j
> >    Join Filter: (a.i = b.j)
> >    Rows Removed by Join Filter: 99009
> >    ->  Seq Scan on public.b  (cost=0.00..2.00 rows=100 width=4) (actual time=0.009..0.023 rows=100 loops=1)
> >          Output: b.j
> >    ->  Seq Scan on public.a  (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 min_time=0.065 max_time=0.163 min_rows=1000 rows=1000 max_rows=1000 loops=100)
> >          Output: a.i
> >  Planning Time: 0.066 ms
> >  Execution Time: 17.719 ms
> > (10 rows)
> >
> > I don't like this format, where the extra statistics appear on the same
> > line as the existing information, because the output format then differs
> > depending on whether the plan node's loops > 1 or not. It also makes the
> > line too long. Also, other information reported by VERBOSE doesn't change
> > the existing row format and just adds extra rows for the new information.
> >
> > Instead, it seems good to me to add extra rows for the new statistics
> > without changing the existing row format, as other VERBOSE information
> > does, like below.
> >
> >    ->  Seq Scan on public.a  (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 rows=1000  loops=100)
> >          Output: a.i
> >          Min Time: 0.065 ms
> >          Max Time: 0.163 ms
> >          Min Rows: 1000
> >          Max Rows: 1000
> >
> > or, like Buffers,
> >
> >    ->  Seq Scan on public.a  (cost=0.00..15.00 rows=1000 width=4) (actual time=0.005..0.091 rows=1000  loops=100)
> >          Output: a.i
> >          Loops: min_time=0.065 max_time=0.163 min_rows=1000 max_rows=1000
> >
> > and so on. What do you think about it?
> 
> It's true that the current output is a bit long, which isn't really
> convenient to read.  Using one of those alternative formats would also
> have the advantage of not breaking compatibility with tools that
> process those entries.  I personally prefer the 2nd option with the
> extra "Loops:" line.  For non-text formats, should we keep the current
> format?
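
The compatibility point can be illustrated with a small sketch. It assumes a hypothetical third-party tool that extracts the "actual" clause from text-format EXPLAIN output with a fixed regular expression; the pattern and the function name `parse_actual` are made up for illustration and are not taken from any real tool or from the patch.

```python
import re

# Hypothetical parser for the existing "(actual time=a..b rows=n loops=m)" clause.
ACTUAL_RE = re.compile(
    r"\(actual time=(?P<start>\d+\.\d+)\.\.(?P<total>\d+\.\d+)"
    r"\s+rows=(?P<rows>\d+)\s+loops=(?P<loops>\d+)\)"
)

def parse_actual(line):
    """Return the parsed actual-time clause, or None if the line does not match."""
    m = ACTUAL_RE.search(line)
    if m is None:
        return None
    return {
        "startup_ms": float(m["start"]),
        "total_ms": float(m["total"]),
        "rows": int(m["rows"]),
        "loops": int(m["loops"]),
    }

# Inline variant: the new fields sit inside the existing parenthesised clause.
inline = ("->  Seq Scan on public.a  (cost=0.00..15.00 rows=1000 width=4) "
          "(actual time=0.005..0.091 min_time=0.065 max_time=0.163 "
          "min_rows=1000 rows=1000 max_rows=1000 loops=100)")

# Separate-line variant: the existing clause is left untouched.
separate = ("->  Seq Scan on public.a  (cost=0.00..15.00 rows=1000 width=4) "
            "(actual time=0.005..0.091 rows=1000 loops=100)")

print(parse_actual(inline))
print(parse_actual(separate))
```

A parser written this strictly stops matching once fields are inserted into the clause, while it keeps working when the statistics go on their own line. More tolerant parsers may of course cope with either layout.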

For non-text formats, I think "Max/Min Rows" and "Max/Min Times" are a bit
too simple, and their meaning is unclear. Instead, similar to the style of
"Buffers", would it make sense to use "Max/Min Rows in Loops" and
"Max/Min Times in Loops"?
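
To make the naming question concrete, here is a rough sketch in Python (not PostgreSQL code) that computes per-loop minimum and maximum values and renders them both as the proposed text-format "Loops:" line and as structured properties. The exact key spellings such as "Min Time in Loops" are only one possible reading of the suggestion above, the function names are hypothetical, and the per-loop samples are invented so that they agree with the min/max values shown in the quoted plan.

```python
def loop_stats(samples):
    """samples: list of (elapsed_ms, rows) pairs, one per loop of a plan node."""
    times = [t for t, _ in samples]
    rows = [r for _, r in samples]
    return {
        "min_time": min(times),
        "max_time": max(times),
        "min_rows": min(rows),
        "max_rows": max(rows),
    }

def text_loops_line(stats):
    """Text format: one extra row, in the style of the Buffers line."""
    return ("Loops: min_time=%.3f max_time=%.3f min_rows=%d max_rows=%d"
            % (stats["min_time"], stats["max_time"],
               stats["min_rows"], stats["max_rows"]))

def structured_props(stats):
    """Non-text formats: one property per value (key names are an assumption)."""
    return {
        "Min Time in Loops": stats["min_time"],
        "Max Time in Loops": stats["max_time"],
        "Min Rows in Loops": stats["min_rows"],
        "Max Rows in Loops": stats["max_rows"],
    }

# Invented samples for three of the loops; real values would come from the executor.
samples = [(0.065, 1000), (0.091, 1000), (0.163, 1000)]
stats = loop_stats(samples)
print(text_loops_line(stats))
print(structured_props(stats))
```

The "in Loops" suffix only matters for the structured output, where each key has to stand on its own without the "Loops:" prefix that gives context in the text format.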

Regards,
Yugo Nagata

-- 
Yugo NAGATA <nagata@sraoss.co.jp>


