Re: Use streaming read API in ANALYZE

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Use streaming read API in ANALYZE
Дата
Msg-id 20240407220000.7dt4ynyyfi6fuzqy@awork3.anarazel.de
обсуждение исходный текст
Ответ на Re: Use streaming read API in ANALYZE  (Melanie Plageman <melanieplageman@gmail.com>)
Ответы Re: Use streaming read API in ANALYZE  (Melanie Plageman <melanieplageman@gmail.com>)
Список pgsql-hackers
Hi,

On 2024-04-07 16:59:26 -0400, Melanie Plageman wrote:
> From 1dc2343661f3edb3b1bc4307afb0e956397eb76c Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplageman@gmail.com>
> Date: Sun, 7 Apr 2024 14:55:22 -0400
> Subject: [PATCH v10 1/3] Make heapam_scan_analyze_next_[tuple|block] static.
> 
> 27bc1772fc81 removed the table AM callbacks scan_analyze_next_block and
> scan_analzye_next_tuple -- leaving their heap AM implementations only
> called by acquire_sample_rows().

Ugh, I don't think 27bc1772fc81 makes much sense. But that's unrelated to this
thread.  I did raise that separately
https://www.postgresql.org/message-id/20240407214001.jgpg5q3yv33ve6y3%40awork3.anarazel.de

Unless I seriously missed something, I see no alternative to reverting that
commit.


> @@ -1206,11 +1357,13 @@ acquire_sample_rows(Relation onerel, int elevel,
>                  break;
>  
>              prefetch_block = BlockSampler_Next(&prefetch_bs);
> -            PrefetchBuffer(scan->rs_rd, MAIN_FORKNUM, prefetch_block);
> +            PrefetchBuffer(scan->rs_base.rs_rd, MAIN_FORKNUM, prefetch_block);
>          }
>      }
>  #endif
>  
> +    scan->rs_cbuf = InvalidBuffer;
> +
>      /* Outer loop over blocks to sample */
>      while (BlockSampler_HasMore(&bs))
>      {

I don't think it's good to move a lot of code *and* change how it is
structured in the same commit. Makes it much harder to actually see changes /
makes git blame harder to use / etc.



> From 90d115c2401567be65bcf64393a6d3b39286779e Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplageman@gmail.com>
> Date: Sun, 7 Apr 2024 15:28:32 -0400
> Subject: [PATCH v10 2/3] Use streaming read API in ANALYZE
>
> The ANALYZE command prefetches and reads sample blocks chosen by a
> BlockSampler algorithm. Instead of calling Prefetch|ReadBuffer() for
> each block, ANALYZE now uses the streaming API introduced in b5a9b18cd0.
>
> Author: Nazir Bilal Yavuz
> Reviewed-by: Melanie Plageman
> Discussion: https://postgr.es/m/flat/CAN55FZ0UhXqk9v3y-zW_fp4-WCp43V8y0A72xPmLkOM%2B6M%2BmJg%40mail.gmail.com
> ---
>  src/backend/commands/analyze.c | 89 ++++++++++------------------------
>  1 file changed, 26 insertions(+), 63 deletions(-)

That's a very nice demonstration of how this makes good prefetching easier...




> From 862b7ac81cdafcda7b525e02721da14e46265509 Mon Sep 17 00:00:00 2001
> From: Melanie Plageman <melanieplageman@gmail.com>
> Date: Sun, 7 Apr 2024 15:38:41 -0400
> Subject: [PATCH v10 3/3] Obsolete BlockSampler_HasMore()
> 
> A previous commit stopped using BlockSampler_HasMore() for flow control
> in acquire_sample_rows(). There seems little use now for
> BlockSampler_HasMore(). It should be sufficient to return
> InvalidBlockNumber from BlockSampler_Next() when BlockSample_HasMore()
> would have returned false. Remove BlockSampler_HasMore().
> 
> Author: Melanie Plageman, Nazir Bilal Yavuz
> Discussion: https://postgr.es/m/flat/CAN55FZ0UhXqk9v3y-zW_fp4-WCp43V8y0A72xPmLkOM%2B6M%2BmJg%40mail.gmail.com

The justification here seems somewhat odd. Sure, the previous commit stopped
using BlockSampler_HasMore in acquire_sample_rows - but only because it was
moved to block_sampling_streaming_read_next()?

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Melih Mutlu
Дата:
Сообщение: Re: Flushing large data immediately in pqcomm
Следующее
От: "David E. Wheeler"
Дата:
Сообщение: Re: ❓ JSON Path Dot Precedence