Re: Eliminate redundant tuple visibility check in vacuum

Поиск
Список
Период
Сортировка
От David Geier
Тема Re: Eliminate redundant tuple visibility check in vacuum
Дата
Msg-id c9d30046-97b0-2db2-b45e-2aa82386c316@gmail.com
обсуждение исходный текст
Ответ на Re: Eliminate redundant tuple visibility check in vacuum  (Melanie Plageman <melanieplageman@gmail.com>)
Ответы Re: Eliminate redundant tuple visibility check in vacuum  (Melanie Plageman <melanieplageman@gmail.com>)
Список pgsql-hackers
Hi Melanie,

On 8/31/23 02:59, Melanie Plageman wrote:
>> I created a large table and then updated a tuple on every page in the
>> relation and vacuumed it. I saw a consistent slight improvement in
>> vacuum execution time. I profiled a bit with perf stat as well. The
>> difference is relatively small for this kind of example, but I can
>> work on a more compelling, realistic example. I think eliminating the
>> retry loop is also useful, as I have heard that users have had to
>> cancel vacuums which were in this retry loop for too long.
> Just to provide a specific test case, if you create a small table like this
>
> create table foo (a int, b int, c int) with(autovacuum_enabled=false);
> insert into foo select i, i, i from generate_series(1, 10000000);
>
> And then vacuum it. I find that with my patch applied I see a
> consistent ~9% speedup (averaged across multiple runs).
>
> master: ~533ms
> patch: ~487ms
>
> And in the profile, with my patch applied, you notice less time spent
> in HeapTupleSatisfiesVacuumHorizon()
>
> master:
>      11.83%  postgres  postgres           [.] heap_page_prune
>      11.59%  postgres  postgres           [.] heap_prepare_freeze_tuple
>       8.77%  postgres  postgres           [.] lazy_scan_prune
>       8.01%  postgres  postgres           [.] HeapTupleSatisfiesVacuumHorizon
>       7.79%  postgres  postgres           [.] heap_tuple_should_freeze
>
> patch:
>      13.41%  postgres  postgres           [.] heap_prepare_freeze_tuple
>       9.88%  postgres  postgres           [.] heap_page_prune
>       8.61%  postgres  postgres           [.] lazy_scan_prune
>       7.00%  postgres  postgres           [.] heap_tuple_should_freeze
>       6.43%  postgres  postgres           [.] HeapTupleSatisfiesVacuumHorizon

Thanks a lot for providing additional information and the test case.

I tried it on a release build and I also see a 10% speed-up. I reset the 
visibility map between VACUUM runs, see:

CREATE EXTENSION pg_visibility; CREATE TABLE foo (a INT, b INT, c INT) 
WITH(autovacuum_enabled=FALSE); INSERT INTO foo SELECT i, i, i from 
generate_series(1, 10000000) i; VACUUM foo; SELECT 
pg_truncate_visibility_map('foo'); VACUUM foo; SELECT 
pg_truncate_visibility_map('foo'); VACUUM foo; ...

The first patch, which refactors the code so we can pass the result of 
the visibility checks to the caller, looks good to me.

Regarding the 2nd patch (disclaimer: I'm not too familiar with that area 
of the code): I don't completely understand why the retry loop is not 
needed anymore and how you now detect/handle the possible race 
condition? It can still happen that an aborting transaction changes the 
state of a row after heap_page_prune() looked at that row. Would that 
case now not be ignored?

-- 
David Geier
(ServiceNow)




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: persist logical slots to disk during shutdown checkpoint
Следующее
От: Heikki Linnakangas
Дата:
Сообщение: Re: Sync scan & regression tests