Re: Fix parallel vacuum buffer usage reporting

Поиск
Список
Период
Сортировка
От Alena Rybakina
Тема Re: Fix parallel vacuum buffer usage reporting
Дата
Msg-id a28c7777-a6e9-4c8c-bc34-44437d7e6f6c@yandex.ru
обсуждение исходный текст
Ответ на Re: Fix parallel vacuum buffer usage reporting  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: Fix parallel vacuum buffer usage reporting  (Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>)
Список pgsql-hackers

Hi!

On 30.04.2024 05:18, Masahiko Sawada wrote:
On Fri, Apr 26, 2024 at 9:12 PM Alena Rybakina <lena.ribackina@yandex.ru> wrote:
Hi!
The same script was run, but using vacuum verbose analyze, and I saw the difference again in the fifth step:
with your patch: buffer usage: 32312 hits, 607 misses, 1566 dirtied
master: buffer usage: 32346 hits, 573 misses, 1360 dirtied
Isn't there a chance for the checkpointer to run during this time? That could make the conditions between the two runs slightly different and explain the change in buffer report.

[0] https://github.com/postgres/postgres/blob/8a1b31e6e59631807a08a4e9465134c343bbdf5e/src/backend/access/heap/vacuumlazy.c#L2826-L2831

Looking at the script, you won't trigger the problem.

Thank you for the link I accounted it in my next experiments.

I repeated the test without processing checkpoints with a single index, and the number of pages in the buffer used almost matched:

master branch: buffer usage: 32315 hits, 606 misses, 4486 dirtied

with applied patch v4 version: buffer usage: 32315 hits, 606 misses, 4489 dirtied

I think you are right - the problem was interfering with the checkpoint process, by the way I checked the first version patch. To cut a long story short, everything is fine now with one index.

Just in case, I'll explain: I considered this case because your patch could have impact influenced it too.

On 25.04.2024 10:17, Anthonin Bonnefoy wrote:


On Wed, Apr 24, 2024 at 4:01 PM Alena Rybakina <lena.ribackina@yandex.ru> wrote:
I tested the main postgres branch with and without your fix using a script that was written by me. It consists of five scenarios and I made a comparison in the logs between the original version of the master branch and the master branch with your patch:
 Hi! Thanks for the tests.

I have attached a test file (vacuum_check_logs.sql)
The reporting issue will only happen if there's a parallel index vacuum and it will only happen if there's at least 2 indexes [0]. You will need to create an additional index.

Speaking of the problem, I added another index and repeated the test and found a significant difference:

I found it when I commited the transaction (3):

master: 2964 hits, 0 misses, 0 dirtied

with applied patch v4 version: buffer usage: 33013 hits, 0 misses, 3 dirtied

When I deleted all the data from the table and later started vacuum verbose again (4):

master: buffer usage: 51486 hits, 0 misses, 0 dirtied

with applied patch v4 version:buffer usage: 77924 hits, 0 misses, 0 dirtied

when I inserted 1 million data into the table and updated it (5):

master:buffer usage: 27904 hits, 5021 misses, 1777 dirtied

with applied patch v4 version:buffer usage: 41051 hits, 9973 misses, 2564 dirtied

As I see, the number of pages is significantly more than it was in the master branch and ,frankly, I couldn't fully figure out if it was a mistake or not.
I think that the patch fixes the problem correctly.

I've run pgindent and updated the commit message. I realized that
parallel vacuum was introduced in pg13 but buffer usage reporting in
VACUUM command was implemented in pg15. Therefore, in pg13 and pg14,
VACUUM (PARALLEL) is available but VACUUM (PARALLEL, VERBOSE) doesn't
show the buffer usage report. Autovacuum does show the buffer usage
report but parallel autovacuum is not supported. Therefore, we should
backpatch it down to 15, not 13.

I'm going to push the patch down to pg15, barring any objections.

Regards,
I agree with you about porting and I saw that the patch is working correctly.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: pg_sequence_last_value() for unlogged sequences on standbys
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: partitioning and identity column