Re: [PoC] Improve dead tuple storage for lazy vacuum
From: Masahiko Sawada
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Date:
Msg-id: CAD21AoB8oE4WtgXhs+YFwOH3ODAOvEN48afFLYL=vqMk0d25TA@mail.gmail.com
In reply to: Re: [PoC] Improve dead tuple storage for lazy vacuum (John Naylor <john.naylor@enterprisedb.com>)
Responses: Re: [PoC] Improve dead tuple storage for lazy vacuum
List: pgsql-hackers
On Sat, Nov 5, 2022 at 6:23 PM John Naylor <john.naylor@enterprisedb.com> wrote:
>
> On Fri, Nov 4, 2022 at 10:25 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > For parallel heap pruning, multiple workers will insert key-value
> > pairs into the radix tree concurrently. The simplest solution would
> > be a single lock to protect writes, but the performance would not be
> > good. Another solution would be to divide the table into multiple
> > ranges so that keys derived from TIDs do not conflict with each
> > other, and to have parallel workers process one or more ranges. That
> > way, parallel vacuum workers can build *sub-trees* and the leader
> > process can merge them. In the lazy vacuum use case, since the write
> > phase and the read phase are separated, readers don't need to worry
> > about concurrent updates.
>
> It's a good idea to use ranges for a different reason -- readahead. See
> commit 56788d2156fc3, which aimed to improve readahead for sequential
> scans. It might work to use that as a model: each worker prunes a range
> of 64 pages, keeping the dead tids in a local array. At the end of the
> range: lock the tid store, enter the tids into the store, unlock, free
> the local array, and get the next range from the leader. It's possible
> contention won't be too bad, and I suspect using small local arrays
> as-we-go would be faster and use less memory than merging multiple
> sub-trees at the end.

That seems like a promising idea. I think it might work well even in
the current parallel vacuum (i.e., a single writer). In other words, we
can have a single lwlock for the shared case in the first version, and
if the overhead of acquiring the lwlock per key-value insertion turns
out not to be negligible, we can try this idea.

Apart from that, I'm going to incorporate the comments on the 0004
patch and try pointer tagging.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
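
P.S. To make the locking pattern concrete, below is a minimal C sketch
of the per-range batching described above. It is only an illustration
under assumed names: TidStore, TidStoreInsert, ParallelGetNextRange,
and prune_page_collect_dead are hypothetical placeholders, not existing
APIs; only the lwlock calls and the basic block/TID types are real. The
point is that each worker takes the shared lock once per 64-page range
rather than once per inserted key-value pair.

#include "postgres.h"

#include "access/htup_details.h"    /* MaxHeapTuplesPerPage */
#include "storage/block.h"
#include "storage/itemptr.h"
#include "storage/lwlock.h"

#define PRUNE_RANGE_SIZE 64

/* Hypothetical: shared dead-TID store built on the radix tree. */
typedef struct TidStore TidStore;
extern void TidStoreInsert(TidStore *store, ItemPointer tid);
/* Hypothetical: leader hands out ranges; InvalidBlockNumber means done. */
extern BlockNumber ParallelGetNextRange(int npages);
/* Hypothetical: prunes one page, appends its dead TIDs, returns count. */
extern int prune_page_collect_dead(BlockNumber blkno, ItemPointer dead_tids);

static void
parallel_prune_worker(TidStore *shared_store, LWLock *store_lock)
{
    BlockNumber start;

    while ((start = ParallelGetNextRange(PRUNE_RANGE_SIZE)) != InvalidBlockNumber)
    {
        /* Local array sized for the worst case over one 64-page range. */
        ItemPointer local_tids = palloc(sizeof(ItemPointerData) *
                                        MaxHeapTuplesPerPage * PRUNE_RANGE_SIZE);
        int         ntids = 0;

        /* Prune each page in the range, collecting dead TIDs locally. */
        for (BlockNumber blkno = start; blkno < start + PRUNE_RANGE_SIZE; blkno++)
            ntids += prune_page_collect_dead(blkno, &local_tids[ntids]);

        /* Lock the shared store once per range, not once per TID. */
        LWLockAcquire(store_lock, LW_EXCLUSIVE);
        for (int i = 0; i < ntids; i++)
            TidStoreInsert(shared_store, &local_tids[i]);
        LWLockRelease(store_lock);

        pfree(local_tids);
    }
}

Compared with building per-worker sub-trees and merging them at the
end, this keeps peak memory bounded by one range's worth of TIDs per
worker, at the cost of one lock round-trip per range.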