Re: [PoC] Improve dead tuple storage for lazy vacuum

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: [PoC] Improve dead tuple storage for lazy vacuum
Дата
Msg-id CAD21AoCaDT+-ZaVjbtvumms0tyyHPNLELK2UX-MLG9XCgioaNw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PoC] Improve dead tuple storage for lazy vacuum  (John Naylor <johncnaylorls@gmail.com>)
Ответы Re: [PoC] Improve dead tuple storage for lazy vacuum  (John Naylor <johncnaylorls@gmail.com>)
Список pgsql-hackers
On Thu, Mar 21, 2024 at 7:48 PM John Naylor <johncnaylorls@gmail.com> wrote:
>
> On Thu, Mar 21, 2024 at 4:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've looked into this idea further. Overall, it looks clean and I
> > don't see any problem so far in terms of integration with lazy vacuum.
> > I've attached three patches for discussion and tests.
>
> Seems okay in the big picture, it's the details we need to be careful of.
>
> v77-0001
>
> - dead_items = (VacDeadItems *) palloc(vac_max_items_to_alloc_size(max_items));
> - dead_items->max_items = max_items;
> - dead_items->num_items = 0;
> + vacrel->dead_items = TidStoreCreate(vac_work_mem, NULL, 0);
> +
> + dead_items_info = (VacDeadItemsInfo *) palloc(sizeof(VacDeadItemsInfo));
> + dead_items_info->max_bytes = vac_work_mem * 1024L;
>
> This is confusing enough that it looks like a bug:
>
> [inside TidStoreCreate()]
> /* choose the maxBlockSize to be no larger than 1/16 of max_bytes */
> while (16 * maxBlockSize > max_bytes * 1024L)
> maxBlockSize >>= 1;
>
> This was copied from CreateWorkExprContext, which operates directly on
> work_mem -- if the parameter is actually bytes, we can't "* 1024"
> here. If we're passing something measured in kilobytes, the parameter
> is badly named. Let's use convert once and use bytes everywhere.

True. The attached 0001 patch fixes it.

>
> v77-0002:
>
> +#define dsa_create(tranch_id) \
> + dsa_create_ext(tranch_id, DSA_INITIAL_SEGMENT_SIZE, DSA_MAX_SEGMENT_SIZE)
>
> Since these macros are now referring to defaults, maybe their name
> should reflect that. Something like DSA_DEFAULT_INIT_SEGMENT_SIZE
> (*_MAX_*)

It makes sense to rename DSA_INITIAL_SEGMENT_SIZE , but I think that
the DSA_MAX_SEGMENT_SIZE is the theoretical maximum size, the current
name also makes sense to me.

>
> +/* The minimum size of a DSM segment. */
> +#define DSA_MIN_SEGMENT_SIZE ((size_t) 1024)
>
> That's a *lot* smaller than it is now. Maybe 256kB? We just want 1MB
> m_w_m to work correctly.

Fixed.

>
> v77-0003:
>
> +/* Public APIs to create local or shared TidStore */
> +
> +TidStore *
> +TidStoreCreateLocal(size_t max_bytes)
> +{
> + return tidstore_create_internal(max_bytes, false, 0);
> +}
> +
> +TidStore *
> +TidStoreCreateShared(size_t max_bytes, int tranche_id)
> +{
> + return tidstore_create_internal(max_bytes, true, tranche_id);
> +}
>
> I don't think these operations have enough in common to justify
> sharing even an internal implementation. Choosing aset block size is
> done for both memory types, but it's pointless to do it for shared
> memory, because the local context is then only used for small
> metadata.
>
> + /*
> + * Choose the DSA initial and max segment sizes to be no longer than
> + * 1/16 and 1/8 of max_bytes, respectively.
> + */
>
> I'm guessing the 1/8 here because the number of segments is limited? I
> know these numbers are somewhat arbitrary, but readers will wonder why
> one has 1/8 and the other has 1/16.
>
> + if (dsa_init_size < DSA_MIN_SEGMENT_SIZE)
> +     dsa_init_size = DSA_MIN_SEGMENT_SIZE;
> + if (dsa_max_size < DSA_MAX_SEGMENT_SIZE)
> +     dsa_max_size = DSA_MAX_SEGMENT_SIZE;
>
> The second clamp seems against the whole point of this patch -- it
> seems they should all be clamped bigger than the DSA_MIN_SEGMENT_SIZE?
> Did you try it with 1MB m_w_m?

I've incorporated the above comments and test results look good to me.

I've attached the several patches:

- 0002 is a minor fix for tidstore I found.
- 0005 changes the create APIs of tidstore.
- 0006 update the vacuum improvement patch to use the new
TidStoreCreateLocal/Shared() APIs.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Introduce XID age and inactive timeout based replication slot invalidation
Следующее
От: jian he
Дата:
Сообщение: Re: SQL:2011 application time