Re: [PoC] Improve dead tuple storage for lazy vacuum

Поиск
Список
Период
Сортировка
От John Naylor
Тема Re: [PoC] Improve dead tuple storage for lazy vacuum
Дата
Msg-id CANWCAZYc0Ty0k+sP+Rqy1rU+x2kJdaW8ofX7Akz2=tnH7OycfA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PoC] Improve dead tuple storage for lazy vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: [PoC] Improve dead tuple storage for lazy vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On Fri, Dec 15, 2023 at 3:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Dec 15, 2023 at 10:30 AM John Naylor <johncnaylorls@gmail.com> wrote:

> >  parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
> > - int nrequested_workers, int max_items,
> > - int elevel, BufferAccessStrategy bstrategy)
> > + int nrequested_workers, int vac_work_mem,
> > + int max_offset, int elevel,
> > + BufferAccessStrategy bstrategy)
> >
> > It seems very strange to me that this function has to pass the
> > max_offset. In general, it's been simpler to assume we have a constant
> > max_offset, but in this case that fact is not helping. Something to
> > think about for later.
>
> max_offset was previously used in old TID encoding in tidstore. Since
> tidstore has entries for each block, I think we no longer need it.

It's needed now to properly size the allocation of TidStoreIter which
contains...

+/* Result struct for TidStoreIterateNext */
+typedef struct TidStoreIterResult
+{
+ BlockNumber blkno;
+ int num_offsets;
+ OffsetNumber offsets[FLEXIBLE_ARRAY_MEMBER];
+} TidStoreIterResult;

Maybe we can palloc the offset array to "almost always" big enough,
with logic to resize if needed? If not too hard, seems worth it to
avoid churn in the parameter list.

> > v45-0010:
> >
> > Thinking about this some more, I'm not sure we need to do anything
> > different for the *starting* segment size. (Controlling *max* size
> > does seem important, however.) For the corner case of m_w_m = 1MB,
> > it's fine if vacuum quits pruning immediately after (in effect) it
> > finds the DSA has gone to 2MB. It's not worth bothering with, IMO. If
> > the memory accounting starts >1MB because we're adding the trivial
> > size of some struct, let's just stop doing that. The segment
> > allocations are what we care about.
>
> IIUC it's for work_mem, whose the minimum value is 64kB.
>
> >
> > v45-0011:
> >
> > + /*
> > + * max_bytes is forced to be at least 64kB, the current minimum valid
> > + * value for the work_mem GUC.
> > + */
> > + max_bytes = Max(64 * 1024L, max_bytes);
> >
> > Why?
>
> This is to avoid creating a radix tree within very small memory. The
> minimum work_mem value is a reasonable lower bound that PostgreSQL
> uses internally. It's actually copied from tuplesort.c.

There is no explanation for why it should be done like tuplesort.c. Also...

- tree->leaf_ctx = SlabContextCreate(ctx,
-    "radix tree leaves",
-    RT_SLAB_BLOCK_SIZE(sizeof(RT_VALUE_TYPE)),
-    sizeof(RT_VALUE_TYPE));
+ tree->leaf_ctx = SlabContextCreate(ctx,
+    "radix tree leaves",
+    Min(RT_SLAB_BLOCK_SIZE(sizeof(RT_VALUE_TYPE)),
+    work_mem),
+    sizeof(RT_VALUE_TYPE));

At first, my eyes skipped over this apparent re-indent, but hidden
inside here is another (undocumented) attempt to clamp the size of
something. There are too many of these sprinkled in various places,
and they're already a maintenance hazard -- a different one was left
behind in v45-0011:

@@ -201,6 +183,7 @@ TidStoreCreate(size_t max_bytes, int max_off,
dsa_area *area)
    ts->control->max_bytes = max_bytes - (70 * 1024);
  }

Let's do it in just one place. In TidStoreCreate(), do

/* clamp max_bytes to at least the size of the empty tree with
allocated blocks, so it doesn't immediately appear full */
ts->control->max_bytes = Max(max_bytes, {rt, shared_rt}_memory_usage);

Then we can get rid of all the worry about 1MB/2MB, 64kB, 70kB -- all that.

I may not recall everything while writing this, but it seems the only
other thing we should be clamping is the max aset block size (solved)
/ max DSM segment size (in progress).



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrei Lepikhov
Дата:
Сообщение: Re: introduce dynamic shared memory registry
Следующее
От: John Naylor
Дата:
Сообщение: Re: meson: Stop using deprecated way getting path of files