Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)

Поиск
Список
Период
Сортировка
От Melanie Plageman
Тема Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Дата
Msg-id CAAKRu_YS+Ocm=OzMaZnG4egFiE9v4VYfZ25DXd6jbwegqmGYbQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)  (Kirill Reshke <reshkekirill@gmail.com>)
Список pgsql-hackers
Thanks for the review!

On Wed, Oct 29, 2025 at 7:03 AM Kirill Reshke <reshkekirill@gmail.com> wrote:
>
> get_conflict_xid function logic is bit strange for me, it assigns
> conflict_xid to some value,  but in the very end we have
>
> > + /*
> >+ * We can omit the snapshot conflict horizon if we are not pruning or
> >+ * freezing any tuples and are setting an already all-visible page
> >+ * all-frozen in the VM. In this case, all of the tuples on the page must
> >+ * already be visible to all MVCC snapshots on the standby.
> >+ */
> >+ if (!do_prune && !do_freeze &&
> >+ do_set_vm && blk_already_av && set_blk_all_frozen)
> > + conflict_xid = InvalidTransactionId;
>
> I feel like we should move this check to the beginning of the
> function, and just  return InvalidTransactionId in that if cond.

You're right. I've changed it as you suggest in attached v19.

> > + if (old_vmbits == new_vmbits)
> > + {
> > + LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
> > + /* Unset so we don't emit WAL since no change occurred */
> > + do_set_vm = false;
> > + }
>
> and then
>
> >  END_CRIT_SECTION();
> > + if (do_set_vm)
> > + LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
> > +
>
> So, in the heap_page_prune_and_freeze function we release buffer lock
> both inside and outside the crit section. As I understand, this is
> actually safe. I also looked in other xlog coding practices for other
> access methods (GiST, GIN, ....), and I can see that some of them
> release buffers before leaving crit sections and some of them after.
> But I still suggest to be in sync with 'Write-Ahead Log Coding'
> section of
> src/backend/access/transam/README, which says:
>
> 6. END_CRIT_SECTION()
>
> 7. Unlock and unpin the buffer(s).
>
> Let's be consistent in this at least in this single function context.

I see what you are saying. However, I don't see a good way to
determine whether or not we need to unlock the VM without introducing
another local variable in the outermost scope -- like "unlock_vm".
This function already has a lot of local variables, so I'm loath to do
that. And we want do_set_vm to reflect whether or not we actually set
it in case it gets used in the future.

This function doesn't lock or unlock the heap buffer so it doesn't
seem as urgent to me to follow the letter of the law in this case.

Attached patch doesn't have this change, but this is what it would look like:

    /* Lock vmbuffer before entering a critical section */
+   unlock_vm = do_set_vm;
    if (do_set_vm)
        LockBuffer(vmbuffer, BUFFER_LOCK_EXCLUSIVE);

@@ -1112,12 +1114,9 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,
            old_vmbits = visibilitymap_set(blockno,
                                           vmbuffer, new_vmbits,
                                           params->relation->rd_locator);
-           if (old_vmbits == new_vmbits)
-           {
-               LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);
-               /* Unset so we don't emit WAL since no change occurred */
-               do_set_vm = false;
-           }
+
+           /* Unset so we don't emit WAL since no change occurred */
+           do_set_vm = old_vmbits != new_vmbits;
        }

        /*
@@ -1145,7 +1144,7 @@ heap_page_prune_and_freeze(PruneFreezeParams *params,

    END_CRIT_SECTION();

-   if (do_set_vm)
+   if (unlock_vm)
        LockBuffer(vmbuffer, BUFFER_LOCK_UNLOCK);

> In 0010:
>
> I'm not terribly convenient that adding SO_ALLOW_VM_SET to TAM
> ScanOptions is the right thing to do. Looks like VM bits are something
> that make sense for HEAP AM for not for any TAM. So, don't we break
> some layer of abstraction here? Would it be better for HEAP AM to set
> some flags in heap_beginscan?

I don't see another good way of doing it.

The information about whether or not the relation is modified in the
query is gathered during planning and saved in the plan. We need to
get that information to the scan descriptor, which is all we have when
we call heap_page_prune_opt() during the scan. The scan descriptor is
created by the table AM implementations of scan_begin(). The table AM
callbacks don't pass down the plan -- which makes sense; the scan
shouldn't know about the plan. They do pass down flags, so I thought
it made the most sense to add a flag. Note that I was able to avoid
modifying the actual table and index AM callbacks (scan_begin() and
ambeginscan()). I only made new wrappers that took "modifies_rel".

Now, it is true that referring to the VM is somewhat of a layering
violation. Though, other table AMs may use the information about if
the query modifies the relation -- which is really what this flag
represents. The ScanOptions are usually either a type or a call to
action. Which is why I felt a bit uncomfortable calling it something
like SO_MODIFIES_REL -- which is less of an option and more a piece of
information. And it makes it sound like the scan modifies the rel,
which is not the case. I wonder if there is another solution. Or maybe
we call it SO_QUERY_MODIFIES_REL?

- Melanie

Вложения

В списке pgsql-hackers по дате отправления: