Обсуждение: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache

Поиск
Список
Период
Сортировка

Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache

От
Nazir Bilal Yavuz
Дата:
Hi,

There is another thread [1] to add both pg_buffercache_evict_[relation
| all] and pg_buffercache_mark_dirty[_all] functions to the
pg_buffercache. I decided to create another thread as
pg_buffercache_evict_[relation | all] functions are committed but
pg_buffercache_mark_dirty[_all] functions still need review.

pg_buffercache_mark_dirty(): This function takes a buffer id as an
argument and tries to mark this buffer as dirty. Returns true on
success.
pg_buffercache_mark_dirty_all(): This is very similar to the
pg_buffercache_mark_dirty() function. The difference is
pg_buffercache_mark_dirty_all() does not take an argument. Instead it
just loops over the shared buffers and tries to mark all of them as
dirty. It returns the number of buffers marked as dirty.

Since that patch is targeted for the PG 19, pg_buffercache is bumped to v1.7.

Latest version is attached and people who already reviewed the patches are CCed.

[1] postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw%40mail.gmail.com

--
Regards,
Nazir Bilal Yavuz
Microsoft

Вложения

Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache

От
Xuneng Zhou
Дата:
Hi,
I’ve been trying to review this patch, and it struck me that we’re currently grabbing the content lock exclusively just to flip a header bit:

if (!(buf_state & BM_DIRTY))
{
    LWLockAcquire(BufferDescriptorGetContentLock(desc), LW_EXCLUSIVE);
    MarkBufferDirty(buf);
    LWLockRelease(BufferDescriptorGetContentLock(desc));
}

Since our sole goal here is to simulate a buffer’s dirty state for testing/debugging, I wonder—could we instead:

1. Acquire the shared content lock (which already blocks eviction),  
2. Call MarkBufferDirtyHint() to flip the BM_DIRTY bit under the header spinlock, and  
3. Release the shared lock?

This seems to satisfy Assert(LWLockHeldByMe(...)) inside the hint routine and would:

- Prevent exclusive‐lock contention when sweeping many buffers,  
- Avoid full‐page WAL writes (we only need a hint record, if any)


Would love to hear if this makes sense or or am I overlooking something here. Thanks for any feedback!

Cheers,  
Xuneng

Nazir Bilal Yavuz <byavuz81@gmail.com> 于2025年4月11日周五 17:14写道:
Hi,

There is another thread [1] to add both pg_buffercache_evict_[relation
| all] and pg_buffercache_mark_dirty[_all] functions to the
pg_buffercache. I decided to create another thread as
pg_buffercache_evict_[relation | all] functions are committed but
pg_buffercache_mark_dirty[_all] functions still need review.

pg_buffercache_mark_dirty(): This function takes a buffer id as an
argument and tries to mark this buffer as dirty. Returns true on
success.
pg_buffercache_mark_dirty_all(): This is very similar to the
pg_buffercache_mark_dirty() function. The difference is
pg_buffercache_mark_dirty_all() does not take an argument. Instead it
just loops over the shared buffers and tries to mark all of them as
dirty. It returns the number of buffers marked as dirty.

Since that patch is targeted for the PG 19, pg_buffercache is bumped to v1.7.

Latest version is attached and people who already reviewed the patches are CCed.

[1] postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw%40mail.gmail.com

--
Regards,
Nazir Bilal Yavuz
Microsoft

Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache

От
Nazir Bilal Yavuz
Дата:
Hi,

On Fri, 25 Apr 2025 at 19:17, Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
> I’ve been trying to review this patch, and it struck me that we’re currently grabbing the content lock exclusively
justto flip a header bit: 

Thank you for looking into this!

> if (!(buf_state & BM_DIRTY))
> {
>     LWLockAcquire(BufferDescriptorGetContentLock(desc), LW_EXCLUSIVE);
>     MarkBufferDirty(buf);
>     LWLockRelease(BufferDescriptorGetContentLock(desc));
> }
>
> Since our sole goal here is to simulate a buffer’s dirty state for testing/debugging, I wonder—could we instead:

I believe our goal is not only to simulate a buffer’s dirty state but
also replicating steps to mark the buffers as dirty.

> 1. Acquire the shared content lock (which already blocks eviction),
> 2. Call MarkBufferDirtyHint() to flip the BM_DIRTY bit under the header spinlock, and
> 3. Release the shared lock?
>
> This seems to satisfy Assert(LWLockHeldByMe(...)) inside the hint routine and would:
>
> - Prevent exclusive‐lock contention when sweeping many buffers,
> - Avoid full‐page WAL writes (we only need a hint record, if any)
>
>
> Would love to hear if this makes sense or or am I overlooking something here. Thanks for any feedback!

I think what you said makes sense and is correct if we only want to
simulate a buffer’s dirty state for testing/debugging, but if we want
to replicate usual steps to marking buffers as dirty, then I think we
need to have full-page WAL writes.

--
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache

От
Amit Kapila
Дата:
On Mon, Apr 28, 2025 at 2:43 PM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>
> Hi,
>
> On Fri, 25 Apr 2025 at 19:17, Xuneng Zhou <xunengzhou@gmail.com> wrote:
> >
> >
> > Would love to hear if this makes sense or or am I overlooking something here. Thanks for any feedback!
>
> I think what you said makes sense and is correct if we only want to
> simulate a buffer’s dirty state for testing/debugging, but if we want
> to replicate usual steps to marking buffers as dirty, then I think we
> need to have full-page WAL writes.
>

Fair enough. But you haven't mentioned how exactly you want to use
these functions for testing? That will help us to understand whether
we need to replicate all the steps to mark the buffer dirty.

Also, I feel it will be easier for one to test the functionality by
marking buffers dirty for a particular relation rather than by using
buffer_id but maybe I am missing the testing scenarios you have in
mind for the proposed APIs.

The other point to consider was whether we need to lock the relation
for the proposed functions. If we already mark buffers dirty by
scanning the buffer pool in bgwriter/checkpointer without acquiring a
lock on the relation, then why do we need it here?

--
With Regards,
Amit Kapila.



Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache

От
Xuneng Zhou
Дата:
Hey,

I noticed a couple of small clarity issues in the current version of patch for potential clean up:

1. Commit message wording

Right now it says:

“The pg_buffercache_mark_dirty_all() function provides an efficient way to dirty the entire buffer pool (e.g., ~550 ms vs. ~70 ms for 16 GB of shared buffers), complementing pg_buffercache_mark_dirty() for more granular control.”


That makes it sound like the _all function is the granular one, when really:

• pg_buffercache_mark_dirty(buffernumber) is the fine-grained, per-buffer call.

• pg_buffercache_mark_dirty_all() is the bulk, coarse-grained operation.

How about rephrasing to:

“The pg_buffercache_mark_dirty_all() function provides an efficient, bulk way to mark every buffer dirty (e.g., ~70 ms vs. ~550 ms for 16 GB of shared buffers), while pg_buffercache_mark_dirty() still allows per-buffer, granular control.”

2. Inline comment in MarkUnpinnedBufferDirty

We currently have:

PinBuffer_Locked(desc);  /* releases spinlock */

Folks who’re unfamiliar with this function might get confused. Maybe we could use the one in GetVictimBuffer:


/* Pin the buffer and then release its spinlock */

PinBuffer_Locked(buf_hdr);


That spelling-out makes it obvious what’s happening.


Since that patch is targeted for the PG 19, pg_buffercache is bumped to v1.7.

Latest version is attached and people who already reviewed the patches are CCed.

Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache

От
Nazir Bilal Yavuz
Дата:
Hi,

Thank you for looking into this! And sorry for the late reply.

On Mon, 12 May 2025 at 13:24, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 28, 2025 at 2:43 PM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> >
> > Hi,
> >
> > On Fri, 25 Apr 2025 at 19:17, Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > >
> > >
> > > Would love to hear if this makes sense or or am I overlooking something here. Thanks for any feedback!
> >
> > I think what you said makes sense and is correct if we only want to
> > simulate a buffer’s dirty state for testing/debugging, but if we want
> > to replicate usual steps to marking buffers as dirty, then I think we
> > need to have full-page WAL writes.
> >
>
> Fair enough. But you haven't mentioned how exactly you want to use
> these functions for testing? That will help us to understand whether
> we need to replicate all the steps to mark the buffer dirty.

Sorry, you are right. Main idea of this change is to test WAL writes
on CHECKPOINT, I think MarkBufferDirtyHint() is enough for this
purpose but I was thinking about someone else might want to use this
function and I think they expect to replicate all steps to mark the
buffer dirty.

> Also, I feel it will be easier for one to test the functionality by
> marking buffers dirty for a particular relation rather than by using
> buffer_id but maybe I am missing the testing scenarios you have in
> mind for the proposed APIs.

This is done in the v8.

> The other point to consider was whether we need to lock the relation
> for the proposed functions. If we already mark buffers dirty by
> scanning the buffer pool in bgwriter/checkpointer without acquiring a
> lock on the relation, then why do we need it here?

Sorry, I couldn't find this code part. Could you please elaborate more?

I updated patch a bit more, summary is:

- pg_buffercache_mark_dirty_relation() is added in v8.

- Function call is very similar to pg_buffercache_evict*(), we have
MarkDirtyUnpinnedBufferInternal() function inside bufmgr.c file and
this function is called from other functions.

-
pg_buffercache_mark_dirty() returns -> buffer_dirtied and
buffer_already_dirty as a boolean
pg_buffercache_mark_dirty_relation() returns -> buffers_dirtied,
buffers_already_dirty and buffers_skipped
pg_buffercache_mark_dirty_all() returns -> buffers_dirtied,
buffers_already_dirty and buffers_skipped

- Commit message and docs are updated regarding the changes above.

v8 is attached.

--
Regards,
Nazir Bilal Yavuz
Microsoft

Вложения

Re: Add pg_buffercache_mark_dirty[_all] functions to the pg_buffercache

От
Nazir Bilal Yavuz
Дата:
Hi,

Thank you for looking into this! And sorry for the late reply.

On Fri, 16 May 2025 at 10:58, Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hey,
>
> I noticed a couple of small clarity issues in the current version of patch for potential clean up:
>
> 1. Commit message wording

I changed the commit message. I made it very similar to the commit
message in dcf7e1697b.

> We currently have:
>
> PinBuffer_Locked(desc);  /* releases spinlock */
>
> Folks who’re unfamiliar with this function might get confused. Maybe we could use the one in GetVictimBuffer:
>
>
> /* Pin the buffer and then release its spinlock */
>
> PinBuffer_Locked(buf_hdr);
>
>
> That spelling-out makes it obvious what’s happening.

I think this makes sense, this is done in v8 which is attached to the
email above.

--
Regards,
Nazir Bilal Yavuz
Microsoft