Hi,
On 2025-09-15 19:05:37 -0400, Andres Freund wrote:
> On 2025-08-22 15:44:48 -0400, Andres Freund wrote:
> > The hardest part about this change is that everything kind of depends on each
> > other. The changes are large enough that they clearly can't just be committed
> > at once, but doing them over time risks [temporary] performance regressions.
> >
> >
> >
> >
> > The order of changes I think makes the most sense is the following:
> >
> > 1) Allow some modifications while holding the buffer header spinlock
> >
> > 2) Reduce buffer pin with just an atomic-sub
> >
> > This needs to happen first, otherwise there are performance regressions
> > during the later steps.
>
> Here are the first few cleaned up patches implementing the above steps, as
> well as some cleanups. I included a commit from another thread, as it
> conflicts with these changes, and we really should apply it - and it's
> arguably required to make the changes viable, as it removes one more use of
> PinBuffer_Locked().
>
> Another change included is to not return the buffer with the spinlock held
> from StrategyGetBuffer(), and instead pin the buffer in freelist.c. The reason
> for that is to reduce the most common PinBuffer_locked() call. By definition
> PinBuffer_locked() will become a bit slower due to 0003. But even without 0003
> it 0002 is faster than master. And the previous approach also just seems
> pretty unclean. I don't love that it requires the new TrackNewBufferPin(),
> but I don't really have a better idea.
>
> I invite particular attention to the commit message for 0003 as well as the
> comment changes in buf_internals.h within.
Robert looked at the patches while we were chatting, and I addressed his
feedback in this new version.
Changes:
- Updated patch description for 0002, giving a lot more background
- Improved BufferDesc comments a fair bit more in 0003
- Reduced the size of 0003 a bit, by using UnlockBufHdrExt() instead of a CAS
loop in buffer_stage_common() and reordering some things in
TerminateBufferIO()
- Made 0004 only use the non-looping atomic op in UnpinBufferNoOwner(), not
MarkBufferDirty(). I realized the latter would take additional complexity to
make safe (a CAS loop in TerminateBufferIO()). I am somewhat doubtful that
there are workloads where it matters...
Greetings,
Andres Freund