Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer
От | Thomas Munro |
---|---|
Тема | Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer |
Дата | |
Msg-id | CA+hUKGLRfM423eRyCdf+SKaxEmbbeX-h+tQoc4UY4-UbBZT8dA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer (Xuneng Zhou <xunengzhou@gmail.com>) |
Ответы |
Re: BUG #19006: Assert(BufferIsPinned) in BufferGetBlockNumber() is triggered for forwarded buffer
|
Список | pgsql-bugs |
On Wed, Aug 13, 2025 at 5:29 PM Xuneng Zhou <xunengzhou@gmail.com> wrote: > > 1. The leading block is BM_VALID, so we choose to give it to you > > immediately and not look further (we could look further and return > > more than one consecutive BM_VALID block at a time, but this isn't > > implemented) > > I am curious about why this isn't implemented. It looks helpful. > Is there any blocking issue or trade-off for not doing so? stream->distance tends to be 1 for well cached data (it grows when IO is needed and it is worth looking far ahead, and shrinks when data is found in cache). So we won't currently have a distance > 1 for well cached data, so we wouldn't benefit much. That may not always be true: if the buffer mapping table is changed to a tree data structure (like the way kernels manage virtual memory pages associated with a file or other VM object), then we might have an incentive to look up lots of cached blocks at the same time, and then we might want to change the distance tuning algorithm, and then we might want StartReadBuffers() to be able to return more than one cached block at a time. We've focused mainly on I/O so far, and just tried not to make the well cached cases worse. > The format of this part is not aligned well in gmail, so I copy it into vs code. > Is this layout right? Yes. > I found second illustration somewhat hard to follow, especially > the "do nothing" trick and the movement of next_buffef_index in the second queue. > Maybe I need to read the corresponding code. Suppose you have pending_read_blocknum = 100, pending_read_nblock = 5, next_buffer_index = 200. Now you decide to start that read, because the next block the caller wants can't be combined, so you call StartReadBuffers(blocknum = 100, *nblocks = 4, buffers = &buffers[200]). StartReadBuffers() pins 5 buffers and starts an IO, but finds a reason to split it at size 2. It sets *nblocks to 2, and returns. Now read_stream.c adds 2 to pending_read_blocknum, subtracts 2 from pending_read_nblocks, so that it represents the continuation of the previous read. It also adds 2 to next_buffer_index with modulo arithmetic (so it wraps around the queue), because that is the location of the buffers for the next read, and sets forwarded_buffers to 3, because there are 3 buffers already pinned (in that patch ^ it is renamed to pending_read_npinned). The three buffers themselves are already in the queue at the right position for StartReadBuffers() to receive them and know that it doesn't have to pin them again. What I was referring to with "doing nothing" is the way that if you keep sliding StartReadBuffers() call along the queue, the extra buffers it spits out become its input next time, without having to be copied anywhere. In that picture there are two forwarded buffers, labeled 9 and A.
В списке pgsql-bugs по дате отправления: