Discussion: Circular-freelist bug is still there
I just saw the parallel regression tests hang up again. Inspection
revealed that StrategyInvalidateBuffer() was stuck in an infinite loop
because the freelist was circular.
(gdb) p StrategyControl->listFreeBuffers
$5 = 579
(gdb) p BufferDescriptors[579]
$6 = {bufNext = 106, data = 4991904, tag = {rnode = {tblNode = 17142,
relNode = 143947}, blockNum = 0}, buf_id = 579, flags = 14,
refcount = 0, io_in_progress_lock = 1179, cntx_lock = 1180,
cntxDirty = 0 '\000', wait_backend_id = 0}
(gdb) p BufferDescriptors[106]
$7 = {bufNext = 684, data = 1117088, tag = {rnode = {tblNode = 17142,
relNode = 143989}, blockNum = 0}, buf_id = 106, flags = 14,
refcount = 0, io_in_progress_lock = 233, cntx_lock = 234,
cntxDirty = 0 '\000', wait_backend_id = 0}
(gdb) p BufferDescriptors[684]
$8 = {bufNext = 579, data = 5852064, tag = {rnode = {tblNode = 17142,
relNode = 143929}, blockNum = 0}, buf_id = 684, flags = 14,
refcount = 0, io_in_progress_lock = 1389, cntx_lock = 1390,
cntxDirty = 0 '\000', wait_backend_id = 0}
(gdb)
Don't have time to chase it right now, but you should know that there's
still a low-probability bug in there.
regards, tom lane
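The gdb session above shows the bufNext links forming the loop 579 -> 106 -> 684 -> 579, which is why a walk over the freelist never terminates. As an illustration only, here is a minimal sketch of how such a loop could be detected with Floyd's two-pointer technique on an index-linked list. The FreeListEntry type, the LIST_END sentinel, and freelist_is_circular() are hypothetical names invented for this sketch; they are not taken from the PostgreSQL source, and the real end-of-list marker may differ.

```c
#include <stdbool.h>

/* Hypothetical end-of-list marker; the real freelist sentinel may differ. */
#define LIST_END (-1)

/* Hypothetical stand-in for a buffer descriptor: only the link field matters. */
typedef struct
{
    int bufNext;    /* index of the next free buffer, or LIST_END */
} FreeListEntry;

/*
 * Return true if the index-linked list starting at 'head' loops back on
 * itself.  'slow' advances one link per step and 'fast' advances two; the
 * two indexes can only meet again if the list contains a cycle.
 */
static bool
freelist_is_circular(const FreeListEntry *entries, int head)
{
    int slow = head;
    int fast = head;

    while (fast != LIST_END && entries[fast].bufNext != LIST_END)
    {
        slow = entries[slow].bufNext;
        fast = entries[entries[fast].bufNext].bufNext;
        if (slow == fast)
            return true;
    }
    return false;   /* reached the end of the list, so no cycle */
}
```

A check of this kind would only make sense as a debugging assertion; it reports that the list is corrupt but says nothing about which code path linked a buffer into the list twice.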
Tom Lane wrote:
> I just saw the parallel regression tests hang up again. Inspection
> revealed that StrategyInvalidateBuffer() was stuck in an infinite loop
> because the freelist was circular.
>
> (gdb) p StrategyControl->listFreeBuffers
> $5 = 579
> (gdb) p BufferDescriptors[579]
> $6 = {bufNext = 106, data = 4991904, tag = {rnode = {tblNode = 17142,
> relNode = 143947}, blockNum = 0}, buf_id = 579, flags = 14,
> refcount = 0, io_in_progress_lock = 1179, cntx_lock = 1180,
> cntxDirty = 0 '\000', wait_backend_id = 0}
> (gdb) p BufferDescriptors[106]
> $7 = {bufNext = 684, data = 1117088, tag = {rnode = {tblNode = 17142,
> relNode = 143989}, blockNum = 0}, buf_id = 106, flags = 14,
> refcount = 0, io_in_progress_lock = 233, cntx_lock = 234,
> cntxDirty = 0 '\000', wait_backend_id = 0}
> (gdb) p BufferDescriptors[684]
> $8 = {bufNext = 579, data = 5852064, tag = {rnode = {tblNode = 17142,
> relNode = 143929}, blockNum = 0}, buf_id = 684, flags = 14,
> refcount = 0, io_in_progress_lock = 1389, cntx_lock = 1390,
> cntxDirty = 0 '\000', wait_backend_id = 0}
> (gdb)
>
> Don't have time to chase it right now, but you should know that there's
> still a low-probability bug in there.
I was under the assumption Neil was still working on this. Don't recall
exactly why.
Anyhow, according to our discussion in early January I have changed the
code in StrategyInvalidateBuffer() so that it clears out the buffer tag
and the CDB's buffer tag. Also it will error out if the CDB is not found
at all.
The BM_FREE flag (meaning BM_UNPINNED effectively) is gone and replaced
with direct checks against the refcount.
Thanks for the reminder,
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
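Jan's message mentions replacing the BM_FREE flag with direct checks against the refcount. The general idea can be sketched as follows: when "unpinned" is derived from the pin count rather than stored in a separate flag bit, the two can never disagree. This is a simplified illustration, not the committed change; SketchBuffer, sketch_pin(), sketch_unpin(), and sketch_is_unpinned() are hypothetical names, and the real buffer manager also has to take locks and maintain per-backend pin counts.

```c
#include <stdbool.h>

/* Hypothetical, simplified descriptor; not the real PostgreSQL BufferDesc. */
typedef struct
{
    unsigned int refcount;  /* number of pins currently held on the buffer */
} SketchBuffer;

/* Take one pin on the buffer. */
static void
sketch_pin(SketchBuffer *buf)
{
    buf->refcount++;
}

/* Release one pin; a buffer with no pins is left unchanged. */
static void
sketch_unpin(SketchBuffer *buf)
{
    if (buf->refcount > 0)
        buf->refcount--;
}

/*
 * The buffer is unpinned exactly when refcount is zero.  There is no
 * separate flag that could be left stale by a missed update.
 */
static bool
sketch_is_unpinned(const SketchBuffer *buf)
{
    return buf->refcount == 0;
}
```

The design point is that a single source of truth removes one class of inconsistency: code that decides whether a buffer may go onto the freelist consults the same counter that pinning and unpinning modify.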
> Oh, okay. So when's that fix going to be committed?
Never mind, I see you just did ...
regards, tom lane
Jan Wieck <JanWieck@Yahoo.com> writes:
> Tom Lane wrote:
>> I just saw the parallel regression tests hang up again.
> Anyhow, according to our discussion in early January I have changed the
> code in StrategyInvalidateBuffer() so that it clears out the buffer tag
> and the CDB's buffer tag. Also it will error out if the CDB is not found
> at all.
Oh, okay. So when's that fix going to be committed?
regards, tom lane