Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition
From | tender wang |
---|---|
Subject | Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition |
Date | |
Msg-id | CAHewXN=chu4kBxj=vtCOJJoOCAvipfJzKRuH26BMiyHSDhBk7g@mail.gmail.com |
In reply to | Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition (tender wang <tndrwang@gmail.com>) |
List | pgsql-bugs |
I have always been curious why the error is reported only when there is not enough space.
I ran some tests and may have found some answers. My tests are below:
----------------------------
postgres=# CREATE UNLOGGED TABLE filler(a int, b text STORAGE plain);
CREATE TABLE
postgres=# INSERT INTO filler SELECT g, repeat('x', 1000) FROM generate_series(1,50000) g;
INSERT 0 50000
postgres=# CREATE TEMP TABLE tbl(a int);
CREATE TABLE
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
ERROR: could not extend file "base/5/t3_16389": No space left on device
HINT: Check free disk space.
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
ERROR: could not extend file "base/5/t3_16389": No space left on device
HINT: Check free disk space.
postgres=# truncate tbl ;
TRUNCATE TABLE
postgres=# drop table filler ;
DROP TABLE
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
INSERT 0 200000
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
INSERT 0 200000
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
INSERT 0 200000
postgres=# INSERT INTO tbl SELECT g FROM generate_series(1, 200000) g;
INSERT 0 200000
------------------------
It didn't report an error after I truncated the temp table.
I found that the buf_state of the buffer in the local hash table is not cleaned up when there is no space left on the device.
If I truncate the temp table, DropRelationLocalBuffers() is called and the buf_state is cleared, so the assertion failure is no longer reported.
tender wang <tndrwang@gmail.com> wrote on Wed, Dec 27, 2023 at 17:22:
When I debugged ExtendBufferedRelLocal(), I found a repeated assignment to existing_hdr. I fixed this small issue, together with the changes from the previous v2 patch, in the attached v3 patch.

tender wang <tndrwang@gmail.com> wrote on Wed, Dec 27, 2023 at 17:08:

Alexander Lakhin <exclusion@gmail.com> wrote on Wed, Dec 27, 2023 at 15:00:

Hello tender wang,

26.12.2023 19:55, tender wang wrote:
I tried to analyze the issue, and I found that it might be caused by this commit:
commit dad50f677c42de207168a3f08982ba23c9fc6720
bufmgr: Acquire and clean victim buffer separately

Thanks for looking into it!
...
With debug logging added in this code within ExtendBufferedRelLocal():
    if (found)
    {
        BufferDesc *existing_hdr =
            GetLocalBufferDescriptor(hresult->id);
        uint32 buf_state;

        UnpinLocalBuffer(BufferDescriptorGetBuffer(victim_buf_hdr));

        existing_hdr = GetLocalBufferDescriptor(hresult->id);
        PinLocalBuffer(existing_hdr, false);
        buffers[i] = BufferDescriptorGetBuffer(existing_hdr);

        buf_state = pg_atomic_read_u32(&existing_hdr->state);
        Assert(buf_state & BM_TAG_VALID);
        Assert(!(buf_state & BM_DIRTY));
        buf_state &= BM_VALID;
        pg_atomic_unlocked_write_u32(&existing_hdr->state, buf_state);
    ...
I see that it reached for the second INSERT (and NOSPC error) with
existing_hdr->state == 0x2040000, but for the third INSERT I observe
state == 0x0.
I wonder if "buf_state &= BM_VALID" is a typo here; maybe it was supposed to be
"buf_state &= ~BM_VALID" as in ExtendBufferedRelShared()...

Yeah, that's true. I analyzed this issue again, and I think the root cause is the "buf_state &= BM_VALID". In my reported issue, buf_state & BM_VALID is true, but buf_state & BM_TAG_VALID is false. This situation should be impossible: it can't happen that the data in the local buffer pool is valid but LocalBufHash has no entry. I modified the v1 patch, and the attached v2 patch should fix the above issues.

Best regards,
Alexander
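
For readers following the bit arithmetic, here is a minimal standalone sketch of the two masks discussed above. It is not PostgreSQL code: the flag values are assumptions, copied from what buf_internals.h is believed to define in PostgreSQL 16, and the helper names mask_as_written() and mask_as_suggested() are hypothetical. It only illustrates why a state of 0x2040000 would turn into 0x0 under "&= BM_VALID" but would keep its tag bit under "&= ~BM_VALID".

```c
#include <stdint.h>

/*
 * ASSUMED flag values, believed to match PostgreSQL 16's buf_internals.h.
 * Verify them against the header of the version being debugged.
 */
#define BM_DIRTY      (1U << 23)   /* data needs writing */
#define BM_VALID      (1U << 24)   /* data is valid */
#define BM_TAG_VALID  (1U << 25)   /* tag is assigned */

/*
 * Hypothetical helper: the mask as written in the quoted code.
 * It keeps ONLY the BM_VALID bit and drops everything else,
 * including BM_TAG_VALID and the usage count.
 */
uint32_t mask_as_written(uint32_t buf_state)
{
    return buf_state & BM_VALID;
}

/*
 * Hypothetical helper: the mask suggested in the thread.
 * It clears ONLY the BM_VALID bit and leaves the rest untouched.
 */
uint32_t mask_as_suggested(uint32_t buf_state)
{
    return buf_state & ~BM_VALID;
}
```

Under these assumed values, 0x2040000 has BM_TAG_VALID set and BM_VALID clear, so mask_as_written(0x2040000) gives 0x0 and a later Assert(buf_state & BM_TAG_VALID) on that buffer would fail, while mask_as_suggested(0x2040000) leaves the state at 0x2040000.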