Fwd: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition
От | tender wang |
---|---|
Тема | Fwd: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition |
Дата | |
Msg-id | CAHewXNnUZHVZUggkO_0yp95GQaOUr57WCf_zar6-PxK5+6tghA@mail.gmail.com обсуждение исходный текст |
Ответ на | BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition (PG Bug reporting form <noreply@postgresql.org>) |
Список | pgsql-bugs |
Sorry, I forgot to cc other peoples emails. I only sent to Andres.
---------- Forwarded message ---------
发件人: tender wang <tndrwang@gmail.com>
Date: 2024年1月5日周五 11:49
Subject: Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition
To: Andres Freund <andres@anarazel.de>
发件人: tender wang <tndrwang@gmail.com>
Date: 2024年1月5日周五 11:49
Subject: Re: BUG #18259: Assertion in ExtendBufferedRelLocal() fails after no-space-left condition
To: Andres Freund <andres@anarazel.de>
Andres Freund <andres@anarazel.de> 于2024年1月5日周五 02:57写道:
Hi,
On 2023-12-28 14:36:46 +0800, Richard Guo wrote:
> On Wed, Dec 27, 2023 at 5:08 PM tender wang <tndrwang@gmail.com> wrote:
>
> > Alexander Lakhin <exclusion@gmail.com> 于2023年12月27日周三 15:00写道:
> >
> >> I wonder, if "buf_state &= BM_VALID" is a typo here, maybe it supposed to
> >> be
> >> "buf_state &= ~BM_VALID" as in ExtendBufferedRelShared()...
> >>
> >
> > Yeah, that's true. I analyze this issue again, and I think the root cause
> > is the " buf_state &= BM_VALID" .
> >
>
> Nice catch.
Indeed.
I wonder how this survived - I remember having written a script to locally
test this scenario...
Hm, I dimly remember having a hard time making my approach for this work for
temp tables - IIRC used chattr +i to makr the file immutable, which only works
for file descriptors opened after.
> I believe the intention is to clear the BM_VALID bit.
Indeed.
These paths had many bugs over the years, they are hard to hit in testing but
regularly hit in production. In a more modern world we'd have a unit test for
these scenarios, in isolation. But that's hard at the moment.
I have been wondering if it's worth and possible to write a C level test for
them, perhaps in regress.c. But right now that seems hard because of the
DropRelationBuffers() in smgrtruncate() (seems like a layering violation to
have it there, but it's probably too risky to move at this point).
But perhaps this could actually serve as a good first case for Michael's
failure injection patch? A failure injection to make
FileFallocate()/FileZero() fail should just be a few lines.
Yeah, agree! This is a great opportunity to introduce advanced testing techniques such as Michael's fault injection.
We generally have pretty much no coverage for out-of-space, partial
read/write, EINTR of file operations, because that's hard to do with the
current test infrastructure. So this might be a nice first case.
AFAIK for out-of-space cases:
1、Checkpointer process and WalWriter process would exit, and the user's session will be disconnected
2、restart instance will fail if it has redo logs to replay. This will influence availability.
1 and 2 are not user-friendly.
Greetings,
Andres Freund
В списке pgsql-bugs по дате отправления: