Re: cannot abort transaction 2737414167, it was already committed

From Thomas Munro
Subject Re: cannot abort transaction 2737414167, it was already committed
Date
Msg-id CA+hUKGJ5_ZhVOjL8NoTBLv3+fM8EuwHW1TBJw8rfe08vA=5rnw@mail.gmail.com
In reply to Re: cannot abort transaction 2737414167, it was already committed  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Dec 28, 2023 at 11:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > In CommitTransaction() there is a stretch of code beginning s->state =
> > TRANS_COMMIT and ending s->state = TRANS_DEFAULT, from which we call
> > out to various subsystems' AtEOXact_XXX() functions.  There is no way
> > to roll back in that state, so anything that throws ERROR from those
> > routines is going to get something much like $SUBJECT.  Hmm, we'd know
> > which exact code path got that EIO from your smoldering core if we'd
> > put an explicit critical section there (if we're going to PANIC
> > anyway, it might as well not be from a different stack after
> > longjmp()...).
>
> +1, there's basically no hope of debugging this sort of problem
> as things stand.

I was reminded of this thread by Justin's other file system snafu thread.

Naively defining a critical section to match the extent of the
TRANS_COMMIT state doesn't work, as a bunch of code under there uses
palloc().  That reminds me of the nearby RelationTruncate() thread,
and there is possibly even some overlap, plus more in this case...
ugh.
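
To make the palloc() conflict concrete, here is a toy model in plain C. It is only a sketch, not PostgreSQL code: every toy_* name is hypothetical. It assumes the two behaviours relevant here: an ERROR raised inside a critical section is escalated to PANIC, and ordinary allocation is not allowed there (in PostgreSQL an assert-enabled build trips an assertion; the toy just reports a violation).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins only; none of these are PostgreSQL symbols. */
typedef enum { TOY_ERROR, TOY_PANIC } ToyLevel;

/* Nesting depth of critical sections, like a simple counter. */
static int toy_crit_section_count = 0;

void toy_start_crit_section(void)
{
    toy_crit_section_count++;
}

void toy_end_crit_section(void)
{
    assert(toy_crit_section_count > 0);
    toy_crit_section_count--;
}

/* An ERROR reported inside a critical section is escalated to PANIC. */
ToyLevel toy_effective_level(ToyLevel requested)
{
    if (requested == TOY_ERROR && toy_crit_section_count > 0)
        return TOY_PANIC;
    return requested;
}

/*
 * Allocation is refused inside a critical section: the toy sets
 * *violation and returns NULL instead of allocating.
 */
void *toy_alloc(size_t size, bool *violation)
{
    if (toy_crit_section_count > 0)
    {
        *violation = true;
        return NULL;
    }
    *violation = false;
    return malloc(size);
}

/* Hypothetical end-of-transaction step that needs scratch memory. */
bool toy_at_eoxact_step(void)
{
    bool violation;
    void *scratch = toy_alloc(64, &violation);

    free(scratch);              /* free(NULL) is a no-op */
    return !violation;
}

/* Naive approach: wrap the whole commit-time stretch in one critical section. */
bool toy_commit_with_naive_crit_section(void)
{
    bool ok;

    toy_start_crit_section();
    ok = toy_at_eoxact_step();  /* the allocation inside is now a violation */
    toy_end_crit_section();
    return ok;
}
```

Outside the critical section the step succeeds; wrapped naively, the same step is reported as a violation. That is the reason the extent of TRANS_COMMIT cannot simply be made a critical section without first dealing with every allocating callee.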

Hmm, AtEOXact_RelationMap() is one of those steps, but lives just
outside the crypto-critical-section created by TRANS_COMMIT, though
has its own normal CS for logging.  I wonder, given that "updating the
map file is effectively commit of the relocation", why wouldn't it
have a variant of the problem solved by DELAY_CHKPT_START for normal
commit records, under diabolical scheduling?  It's a stretch, but: You
log XLOG_RELMAP_UPDATE, a concurrent checkpoint runs with REDO after
that record, you crash before/during durable_rename(), and then you
perform crash recovery.  Now your catalog is still using the old
relfilenode on the primary, but any replica following along replays
XLOG_RELMAP_UPDATE and is using the new relfilenode, frozen in time,
for queries, while replaying changes to the old relfilenode.  Right?


