Re: BUG #17949: Adding an index introduces serialisation anomalies.

Поиск
Список
Период
Сортировка
От Thomas Munro
Тема Re: BUG #17949: Adding an index introduces serialisation anomalies.
Дата
Msg-id CA+hUKGKmR57CgZAPRjeSQfXb_oxjSUx7673SqDRw9RF2FfCw=A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #17949: Adding an index introduces serialisation anomalies.  (Dmitry Dolgov <9erthalion6@gmail.com>)
Ответы Re: BUG #17949: Adding an index introduces serialisation anomalies.  (Peter Geoghegan <pg@bowt.ie>)
Re: BUG #17949: Adding an index introduces serialisation anomalies.  (Heikki Linnakangas <hlinnaka@iki.fi>)
Список pgsql-bugs
On Sun, Jun 25, 2023 at 1:59 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote:
> I've managed to resolve it, or at least reduce the chances for the issue to
> appear, via semi-randomly adding more CheckForSerializableConflictIn /
> PredicateLock around the new sublist that has to be created in
> ginHeapTupleFastInsert. I haven't seen the reproducer failing with this
> changeset after running it multiple times for a couple of minutes, where on the
> main branch, with the two fixes from Thomas included, it was failing within a
> couple of seconds.

Ahh, right, thanks.  I don't think we need to lock all those pages as
you showed, as this whole "fast" path is supposed to be covered by the
meta page (in other words, GIN is expected to have the highest
possible serialisation failure rate under SSI unless you turn fast
updates off).  But there is an ordering bug with the existing
predicate lock in that code, which allows this to happen:

  S1: CheckForSerializableConflictIn(meta)

  S2: PredicateLockPage(meta)
  S2: scan, find no tuples

  S1: BufferLock(EXCLUSIVE
  S1: modify stuff...

CheckForSerializableConflictIn() was written with the assumption that
you're inserting a tuple (ie you have the page containing the tuple
locked), so you'll either conflict with a reader who already has a
predicate lock at that point OR you'll insert first and then the
reader will see your (invisible-to-snapshot) tuples, but here we're
doing some fancy footwork with a meta page, and we screwed up the
ordering and left a window where neither of those things happens.
Perhaps it was coded that way because there is drop-then-reacquire
dance, but it's easy enough to move the check in both branches.  Does
that make sense?

Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Thomas Munro
Дата:
Сообщение: Re: BUG #17990: PSQL Process hangs in parallel mode
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: BUG #17949: Adding an index introduces serialisation anomalies.