Re: Fix race condition in InvalidatePossiblyObsoleteSlot()

Поиск
Список
Период
Сортировка
От Bharath Rupireddy
Тема Re: Fix race condition in InvalidatePossiblyObsoleteSlot()
Дата
Msg-id CALj2ACUMPTvuk0_GuPiFbUb=PMTQq0WMi5Kvj8Mz0JZ_ico1Lw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Fix race condition in InvalidatePossiblyObsoleteSlot()  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: Fix race condition in InvalidatePossiblyObsoleteSlot()  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Список pgsql-hackers
On Mon, Feb 19, 2024 at 11:44 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> > Yeah, comments added in v3.
>
> The contents look rather OK, I may do some word-smithing for both.

Here are some comments on v3:

1.
+    XLogRecPtr    initial_effective_xmin = InvalidXLogRecPtr;
+    XLogRecPtr    initial_catalog_effective_xmin = InvalidXLogRecPtr;
+    XLogRecPtr    initial_restart_lsn = InvalidXLogRecPtr;

Prefix 'initial_' makes the variable names a bit longer, I think we
can just use effective_xmin, catalog_effective_xmin and restart_lsn,
the code updating then only when if (!terminated) tells one that they
aren't updated every time.

2.
+            /*
+             * We'll release the slot's mutex soon, so it's possible that
+             * those values change since the process holding the slot has been
+             * terminated (if any), so record them here to ensure we would
+             * report the slot as obsolete correctly.
+             */

This needs a bit more info as to why and how effective_xmin,
catalog_effective_xmin and restart_lsn can move ahead after signaling
a backend and before the signalled backend reports back.

3.
+        /*
+         * Assert that the conflict cause recorded previously before we
+         * terminate the process did not change now for any reason.
+         */
+        Assert(!(conflict_prev != RS_INVAL_NONE && terminated &&
+                 conflict_prev != conflict));

It took a while for me to understand the above condition, can we
simplify it like below using De Morgan's laws for better readability?

Assert((conflict_prev == RS_INVAL_NONE) || !terminated ||
(conflict_prev == conflict));

> > Are we going to fix this on master and 16 stable first and then later on add a
> > test on master with the injection points?
>
> Still, the other patch is likely going to take a couple of weeks
> before getting into the tree, so I have no objection to fix the bug
> first and backpatch, then introduce a test.  Things have proved that
> failures could show up in the buildfarm in this area, so more time
> running this code across two branches is not a bad concept, either.

While I couldn't agree more on getting this fix in, it's worth pulling
in the required injection points patch here and writing the test to
reproduce this race issue.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Quan Zongliang
Дата:
Сообщение: Re: Improvement discussion of custom and generic plans
Следующее
От: Bertrand Drouvot
Дата:
Сообщение: Re: System username in pg_stat_activity