Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used

From Tom Lane
Subject Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used
Date
Msg-id 969017.1689960055@sss.pgh.pa.us
In response to Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used  (Alexander Lakhin <exclusion@gmail.com>)
Responses Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-bugs
Alexander Lakhin <exclusion@gmail.com> writes:
> I think that we need to determine the level where the problem that should
> be fixed is:
> 1) test xmlmap fails sporadically due to the catalog changes caused by
>   parallel tests activity
> 2) schema_to_xmlschemaX() can fail when parallel workers are used
> 3) has_table_privilegeX() can fail sporadically when executed within a
>   parallel worker
> 4) SearchSysCacheX(RELOID, ...) can switch to a newer catalog snapshot,
>   when repeated in a parallel worker

Yeah, that's not immediately obvious.  IIUC, the situation we are
looking at is that SearchSysCacheExists can succeed even though the
tuple we found is already dead at the instant that the function
exits (thanks to absorption of inval messages during relation_close).
The fact that that only happens in parallel workers is pure chance
really.  It is not okay for has_table_privilegeX to depend on the
fact that the surrounding query already has some lock on pg_class.
So this means that the approach has_table_privilegeX uses of
assuming that successful SearchSysCacheExists means it can call
pg_class_aclcheck without fear is just broken.
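
To make the hazard concrete, here is a minimal standalone sketch of the check-then-use pattern described above. It is not PostgreSQL code: every name (demo_exists, demo_aclcheck, demo_drop_on_exit, and so on) is hypothetical, and the "invalidation absorbed on the way out of the existence check" is simulated with a simple flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy one-row "catalog"; all names here are hypothetical illustrations. */
static bool demo_row_present = true;
static int  demo_row_acl = 1;
/* When set, the row disappears just as the existence check returns. */
static bool demo_drop_on_exit = false;

typedef enum { DEMO_NULL, DEMO_OK, DEMO_ERROR } DemoOutcome;

/* Stand-in for an existence test: reports what it saw, then the row may die. */
bool demo_exists(void)
{
    bool found = demo_row_present;

    if (demo_drop_on_exit)
    {
        demo_row_present = false;   /* simulated inval absorbed on exit */
        demo_drop_on_exit = false;
    }
    return found;
}

/* Stand-in for a checker that treats a missing row as an error (false). */
bool demo_aclcheck(bool *allowed)
{
    if (!demo_row_present)
        return false;               /* would be an error in the real thing */
    *allowed = (demo_row_acl != 0);
    return true;
}

/* The criticized pattern: existence test first, then an unguarded check. */
DemoOutcome demo_check_then_use(bool *allowed)
{
    if (!demo_exists())
        return DEMO_NULL;           /* object gone: caller returns NULL */
    return demo_aclcheck(allowed) ? DEMO_OK : DEMO_ERROR;
}

void demo_reset(bool drop_on_exit)
{
    demo_row_present = true;
    demo_drop_on_exit = drop_on_exit;
}

/* Small usage demonstration. */
void demo_run(void)
{
    bool allowed = false;

    demo_reset(false);
    assert(demo_check_then_use(&allowed) == DEMO_OK && allowed);

    demo_reset(true);               /* row dies between the two steps */
    assert(demo_check_then_use(&allowed) == DEMO_ERROR);
}
```

In the second case the existence test reports success, yet the follow-up check still hits the missing-row path, which is the failure mode under discussion.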

If we suppose that that assumption is only being made in the
has_foo_privilege functions, then one way we could fix it is to extend
the API of pg_class_aclcheck etc to add a no-error-on-not-found flag,
and get rid of the separate SearchSysCacheExists check.  However,
I can't avoid the suspicion that we have other places assuming the
same thing.  So I think what we really ought to be doing is one
of two things:

1. Hack SearchSysCacheExists to account for this issue, by making it
loop if it finds a syscache entry but sees that the entry is already
dead.  (We have to loop, not just return false, in case the row was
updated rather than deleted.)  Maybe all the syscache lookup
functions need to do likewise; it's certainly not intuitively
reasonable for them to return already-known-stale entries.

2. Figure out how come we are executing a cache inval on the way
out of syscache entry creation, and stop that from happening.

I like #2 better if it's not hard to do cleanly.  However, I'm not
quite sure how we are getting to an inval during relation close;
maybe that's not something we want to prevent.
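
For illustration only, option 1 (loop while the entry found is already dead) could look roughly like the standalone sketch below. This is a toy model, not the syscache code: ToyEntry, toy_lookup_once, toy_exists_retry and the scripted ToyEvent are all hypothetical, and the invalidation arriving during lookup is simulated by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Scripted event that fires as a lookup is finishing (hypothetical). */
typedef enum { TOY_NONE, TOY_UPDATE, TOY_DELETE } ToyEvent;

/* Toy cache entry; "dead" marks an entry invalidated after it was built. */
typedef struct { int value; bool dead; } ToyEntry;

static bool     toy_row_present = true;
static int      toy_row_value = 1;
static ToyEvent toy_pending = TOY_NONE;
static int      toy_lookups = 0;
static ToyEntry toy_entry;

/* One lookup attempt; the entry may already be dead when it is returned. */
ToyEntry *toy_lookup_once(void)
{
    toy_lookups++;
    if (!toy_row_present)
        return NULL;

    toy_entry.value = toy_row_value;
    toy_entry.dead = false;

    if (toy_pending == TOY_UPDATE)
    {
        toy_row_value++;            /* a newer row version now exists */
        toy_entry.dead = true;
    }
    else if (toy_pending == TOY_DELETE)
    {
        toy_row_present = false;    /* the row is gone entirely */
        toy_entry.dead = true;
    }
    toy_pending = TOY_NONE;
    return &toy_entry;
}

/*
 * Existence test that retries on a dead entry instead of returning false,
 * so that an updated (not deleted) row is still reported as existing.
 */
bool toy_exists_retry(void)
{
    for (;;)
    {
        ToyEntry *e = toy_lookup_once();

        if (e == NULL)
            return false;           /* really not there */
        if (!e->dead)
            return true;            /* live entry found */
        /* stale entry: look again */
    }
}

void toy_reset(ToyEvent ev)
{
    toy_row_present = true;
    toy_row_value = 1;
    toy_pending = ev;
    toy_lookups = 0;
}

/* Small usage demonstration. */
void toy_run(void)
{
    toy_reset(TOY_UPDATE);
    assert(toy_exists_retry());     /* updated row still exists */
    toy_reset(TOY_DELETE);
    assert(!toy_exists_retry());    /* deleted row does not */
}
```

The update case shows why returning false on the first dead entry would be wrong: the second pass finds the live replacement row.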

            regards, tom lane


