Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used
From | Tom Lane |
---|---|
Subject | Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used |
Date | |
Msg-id | 969017.1689960055@sss.pgh.pa.us |
In response to | Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used (Alexander Lakhin <exclusion@gmail.com>) |
Responses |
Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used
|
List | pgsql-bugs |
Alexander Lakhin <exclusion@gmail.com> writes:
> I think that we need to determine the level where the problem that should
> be fixed is:
> 1) test xmlmap fails sporadically due to the catalog changes caused by
> parallel tests activity
> 2) schema_to_xmlschemaX() can fail when parallel workers are used
> 3) has_table_privilegeX() can fail sporadically when executed within a
> parallel worker
> 4) SearchSysCacheX(RELOID, ...) can switch to a newer catalog snapshot,
> when repeated in a parallel worker

Yeah, that's not immediately obvious.

IIUC, the situation we are looking at is that SearchSysCacheExists can succeed even though the tuple we found is already dead at the instant that the function exits (thanks to absorption of inval messages during relation_close). The fact that that only happens in parallel workers is pure chance really. It is not okay for has_table_privilegeX to depend on the fact that the surrounding query already has some lock on pg_class.

So this means that the approach has_table_privilegeX uses of assuming that successful SearchSysCacheExists means it can call pg_class_aclcheck without fear is just broken.

If we suppose that that assumption is only being made in the has_foo_privilege functions, then one way we could fix it is to extend the API of pg_class_aclcheck etc to add a no-error-on-not-found flag, and get rid of the separate SearchSysCacheExists check.

However, I can't avoid the suspicion that we have other places assuming the same thing. So I think what we really ought to be doing is one of two things:

1. Hack SearchSysCacheExists to account for this issue, by making it loop if it finds a syscache entry but sees that the entry is already dead. (We have to loop, not just return false, in case the row was updated rather than deleted.) Maybe all the syscache lookup functions need to do likewise; it's certainly not intuitively reasonable for them to return already-known-stale entries.

2. Figure out how come we are executing a cache inval on the way out of syscache entry creation, and stop that from happening.

I like #2 better if it's not hard to do cleanly. However, I'm not quite sure how we are getting to an inval during relation close; maybe that's not something we want to prevent.

regards, tom lane
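
Editorial illustration of option 1 in the message: the idea of retrying when a lookup finds an already-dead entry can be shown with a small standalone toy model. This is only a sketch and not PostgreSQL code. `ToyEntry`, `toy_catalog_row`, `toy_cache`, `toy_invalidate`, and `toy_exists` are hypothetical names, the `dead` flag is an assumed stand-in for however a cache entry gets marked as invalidated, and the model delivers invalidations between lookups rather than during one.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in PostgreSQL. */
#define TOY_NKEYS 4

typedef struct ToyEntry
{
    bool valid;   /* slot currently holds a cached entry */
    bool dead;    /* entry was invalidated after it was cached */
} ToyEntry;

static bool     toy_catalog_row[TOY_NKEYS];  /* stand-in for the catalog */
static ToyEntry toy_cache[TOY_NKEYS];        /* stand-in for the cache */

/* Simulate an invalidation: the row was either updated or deleted. */
static void
toy_invalidate(int key, bool row_still_exists)
{
    assert(key >= 0 && key < TOY_NKEYS);
    toy_catalog_row[key] = row_still_exists;
    if (toy_cache[key].valid)
        toy_cache[key].dead = true;
}

/* Existence check that never answers from an entry known to be dead. */
static bool
toy_exists(int key)
{
    assert(key >= 0 && key < TOY_NKEYS);
    for (;;)
    {
        ToyEntry *e = &toy_cache[key];

        if (!e->valid)
        {
            /* Cache miss: consult the "catalog" and cache a positive hit. */
            if (!toy_catalog_row[key])
                return false;
            e->valid = true;
            e->dead = false;
        }
        if (!e->dead)
            return true;

        /*
         * Stale entry: discard it and retry instead of returning false,
         * because the row may have been updated rather than deleted.
         */
        e->valid = false;
    }
}
```

In this model, calling `toy_invalidate(key, true)` on a cached key and then `toy_exists(key)` reloads the entry and reports that the row exists, while `toy_invalidate(key, false)` makes the next `toy_exists(key)` report that it does not.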