Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used
От | Alexander Lakhin |
---|---|
Тема | Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used |
Дата | |
Msg-id | defada9d-8bb1-860f-2682-eee03fdc0ab4@gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used (Alexander Lakhin <exclusion@gmail.com>) |
Ответы |
Re: BUG #18014: Releasing catcache entries makes schema_to_xmlschema() fail when parallel workers are used
|
Список | pgsql-bugs |
13.10.2023 18:00, Alexander Lakhin wrote: > >> I spent some time looking through existing SearchSysCacheExists calls, >> and I could only find two sets of routines where we seem to be >> depending on SearchSysCacheExists to protect a subsequent lookup >> somewhere else, and there isn't any lock on the object in question. >> Those are the has_foo_privilege functions discussed here, and the >> foo_is_visible functions near the bottom of namespace.c. I'm not >> sure why we've not heard complaints traceable to the foo_is_visible >> family. Maybe nobody has tried hard to break them, or maybe they >> are just less likely to be used in ways that are at risk. > > I'll try to research/break xxx_is_visible and share my findings tomorrow. > I tried the script based on the initial reproducer [1]: for ((n=1;n<=30;n++)); do echo "ITERATION $n" numclients=30 for ((c=1;c<=$numclients;c++)); do cat << EOF | psql >psql_$c.log & CREATE SCHEMA testxmlschema_$c; SELECT format('CREATE TABLE testxmlschema_$c.test_%s (a int);', g) FROM generate_series(1, 30) g \\gexec SET parallel_setup_cost = 1; SET min_parallel_table_scan_size = '1kB'; SELECT oid FROM pg_catalog.pg_class WHERE relnamespace = 1 AND relkind IN ('r', 'm', 'v') AND pg_catalog.pg_table_is_visible(oid); SELECT format('DROP TABLE testxmlschema_$c.test_%s', g) FROM generate_series(1, 30) g \\gexec DROP SCHEMA testxmlschema_$c; EOF done wait grep 'ERROR:' server.log && break; done And couldn't get the error, for multiple runs. (Here SELECT oid ... is based on the query executed by schema_to_xmlschema().) But I could reliably get the error with s/pg_table_is_visible(oid)/has_table_privilege (oid, 'SELECT')/. So there is a difference between these two functions. And the difference is in their costs. If I do "ALTER FUNCTION pg_table_is_visible COST 1" before the script, I get the error as expected. With cost 10 I see the following plan: Index Scan using pg_class_relname_nsp_index on pg_class (cost=0.42..2922.38 rows=1 width=4) Index Cond: (relnamespace = '1'::oid) Filter: ((relkind = ANY ('{r,m,v}'::"char"[])) AND pg_table_is_visible(oid)) But with cost 1: Gather (cost=1.00..257.10 rows=1 width=4) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on pg_class (cost=0.00..256.00 rows=1 width=4) Filter: (pg_table_is_visible(oid) AND (relnamespace = '1'::oid) AND (relkind = ANY ('{r,m,v}'::"char"[]))) Rows Removed by Filter: 405 The cost of the pg_foo_is_visible functions was increased in a80889a73. But all the has_xxx_privilige functions have cost 1, except for has_any_column_privilege, which cost was also increased in 7449427a1. So to see the issue we need several ingredients: 1) The mode CATCACHE_FORCE_RELEASE enabled (may be some other way is possible, I don't know of); - Thanks to prion for that. 2) A function with the coding pattern "SearchSysCacheExistsX(); SearchSysCacheX();" called in a parallel worker; - Thanks to "debug_parallel_query = regress" and low cost of has_table_privilege() called by schema_to_xmlschema(). 3) The catalog cache invalidated by some concurrent activity. - Thanks to running the test xmlmap in parallel with 16 other tests. [1] https://www.postgresql.org/message-id/18014-28c81cb79d44295d%40postgresql.org Best regards, Alexander
В списке pgsql-bugs по дате отправления: