regproc's lack of certainty is dangerous
От | Tom Lane |
---|---|
Тема | regproc's lack of certainty is dangerous |
Дата | |
Msg-id | 28808.1047500189@sss.pgh.pa.us обсуждение исходный текст |
Ответы |
Re: regproc's lack of certainty is dangerous
|
Список | pgsql-hackers |
Deepak Bhole of Red Hat just pointed out to me a failure that he got after some 7.3 stress testing: > [ "`~!@#$%^&*()''| \final_db\n,.;:'" ]=# SELECT n.nspname, p.proname, > o.oprcode::oid FROM pg_operator o, pg_proc p, pg_namespace n WHERE > o.oid=270447::oid AND p.oid=o.oprcode::oid AND p.pronamespace=n.oid; > ERROR: There is more than one procedure named "[ ""`~!@#$%^&*()''| > \final_schema\n,.;:'"" ]"."[ ""`~!@#$%^&*''| \ {func_for_op}\n,.;:'"" ]" This error comes out of regprocin() when it finds multiple candidate functions with the same name (and, presumably, different argument lists or different schemas). No big surprise, since that's what he had. But it's a bit odd that regprocin() is being invoked, when there's no regproc literal in the given query. After some digging, it turns out that the error is appearing because that function name is present in the pg_statistic entry for pg_operator.oprcode. *Any* query that causes the optimizer to become interested in pg_operator.oprcode will fail under these circumstances :-(. And the user can't readily avoid this, since there's no way to be sure which function names will happen to end up in the histogram. "Never ANALYZE the table" isn't going to fly as a workaround. I am not real sure what we should do about it. Clearly there is more risk than I'd realized in datatypes whose input routines may reject strings that their output routines had produced in good faith. One possible route is to try to eliminate the ambiguity, but I doubt that that will work very effectively for regproc and friends --- the whole point of those types is to resolve ambiguous input, and so the possibility of failures in the input routine can't easily be removed. regproc in particular needs its special behavior to be useful for bootstrapping. Another approach is to try to fix pg_statistic to avoid the problem by not doing I/O conversions. For scalar datatypes (those that have associated array types) it'd be probably be feasible to store the histogram and most-common-value arrays as arrays of the datatype itself, not arrays of text; that should be a win for performance as well as avoiding risky conversions. I am not sure what to do about columns that have datatypes without matching array types, though (that would include array columns and domains, IIRC). Maybe use array of bytea to hold the internal representation of the type? Any comments or better ideas out there? regards, tom lane
В списке pgsql-hackers по дате отправления: