Thread: information_schema query not working in PostgreSQL version 17
Dear Sirs,
You will find attached two SQL files:
- select-for-version-16-or-lower.sql
- select-for-version-17
Both queries return the same result; I've been using the select-for-version-16-or-lower query for several major versions now. Since upgrading to version 17, it no longer works on a database with many defined constraints. By many, I mean:
- 9104 rows in table_constraints
- 2592 rows in referential_constraints
- 4476 rows in key_column_usage
The original query never completes in version 17; I tried several times and always canceled it after 30 minutes. To make it work in version 17, I had to add redundant search criteria; you can see what I mean by comparing the attached files. The original query runs in approximately 45 seconds on version 16; the modified version 17 query runs in approximately 68 seconds on the same data.
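The attached files are not reproduced here, so the sketch below uses a hypothetical three-way information_schema join of the same general shape. Note that plain EXPLAIN (without ANALYZE) prints the plan version 17 chooses without actually running the query, so it completes even when the query itself would not.

    -- Hypothetical stand-in for the attached query: list foreign keys by
    -- joining three information_schema views, the pattern that regressed.
    EXPLAIN
    SELECT tc.table_schema, tc.table_name, tc.constraint_name,
           kcu.column_name
    FROM information_schema.table_constraints AS tc
    JOIN information_schema.referential_constraints AS rc
      ON rc.constraint_schema = tc.constraint_schema
     AND rc.constraint_name = tc.constraint_name
    JOIN information_schema.key_column_usage AS kcu
      ON kcu.constraint_schema = tc.constraint_schema
     AND kcu.constraint_name = tc.constraint_name
    WHERE tc.constraint_type = 'FOREIGN KEY';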
I have never reported an issue for PostgreSQL before; I hope this is the proper way to do it.
Regards,
Jorge Campins
Jorge Campins <jrcampins@gmail.com> writes:
> You will find attached two SQL files:
> - select-for-version-16-or-lower.sql
> - select-for-version-17
> Both queries return the same result; I've been using the
> select-for-version-16-or-lower query for several major versions now.
> Since upgrading to version 17, it no longer works on a database with
> many defined constraints. By many, I mean:

I spent some time poking at this. I didn't try to make a database with as many foreign keys as you have, but there's already a pretty noticeable runtime difference for this query in the regression database, with sixty-some foreign keys.

The plan that's being chosen is different, and a lot worse, in 17, as a result of underestimates of the number of rows returned by the joins. It looks like the proximate cause of that is that v16 always estimates the number of rows produced by _pg_expandarray() as 1000, while v17 produces a substantially more accurate estimate (about 1, in my test case) thanks to commits 58054de2d and 9391f7152. In v16, that enormous overestimate accidentally compensates for underestimates elsewhere and keeps it from choosing bad join methods. In v17, not so much.

I don't think there are any near-term fixes available on our side. The information_schema views are so messy that they are just hard to plan accurately: there is a substantial impedance mismatch between our catalogs and the SQL model, and that's tough to deal with.

One thing that's worth calling out here is that the SQL standard is of the opinion that constraint_schema plus constraint_name is a unique identifier. But in Postgres it is not: in our implementation, constraint names are only unique within a table, so you need schema name plus table name plus constraint name to be sure you are identifying a single constraint. This makes use of information_schema.referential_constraints really quite hazardous, as you can't join it to other views reliably. If you have faithfully adhered to the standard's model and not used conflicting constraint names in different tables, it's fine ... but with no enforcement of that at the system level, do you want to rely on it?

Keeping that in mind, I experimented with

         JOIN information_schema.key_column_usage AS kcu
           ON kcu.constraint_name = tc.constraint_name
    +      AND kcu.table_name = tc.table_name
           AND kcu.table_schema = tc.table_schema

Of course that only fixes one of the three joins, but interestingly it did improve the plan for me -- I wonder if it does anything for you?

Anyway, between the performance issues and this fundamental correctness issue, I think you should forget about using information_schema for this query and instead write it to look directly at the Postgres catalogs.

			regards, tom lane
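A quick way to see the row-estimate difference described above is to EXPLAIN a bare call to the helper function. It is an internal, undocumented function, so treat this purely as a diagnostic sketch; the exact numbers shown depend on the server version.

    -- Per the analysis above, version 16 plans calls to this function with
    -- the generic set-returning-function default of 1000 rows, while version
    -- 17's reimplementation (commits 58054de2d and 9391f7152) lets the
    -- planner derive an estimate close to the real array length.
    EXPLAIN
    SELECT * FROM information_schema._pg_expandarray(ARRAY[10, 20, 30]);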
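The uniqueness hazard is easy to reproduce; a minimal sketch with hypothetical table names:

    -- Two tables in the same schema can each carry a constraint named
    -- fk_parent, so constraint_schema plus constraint_name matches two rows.
    CREATE TABLE parent  (id int PRIMARY KEY);
    CREATE TABLE child_a (parent_id int,
                          CONSTRAINT fk_parent FOREIGN KEY (parent_id)
                                     REFERENCES parent (id));
    CREATE TABLE child_b (parent_id int,
                          CONSTRAINT fk_parent FOREIGN KEY (parent_id)
                                     REFERENCES parent (id));

    -- Returns two rows; a join on schema + constraint name alone would
    -- therefore pair each referential_constraints row with both tables.
    SELECT constraint_schema, constraint_name, table_name
    FROM information_schema.table_constraints
    WHERE constraint_name = 'fk_parent';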
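As for querying the catalogs directly, here is a minimal sketch (not a drop-in replacement for the attached queries, whose exact output is unknown) that lists foreign keys from pg_constraint:

    -- pg_constraint identifies each constraint unambiguously by OID, so the
    -- name-collision problem above cannot arise here.
    SELECT ns.nspname                    AS table_schema,
           rel.relname                   AS table_name,
           con.conname                   AS constraint_name,
           fns.nspname                   AS referenced_schema,
           frel.relname                  AS referenced_table,
           pg_get_constraintdef(con.oid) AS definition
    FROM pg_catalog.pg_constraint con
    JOIN pg_catalog.pg_class     rel  ON rel.oid  = con.conrelid
    JOIN pg_catalog.pg_namespace ns   ON ns.oid   = rel.relnamespace
    JOIN pg_catalog.pg_class     frel ON frel.oid = con.confrelid
    JOIN pg_catalog.pg_namespace fns  ON fns.oid  = frel.relnamespace
    WHERE con.contype = 'f'            -- 'f' = foreign-key constraints
    ORDER BY 1, 2, 3;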