16: Collation versioning and dependency helpers
От | Jeff Davis |
---|---|
Тема | 16: Collation versioning and dependency helpers |
Дата | |
Msg-id | 51fb77507cafd43fc1a2e733c23045873d93ae60.camel@j-davis.com обсуждение исходный текст |
Ответы |
Re: 16: Collation versioning and dependency helpers
|
Список | pgsql-hackers |
Motivation: We haven't fully solved the changing collation-provider problem. An upgrade of the OS may change the version of libc or icu, and that might affect the collation, which could leave you with various corrupt database objects including: * indexes * constraints * range types or multiranges (or other types dependent on collation for internal consistency) * materialized views * partitioned tables (range or hash) There's discussion about trying to reliably detect these changes and remedy them. But there are major challenges; for instance, glibc doesn't give a reliable signal that a collation may have changed, which would leave us with a lot of false positives and create a new set of problems (e.g. reindexing when it's unnecessary). And even with ICU, we don't have a way to support multiple versions of a provider or of a single collation, so trying to upgrade would still be a hassle. Proposal: Add in some tools to make it easier for administrators to find out if they are at risk and solve the problem for themselves in a systematic way. Patches: 0001: Treat "default" collation as unpinned, so that entries in pg_depend are created. The rationale is that, since the "default" collation can change, it's not really an immutable system object, and it's worth tracking which objects are affected by it. It seems to bloat pg_depend by about 5-10% though -- that doesn't seem great, but I'm not sure if it's a real problem or not. 0002: Enable pg_collation_actual_version() to work on the default collation (oid=100) so that it doesn't need to be treated as a special case. 0003: Fix ALTER COLLATION "default" REFRESH VERSION, which currently throws an unhelpful internal error. Instead, issue a more helpful error that suggests "ALTER DATABASE ... REFRESH COLLATION VERSION" instead. 0004: Add system views: pg_collation_versions: quickly see the current (from the catalog) and actual (from the provider) versions of each collation pg_collation_dependencies: map of objects to the collations they depend on Along with these patches, you can use some tricks to verify data, such as /contrib/amcheck; or fix the data with things like: * REINDEX * VACUUM FULL/TRUNCATE/CLUSTER * REFRESH MATERIALIZED VIEW And then refresh the collation version when you're confident that your data is valid. TODO: * The dependencies view is not rigorously complete, because the directed dependency graph doesn't quite establish an "affected by" relationship. One exception is that a composite type doesn't depend on its associated relation, so a composite type over a range type doesn't depend on the range type. * Consider adding in some verification helpers that can verify that a value is still valid (e.g. a range type that depends on a collation might have corrupt values). We could have a collation verifier for types that are collation-dependenent, or perhaps just go through the input and output functions and catch any errors. * Consider better tracking of which collation versions were active on a particular object since the last REINDEX (or REFRESH MATERIALIZED VIEW, TRUNCATE, or other command that would remove any trace of data affected by the previous collation version). Regards, Jeff Davis
Вложения
В списке pgsql-hackers по дате отправления: