Обсуждение: pg_dump assertion failure with "-n pg_catalog"
On version 15 or later: pg_dump -n pg_catalog postgres > /dev/null pg_dump: pg_dump.c:13291: dumpCollation: Assertion `collcollate != ((void *)0)' failed. Patch attached. I tried adding a test, but dumping pg_catalog seems to only be supported for research purposes, so it's not clear what the test should be testing. If we went almost two release cycles without anyone noticing, then perhaps we should just get rid of the ability to dump pg_catalog. But I think the attached patch clarifies things regardless. -- Jeff Davis PostgreSQL Contributor Team - AWS
Вложения
On Wed, Jun 07, 2023 at 10:36:29AM -0700, Jeff Davis wrote: > If we went almost two release cycles without anyone noticing, then > perhaps we should just get rid of the ability to dump pg_catalog. But I > think the attached patch clarifies things regardless. That looks correct for v15 and HEAD, not when dumping from older versions. Attempting the same command with a patched pg_dump on ~14 triggers more failures. The first assertion hit is this one, for instance: $ pg_dump -n pg_catalog pg_dump: pg_dump.c:13460: dumpCollation: Assertion `colliculocale != ((void *)0)' failed. colliculocale and collicurules don't exist there, but the code is forcing the value to NULL as PQfnumber() returns -1 for an attribute that does not exist. else if (collprovider[0] == 'i') { + Assert(colliculocale != NULL); + Assert(collcollate == NULL); + Assert(collctype == NULL); If ICU provides the collation, colliculocale is NULL when dumping from ~14, while both collcollate and collctype are not NULL. Hence all three assertions are incorrect for ~14, while they are correct for 15~. It seems to me that you are going to require a per-version handling here if you wish to keep collprovider on top of the rest. I'd like to agree with you about the fact that having collprovider handled before the rest makes things easier to follow in the future. At least that's my feeling after looking at your patch. -- Michael
Вложения
On Mon, 2023-06-12 at 15:52 +0900, Michael Paquier wrote: > Attempting the same command with a patched pg_dump on ~14 triggers > more failures. The first assertion hit is this one, for instance: > $ pg_dump -n pg_catalog > pg_dump: pg_dump.c:13460: dumpCollation: Assertion `colliculocale != > ((void *)0)' failed. Thank you, new patch attached. I changed the Assert()s (the old ones as well as the ones I added) into pg_log_warning()s. I don't think it makes sense to assert based on the catalog contents in pg_dump -- other places just warn. Also added a test. > I'd like to agree with you about the fact that having collprovider > handled before the rest makes things easier to follow in the future. > At least that's my feeling after looking at your patch. The new patch does add about 40 net lines, though. Given that it fixes an obscure issue, I'm not even sure if it should be backported -- perhaps it could just go in 17 as more of a cleanup than a bugfix? Even aside from the issue fixed by this patch, there are other warnings emitted when dumping pg_catalog related to types with typtype='p' and typisdefined=t. Regards, Jeff Davis
Вложения
On Thu, Jun 15, 2023 at 03:47:59PM -0700, Jeff Davis wrote: > The new patch does add about 40 net lines, though. I have looked at dumps generated from servers down to 9.2, and these seemed OK. > Given that it fixes an obscure issue, I'm not even sure if it should be > backported -- perhaps it could just go in 17 as more of a cleanup than > a bugfix? Not sure to agree about this point. The first version where this happens is v15, meaning that two branches are impacted currently. So the cost of backpatch is minimal. > Even aside from the issue fixed by this patch, there are other warnings > emitted when dumping pg_catalog related to types with typtype='p' and > typisdefined=t. Right, there are 14 of them, except _record. This is actually much, much older than the collation bits. -- Michael
Вложения
On Fri, 2023-06-16 at 14:29 +0900, Michael Paquier wrote: > Not sure to agree about this point. The first version where this > happens is v15, meaning that two branches are impacted currently. So > the cost of backpatch is minimal. OK, I will plan to commit and backport early next week unless others have additional input. Regards, Jeff Davis
On Fri, 2023-06-16 at 10:29 -0700, Jeff Davis wrote: > OK, I will plan to commit and backport early next week unless others > have additional input. Done. Sorry for the long delay -- lost track of this one. Regards, Jeff Davis
On 16.06.23 00:47, Jeff Davis wrote: > else > { > appendPQExpBufferStr(q, ", lc_collate = "); > - appendStringLiteralAH(q, collcollate, fout); > + appendStringLiteralAH(q, collcollate ? collcollate : "", fout); > appendPQExpBufferStr(q, ", lc_ctype = "); > - appendStringLiteralAH(q, collctype, fout); > + appendStringLiteralAH(q, collctype ? collctype : "", fout); > } > This new code replaces an (unexpected) null value with an empty string. Is that correct? Empty locale strings have specific behaviors under the libc and icu providers. It seems to me it would be better to error out or don't print a CREATE COLLATION command at all rather than substituting a value that is not a correct substitute. Alternatively, a more correct way to produce a null value would be to not issue the assignment at all (omit the " lc_collation = " etc.). Then we can leave it up to the backend to accept the command or not, but at least the dump closely reflects the original catalog state.
On Tue, 2023-09-05 at 11:45 +0200, Peter Eisentraut wrote: > This new code replaces an (unexpected) null value with an empty > string. > Is that correct? Empty locale strings have specific behaviors under > the > libc and icu providers. It seems to me it would be better to error > out > or don't print a CREATE COLLATION command at all rather than > substituting a value that is not a correct substitute. It already issues a warning, which seemed more consistent with other unexpected situations in pg_dump. > Alternatively, a more correct way to produce a null value would be to > not issue the assignment at all (omit the " lc_collation = " etc.). > Then we can leave it up to the backend to accept the command or not, > but > at least the dump closely reflects the original catalog state. The backend does not accept a collation specified with lc_ctype and not lc_collate (or vice versa), so it would be an un-restorable dump. That doesn't seem great. Regards, Jeff Davis