Re: Fix to enum hashing for dump and restore
От | Andrew |
---|---|
Тема | Re: Fix to enum hashing for dump and restore |
Дата | |
Msg-id | f80347af-12d0-40d3-b8e6-65eb79e6a5fb@app.fastmail.com обсуждение исходный текст |
Ответ на | Re: Fix to enum hashing for dump and restore (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-hackers |
Those are excellent points.
We will investigate adjusting pg_dump behavior,
as this is primarily a dump+restore issue.
Thank you!
-Andrew J Repp (VMware)
On Tue, Jan 24, 2023, at 9:56 PM, Tom Lane wrote:
Andrew <pgsqlhackers@andrewrepp.com> writes:> I have discovered a bug in one usage of enums. If a table with hash> partitions uses an enum as a partitioning key, it can no longer be> backed up and restored correctly. This is because enums are represented> simply as oids, and the hash function for enums hashes that oid to> determine partition distribution. Given the way oids are assigned, any> dump+restore of a database with such a table may fail with the error> "ERROR: new row for relation "TABLENAME" violates partition constraint".Ugh, that was not well thought out :-(. I suppose this isn't a problemfor pg_upgrade, which should preserve the enum value OIDs, but anordinary dump/restore will indeed hit this.> I have written a patch to fix this bug (attached), by instead having the> hashenum functions look up the enumsortorder ID of the value being> hashed. These are deterministic across databases, and so allow for> stable dump and restore.Unfortunately, I'm not sure those are as deterministic as all that.They are floats, so there's a question of roundoff error, not tomention cross-platform variations in what a float looks like. (At theabsolute minimum, I think we'd have to change your patch to forceconsistent byte ordering of the floats.) Actually though, roundofferror wouldn't be a problem for the normal exact-integer values ofenumsortorder. Where it could come into play is with the fractionalvalues used after you insert a value into the existing sort order.And then the whole idea fails, because a dump/restore won't duplicatethose fractional values.Another problem with this approach is that we can't get from here to therewithout a guaranteed dump/reload failure, since it's quite unlikely thatthe partition assignment will be the same when based on enumsortorderas it was when based on OIDs. Worse, it also breaks the pg_upgrade case.I wonder if it'd work to make pg_dump force --load-via-partition-rootmode when a hashed partition key includes an enum.regards, tom lane
В списке pgsql-hackers по дате отправления: