Re: [HACKERS] Hash Functions
From | Robert Haas |
---|---|
Subject | Re: [HACKERS] Hash Functions |
Date | |
Msg-id | CA+TgmoZ-WCDxkjiHfzKv4Fa18NK-M2d7g1-1pGK2EKMTqGJ1GA@mail.gmail.com |
In reply to | Re: [HACKERS] Hash Functions (Jeff Davis <pgsql@j-davis.com>) |
Responses | Re: [HACKERS] Hash Functions (Jeff Davis <pgsql@j-davis.com>) |
List | pgsql-hackers |

On Tue, May 16, 2017 at 11:10 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> With hash partitioning:
> * User only specifies number of partitions of the parent table; does
>   not specify individual partition properties (modulus, etc.)
> * Dump/reload goes through the parent table (though we may provide
>   options so pg_dump/restore can optimize this)
> * We could provide syntax to adjust the number of partitions, which
>   would be expensive but still useful sometimes.
> * All DDL should be on the parent table, including check constraints,
>   FKs, unique constraints, exclusion constraints, indexes, etc.
>   - Unique and exclusion constraints would only be permitted if the
>     keys are a superset of the partition keys.
>   - FKs would only be permitted if the two table's partition schemes
>     match and the keys are members of the same hash opfamily (this could
>     be relaxed slightly, but it gets a little confusing if so)
> * No attach/detach of partitions
> * All partitions have the same permissions
> * Individual partitions would only be individually-addressable for
>   maintenance (like reindex and vacuum), but not for arbitrary queries
>   - perhaps also COPY for bulk loading/dumping, in case we get clients
>     smart enough to do their own hashing.

I don't really find this a very practical design. If the table
partitions are spread across different relfilenodes, then those
relfilenodes have to have separate pg_class entries and separate
indexes, and those indexes also need to have separate pg_class
entries. Otherwise, nothing works. And if they do have separate
pg_class entries, then the partitions have to have their own names,
and likewise for their indexes, and a dump-and-reload has to preserve
those names. If it doesn't, and those objects get new system-assigned
names after the dump-and-reload, then dump restoration can fail when a
system-assigned name collides with an existing name that is first
mentioned later in the dump.
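
To make the collision concrete, here is a minimal sketch in Python. It is a toy model, not PostgreSQL code: the ToyCatalog class, the "<parent>_pN" naming rule, and the dump layout are all hypothetical, chosen only to show how a name picked by the system during restore can clash with a name that appears later in the dump.

```python
class DuplicateRelation(Exception):
    """Raised when a relation name is already taken."""


class ToyCatalog:
    """Toy stand-in for pg_class: relation names must be unique."""

    def __init__(self):
        self.relnames = set()

    def create(self, relname):
        # An explicitly named relation fails if the name is taken.
        if relname in self.relnames:
            raise DuplicateRelation(relname)
        self.relnames.add(relname)

    def _system_name(self, base):
        # Hypothetical rule: the system appends _1, _2, ... until the name is free.
        candidate, n = base, 0
        while candidate in self.relnames:
            n += 1
            candidate = f"{base}_{n}"
        return candidate

    def create_hash_partitioned(self, parent, nparts):
        # Only the parent is named by the user; partitions get system-assigned names.
        self.create(parent)
        for i in range(nparts):
            self.create(self._system_name(f"{parent}_p{i}"))


def restore(catalog, entries):
    """Replay entries in order; return the name that failed, or None on success."""
    for kind, name, nparts in entries:
        try:
            if kind == "hash_parent":
                catalog.create_hash_partitioned(name, nparts)
            else:
                catalog.create(name)
        except DuplicateRelation as exc:
            return str(exc)
    return None


# Order in which the objects were originally created: the plain table came first,
# so the system steered around its name when naming partitions.
creation_order = [("table", "orders_p1", 0), ("hash_parent", "orders", 4)]

# Order in which a dump happens to list them: the partitioned parent comes first.
dump_order = [("hash_parent", "orders", 4), ("table", "orders_p1", 0)]

print(restore(ToyCatalog(), creation_order))
print(restore(ToyCatalog(), dump_order))
```

In the original creation order everything succeeds, because the system-assigned name avoids the existing table. In the dump order the partition grabs the name first and the later explicit CREATE fails, which is the failure mode described above.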

If we had the ability to have anonymous pg_class entries -- relations
that have no names -- then maybe it would be possible to make
something like what you're talking about work. But that does not seem
easy to do. There's a unique index on (relname, relnamespace) for
good reason, and we can't make it partial on a system catalog. We
could make the relname column allow nulls, but that would add overhead
to any code that needs to access the relation name, and there's a fair
amount of that. Similarly, if we had the ability to associate
multiple relfilenodes with a single relation, and if index entries
could point to <which-relfilenode, block, offset> rather than just
<block, offset>, then we could also make this work. But either of
those things would require significant re-engineering and would have
downsides in other cases.

If Java has portable hash functions, why can't we?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
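
As an illustration of what "portable" means in the Java comparison: Java documents String.hashCode() as plain 32-bit integer arithmetic over the string's UTF-16 code units, s[0]*31^(n-1) + ... + s[n-1], so the result does not depend on the platform. The sketch below reproduces that formula in Python. The function name is made up for this example, and this is Java's string hash only, not any hash function PostgreSQL uses.

```python
def java_string_hash(s: str) -> int:
    """Reproduce the value Java specifies for String.hashCode()."""
    # Java strings are sequences of UTF-16 code units; encode without a BOM.
    data = s.encode("utf-16-be")
    h = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        # Java int arithmetic wraps modulo 2**32.
        h = (31 * h + unit) & 0xFFFFFFFF
    # Reinterpret the unsigned result as a signed 32-bit value, like a Java int.
    return h - (1 << 32) if h >= (1 << 31) else h


print(java_string_hash("hello"))  # 99162322, same as "hello".hashCode() in Java
```

Because every step is defined on fixed-width values rather than on the machine's native byte layout, any implementation that follows the same steps produces the same number, which is the property a dump taken on one machine and restored on another would need from a partitioning hash.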