[PATCHES] Post-special page storage TDE support
From | David Christensen |
---|---|
Subject | [PATCHES] Post-special page storage TDE support |
Date | |
Msg-id | CAOxo6XKWHHUr1agOZxEHuL-UW8Me3YndUsJ=09tcDiw+Ld8YEw@mail.gmail.com |
Responses |
Re: [PATCHES] Post-special page storage TDE support
Re: [PATCHES] Post-special page storage TDE support |
List | pgsql-hackers |
Hi -hackers,

An additional piece that I am working on for improving infra for TDE features is allowing the storage of additional per-page data. Rather than hard-coding the idea of a specific struct, this uses a new, more dynamic structure to associate page offsets with a particular feature that may or may not be present for a given cluster. I am calling this generic structure a PageFeature/PageFeatureSet (better names welcome). It is defined for a cluster at initdb/bootstrap time and reserves a given amount of trailing space on the Page, which is then parceled out to the consumers of said space.

While the immediate need that this feature fills is storage of encryption tags for XTS-based encryption on the pages themselves, this can also be used for any optional features; as an example I have implemented expanded checksum support (both 32- and 64-bit), as well as a self-describing "wasted space" feature, which just allocates trailing space from the page (obviously intended as illustration only).

There are 6 commits in this series:

0001 - adds the `reserved_page_space` global, making various size calculations and limits dynamic, adjusting access methods to offset special space, and ensuring that we can safely reserve allocated space from the end of pages.

0002 - test suite stability fixes. The change in the number of tuples per page means that some assumptions the tests made about ordering now break.

0003 - the "PageFeatures" commit, the meat of this feature (see following description).

0004 - page_checksum32 feature. Stores the full 32-bit checksum across the existing pd_checksum field as well as 2 bytes from reserved_page_space. This is more of a demo of what could be done here than a practical feature.

0005 - wasted space PageFeature. Just uses up space. An additional feature we can turn on/off to see how multiple features interact. Only for illustration.

0006 - 64-bit checksums, fully allocated from reserved_page_space. Using an MIT-licensed 64-bit checksum, but if we determined we'd want to do this we'd probably roll our own.

From the commit message for PageFeatures:

Page features are a standardized way of assigning and using dynamic space usage from the tail end of a disk page. These features are set at cluster init time (so configured via `initdb` and initialized via the bootstrap process) and affect all disk pages.

A PageFeatureSet is effectively a bitflag of all configured features, each of which has a fixed size. If not using any PageFeatures, the storage overhead of this is 0.

Rather than using a variable location struct, an implementation of a PageFeature is responsible for an offset and a length in the page. The current API returns only a pointer to the page location for the implementation to manage, and no further checks are done to ensure that only the expected memory is accessed.

Access to the underlying memory is synonymous with determining whether a given cluster is using an underlying PageFeature, so code paths can do something like:

    char *loc;

    if ((loc = ClusterGetPageFeatureOffset(page, PF_MY_FEATURE_ID)))
    {
        // ipso facto this feature is enabled in this cluster *and* we know the memory address
        ...
    }

Since this is direct memory access to the underlying Page, ensure the buffer is pinned. Explicit locking (assuming you stay in your lane) should only need to guard against access from other backends of this type if using shared buffers, so it will be use-case dependent.

This does have a runtime overhead due to moving some offset calculations from compile time to runtime. It is thought that the utility of this feature will outweigh the costs here.

Candidates for page features include 32-bit or 64-bit checksums, encryption tags, or additional per-page metadata.

While we are not currently getting rid of the pd_checksum field, this mechanism could be used to free up those 16 bits for some other purpose. One such purpose might be to mirror the cluster-wide PageFeatureSet, currently also a uint16, which would mean the entirety of this scheme could be reflected in a given page, opening up per-relation or even per-page setting/metadata here. (We'd presumably need to snag a pd_flags bit to interpret pd_checksum that way, but it would be an interesting use.)

Discussion is welcome and encouraged!

Thanks,

David
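
For illustration only, here is a minimal standalone sketch of the bitflag-and-offset scheme described in the commit message above. It is not the code from the patch series: every name in it (`SketchPageFeatureSet`, `sketch_get_page_feature`, and so on), the feature sizes, the 8192-byte page size, and the choice to pack enabled features from the end of the page in feature-ID order are assumptions made for the example. It only shows how a uint16 feature set with fixed per-feature sizes can yield both the total reserved trailing space and a per-feature pointer that is NULL when the feature is not enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLCKSZ 8192      /* assumed page size for this sketch */

/* Hypothetical feature IDs; bit i of the feature set enables feature i. */
typedef enum
{
    SK_PF_CHECKSUM64 = 0,
    SK_PF_ENCRYPTION_TAG = 1,
    SK_PF_WASTED_SPACE = 2,
    SK_PF_COUNT
} SketchPageFeature;

/* The set of enabled features, one bit per feature. */
typedef uint16_t SketchPageFeatureSet;

/* Fixed per-feature sizes in bytes (illustrative values only). */
static const uint16_t sketch_feature_size[SK_PF_COUNT] = {8, 16, 32};

/* Total trailing bytes reserved on every page for the enabled features. */
static uint16_t
sketch_reserved_page_space(SketchPageFeatureSet set)
{
    uint16_t    total = 0;

    for (int i = 0; i < (int) SK_PF_COUNT; i++)
    {
        if (set & (1u << i))
            total += sketch_feature_size[i];
    }
    return total;
}

/*
 * Return a pointer to the feature's bytes inside the page, or NULL if the
 * feature is not enabled. Assumed layout: enabled features are packed
 * backwards from the end of the page in ascending feature-ID order.
 */
static char *
sketch_get_page_feature(char *page, SketchPageFeatureSet set,
                        SketchPageFeature id)
{
    size_t      offset = SKETCH_BLCKSZ;

    assert((int) id >= 0 && (int) id < (int) SK_PF_COUNT);

    if (!(set & (1u << id)))
        return NULL;

    /* Step back past every enabled feature up to and including this one. */
    for (int i = 0; i <= (int) id; i++)
    {
        if (set & (1u << i))
            offset -= sketch_feature_size[i];
    }
    return page + offset;
}
```

With the 64-bit checksum and wasted-space features enabled in this sketch, 40 bytes are reserved per page, and with an empty set nothing is reserved, matching the zero-overhead property described above. Looking up a feature that is not in the set returns NULL, which is the same "enabled check and address in one call" pattern as the `ClusterGetPageFeatureOffset` usage shown in the commit message.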