Re: [PATCH] Compression dictionaries for JSONB
From | Aleksander Alekseev |
---|---|
Subject | Re: [PATCH] Compression dictionaries for JSONB |
Date | |
Msg-id | CAJ7c6TPSN06C+5cYSkyLkQbwN1C+pUNGmx+VoGCA-SPLCszC8w@mail.gmail.com |
In reply to | Re: [PATCH] Compression dictionaries for JSONB (Nikita Malakhov <hukutoc@gmail.com>) |
List | pgsql-hackers |
Hi hackers,

I would like to continue discussing compression dictionaries.

> So I summarized the requirements we agreed on so far and ended up with
> the following list: [...]

Again, here is the summary of our current agreements, at least how I
understand them. Please feel free to correct me where I'm wrong.

We are going to focus on supporting the:

```
SET COMPRESSION lz4 [WITH|WITHOUT] DICTIONARY
```

... syntax for now. From the UI perspective the rest of the agreements
didn't change compared to the previous summary.

In discussion [1] (cc: Robert) we agreed to use va_tag != 18 for the
on-disk TOAST pointer representation to make TOAST pointers extendable.
If va_tag has a different value (currently it's always 18), the TOAST
pointer is followed by a utf8-like varint bitmask. This bitmask
determines the rest of the content of the TOAST pointer and its overall
size. This will let us extend TOAST pointers to include dictionary_id,
and also extend them in the future, e.g. to support ZSTD and other
compression algorithms, use 64-bit TOAST pointers, etc.

Several things occurred to me:

- Does anyone believe that va_tag should be part of the utf8-like
  bitmask in order to save a byte or two?

- The described approach means that compression dictionaries are not
  going to be used when data is compressed in-place (i.e. within a
  tuple), since no TOAST pointer is involved in this case. Also, we will
  be unable to add additional compression algorithms here. Does anyone
  have problems with this? Should we use the reserved compression
  algorithm id instead as a marker of an extended TOAST?

- It would be nice to decompose the feature into several independent
  patches, e.g. modify TOAST first, then add compression dictionaries
  without automatic update of the dictionaries, then add the automatic
  update. However, I find it difficult to imagine how to modify TOAST
  pointers and test the code properly without a dependency on a larger
  feature.
  Could anyone think of a trivial test case for extendable TOAST? Maybe
  something we could add to src/test/modules, similar to how we test
  SLRU, background workers, etc.

[1]: https://www.postgresql.org/message-id/flat/CAN-LCVMq2X%3Dfhx7KLxfeDyb3P%2BBXuCkHC0g%3D9GF%2BJD4izfVa0Q%40mail.gmail.com

--
Best regards,
Aleksander Alekseev
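
To make the "va_tag != 18 followed by a utf8-like varint bitmask" idea from the message above more concrete, here is a minimal sketch in C. It is not taken from the patch. The exact varint encoding, the macro names, the `EXT_HAS_DICTIONARY_ID` bit, and the 4-byte width of dictionary_id are all assumptions made for illustration; the thread only states that a bitmask follows the tag and determines the remaining content and size of the pointer. The assumed encoding takes the number of leading 1 bits in the first byte as the number of extra bytes, much like UTF-8 derives a sequence length from its lead byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The message says va_tag is currently always 18 for on-disk pointers. */
#define CLASSIC_ONDISK_TAG 18

/* Hypothetical feature bit; the thread does not define a bit layout. */
#define EXT_HAS_DICTIONARY_ID 0x01u

/*
 * Decode an assumed UTF-8-like varint:
 *   0xxxxxxx                 -> 1 byte,  7 payload bits
 *   10xxxxxx + 1 more byte   -> 2 bytes, 14 payload bits
 *   110xxxxx + 2 more bytes  -> 3 bytes, 21 payload bits
 *   1110xxxx + 3 more bytes  -> 4 bytes, 28 payload bits
 * Returns the number of bytes consumed, or 0 if the input is truncated
 * or uses a longer (unsupported) form.
 */
static size_t
decode_ext_bitmask(const uint8_t *buf, size_t len, uint32_t *mask)
{
    size_t   extra = 0;
    size_t   i;
    uint8_t  first;
    uint32_t value;

    assert(buf != NULL && mask != NULL);

    if (len == 0)
        return 0;
    first = buf[0];

    /* Each leading 1 bit announces one additional byte. */
    while (extra < 4 && (first & (0x80u >> extra)))
        extra++;
    if (extra > 3 || len < extra + 1)
        return 0;

    /* Payload bits of the first byte sit below the terminating 0 bit. */
    value = first & (0x7Fu >> extra);
    for (i = 1; i <= extra; i++)
        value = (value << 8) | buf[i];

    *mask = value;
    return extra + 1;
}

/*
 * Size in bytes of the optional fields announced by the bitmask.
 * Only the hypothetical dictionary_id field is modelled here.
 */
static size_t
ext_payload_size(uint32_t mask)
{
    size_t size = 0;

    if (mask & EXT_HAS_DICTIONARY_ID)
        size += sizeof(uint32_t);   /* assumed 4-byte dictionary_id */
    return size;
}
```

Under these assumptions, a reader that sees a tag other than `CLASSIC_ONDISK_TAG` would call `decode_ext_bitmask()` on the bytes after the tag and add `ext_payload_size()` to learn the full pointer length. A self-contained decoder of this kind is also the sort of thing a small module under src/test/modules could exercise without depending on the dictionary feature.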