Re: Add ZSON extension to /contrib/
From | Tomas Vondra |
---|---|
Subject | Re: Add ZSON extension to /contrib/ |
Date | |
Msg-id | b53392fa-47e3-a2e6-a8e1-6329d1d74da6@enterprisedb.com |
In reply to | Re: Add ZSON extension to /contrib/ (Andrew Dunstan <andrew@dunslane.net>) |
List | pgsql-hackers |
On 5/27/21 4:15 AM, Andrew Dunstan wrote:
>
> On 5/26/21 5:29 PM, Bruce Momjian wrote:
>> On Tue, May 25, 2021 at 01:55:13PM +0300, Aleksander Alekseev wrote:
>>> Hi hackers,
>>>
>>> Back in 2016 while being at PostgresPro I developed the ZSON extension [1]. The
>>> extension introduces the new ZSON type, which is 100% compatible with JSONB but
>>> uses a shared dictionary of strings most frequently used in given JSONB
>>> documents for compression. These strings are replaced with integer IDs.
>>> Afterward, PGLZ (and now LZ4) applies if the document is large enough by common
>>> PostgreSQL logic. Under certain conditions (many large documents), this saves
>>> disk space, memory and increases the overall performance. More details can be
>>> found in README on GitHub.
>> I think this is interesting because it is one of the few cases that
>> allow compression outside of a single column. Here is a list of
>> compression options:
>>
>> https://momjian.us/main/blogs/pgblog/2020.html#April_27_2020
>>
>> 1. single field
>> 2. across rows in a single page
>> 3. across rows in a single column
>> 4. across all columns and rows in a table
>> 5. across tables in a database
>> 6. across databases
>>
>> While standard Postgres does #1, ZSON allows 2-5, assuming the data is
>> in the ZSON data type. I think this cross-field compression has great
>> potential for cases where the data is not relational, or hasn't had time
>> to be structured relationally. It also opens questions of how to do
>> this cleanly in a relational system.
>>
>
> I think we're going to get the best bang for the buck on doing 2, 3, and
> 4. If it's confined to a single table then we can put a dictionary in
> something like a fork.

Agreed.

> Maybe given partitioning we want to be able to do multi-table
> dictionaries, but that's less certain.
>

Yeah. I think it'll have many of the same issues/complexity as global
indexes, and the gains are likely limited. At least assuming the
partitions are sufficiently large, but tiny partitions are inefficient
in general, I think.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
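
For readers following the thread, the shared-dictionary idea quoted above (frequently used strings replaced with integer IDs) can be sketched roughly as follows. This is only an illustration in Python, not ZSON's actual code or storage format: the function names `build_dictionary`, `encode` and `decode`, the `("ref", id)` tuple marker, and the sample documents are all assumptions made up for this sketch.

```python
from collections import Counter


def build_dictionary(docs, max_entries=256):
    """Map strings that occur more than once across all documents to integer IDs."""
    counts = Counter()

    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                counts[key] += 1
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)
        elif isinstance(node, str):
            counts[node] += 1

    for doc in docs:
        walk(doc)
    frequent = [s for s, n in counts.most_common(max_entries) if n > 1]
    return {s: i for i, s in enumerate(frequent)}


def encode(node, dictionary):
    """Replace dictionary strings (keys and values) with ("ref", id) tuples."""
    if isinstance(node, dict):
        return {encode(k, dictionary): encode(v, dictionary) for k, v in node.items()}
    if isinstance(node, list):
        return [encode(item, dictionary) for item in node]
    if isinstance(node, str) and node in dictionary:
        # JSON has no tuple type, so a tuple unambiguously marks a reference.
        return ("ref", dictionary[node])
    return node


def decode(node, reverse):
    """Invert encode() using the ID-to-string mapping."""
    if isinstance(node, tuple):
        return reverse[node[1]]
    if isinstance(node, dict):
        return {decode(k, reverse): decode(v, reverse) for k, v in node.items()}
    if isinstance(node, list):
        return [decode(item, reverse) for item in node]
    return node


# Hypothetical sample documents sharing keys and values.
docs = [
    {"user_id": 1, "status": "active", "tags": ["a", "b"]},
    {"user_id": 2, "status": "active", "tags": ["b"]},
    {"user_id": 3, "status": "inactive"},
]
dictionary = build_dictionary(docs)
reverse = {i: s for s, i in dictionary.items()}
encoded = [encode(d, dictionary) for d in docs]
decoded = [decode(e, reverse) for e in encoded]
```

Because the dictionary lives outside any single document, every document that shares those strings benefits from it, which is what makes the scope of the dictionary (per table, per partition, or across tables) the design question discussed in this message.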