Re: jsonb format is pessimal for toast compression
От | Arthur Silva |
---|---|
Тема | Re: jsonb format is pessimal for toast compression |
Дата | |
Msg-id | CAO_YK0VzGrtndbQiUL67-aRxD0t2-YsC9KREgZmJCOJAz6AKhg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: jsonb format is pessimal for toast compression (Josh Berkus <josh@agliodbs.com>) |
Ответы |
Re: jsonb format is pessimal for toast compression
|
Список | pgsql-hackers |
On Thu, Sep 11, 2014 at 10:01 PM, Josh Berkus <josh@agliodbs.com> wrote:
So, I finally got time to test Tom's latest patch on this.
TLDR: we want to go with Tom's latest patch and release beta3.
Figures:
So I tested HEAD against the latest lengths patch. Per Arthur Silva, I
checked uncompressed times for JSONB against compressed times. This
changed the picture considerably.
TABLE SIZES
-----------
HEAD
?column? | pg_size_pretty
---------------------+----------------
json text format | 393 MB
jsonb: compressed | 1147 MB
jsonb: uncompressed | 1221 MB
PATCHED
?column? | pg_size_pretty
---------------------+----------------
json text format | 394 MB
jsonb: compressed | 525 MB
jsonb: uncompressed | 1200 MB
EXTRACTION TIMES
----------------
HEAD
Q1 (search via GIN index followed by extracting 100,000 values from rows):
jsonb compressed: 4000
jsonb uncompressed: 3250
Q2 (seq scan and extract 200,000 values from rows):
json: 11700
jsonb compressed: 3150
jsonb uncompressed: 2700
PATCHED
Q1:
jsonb compressed: 6750
jsonb uncompressed: 3350
Q2:
json: 11796
jsonb compressed: 4700
jsonb uncompressed: 2650
----------------------
Conclusion: with Tom's patch, compressed JSONB is 55% smaller when
compressed (EXTENDED). Extraction times are 50% to 70% slower, but this
appears to be almost entirely due to decompression overhead. When not
compressing (EXTERNAL), extraction times for patch versions are
statistically the same as HEAD, and file sizes are similar to HEAD.
USER REACTION
-------------
I polled at both PDXpgDay and at FOSS4G, asking some ~~ 80 Postgres
users how they would feel about a compression vs. extraction time
tradeoff. The audience was evenly split.
However, with the current patch, the user can choose. Users who know
enough for performance tuning can set JSONB columns to EXTERNAL, and the
the same performance as the unpatched version.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
The compression ratio difference is exaggerated in this case but it does support that Tom's patch alleviates the extraction penalty.
In my testings with the github archive data the savings <-> performance-penalty was fine, but I'm not confident in those results since there were only 8 top level keys.
For comparison, some twitter api objects[1] have 30+ top level keys. If I have time in the next couple of days I'll conduct some testings with the public twitter fire-hose data.
[1] https://dev.twitter.com/rest/reference/get/statuses/home_timeline
В списке pgsql-hackers по дате отправления: