Re: alternative compression algorithms?
From | Tomas Vondra |
---|---|
Subject | Re: alternative compression algorithms? |
Date | |
Msg-id | 5541816A.1000303@2ndquadrant.com |
In response to | Re: alternative compression algorithms? (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: alternative compression algorithms? |
List | pgsql-hackers |
On 04/30/15 02:42, Robert Haas wrote:
> On Wed, Apr 29, 2015 at 6:55 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> I'm not convinced not compressing the data is a good idea - I suspect it
>> would only move the time to TOAST and increase memory pressure (in general
>> and in shared buffers). But I think that using a more efficient compression
>> algorithm would help a lot.
>>
>> For example, when profiling the multivariate stats patch (with multiple
>> quite large histograms), the pglz_decompress is #1 in the profile, occupying
>> more than 30% of the time. After replacing it with the lz4, the data are a
>> bit larger, but it drops to ~0.25% in the profile and the planning time
>> drops proportionally.
>
> That seems to imply a >100x improvement in decompression speed. Really???

Sorry, that was a bit of a misleading overstatement. The profiles (same dataset, same workload) look like this:

pglz_decompress
---------------
  44.51%  postgres          [.] pglz_decompress
  13.60%  postgres          [.] update_match_bitmap_histogram
   8.40%  postgres          [.] float8_cmp_internal
   7.43%  postgres          [.] float8lt
   6.49%  postgres          [.] deserialize_mv_histogram
   6.23%  postgres          [.] FunctionCall2Coll
   4.06%  postgres          [.] DatumGetFloat8
   3.48%  libc-2.18.so      [.] __isnan
   1.26%  postgres          [.] clauselist_mv_selectivity
   1.09%  libc-2.18.so      [.] __memcpy_sse2_unaligned

lz4
---
  18.05%  postgres          [.] update_match_bitmap_histogram
  11.67%  postgres          [.] float8_cmp_internal
  10.53%  postgres          [.] float8lt
   8.67%  postgres          [.] FunctionCall2Coll
   8.52%  postgres          [.] deserialize_mv_histogram
   5.52%  postgres          [.] DatumGetFloat8
   4.90%  libc-2.18.so      [.] __isnan
   3.92%  liblz4.so.1.6.0   [.] 0x0000000000002603
   2.08%  liblz4.so.1.6.0   [.] 0x0000000000002847
   1.81%  postgres          [.] clauselist_mv_selectivity
   1.47%  libc-2.18.so      [.] __memcpy_sse2_unaligned
   1.33%  liblz4.so.1.6.0   [.] 0x000000000000260f
   1.16%  liblz4.so.1.6.0   [.] 0x00000000000025e3

(and then a long tail of other lz4 calls)

The difference used to be more significant, but I've done a lot of improvements in the update_match_bitmap method (so the lz4 methods are more significant).

The whole script (doing a lot of estimates) takes 1:50 with pglz and only 1:25 with lz4. That's ~25-30% improvement.

The results are slightly unreliable because they were collected in a Xen VM, and the overhead is non-negligible (but the same in both cases). I wouldn't be surprised if the difference was more significant without the VM.

-- 
Tomas Vondra                http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
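
As a rough check of the timing figures above, here is a minimal Python sketch that converts the two wall-clock times into a runtime reduction and a speedup factor. It assumes that 1:50 and 1:25 are minutes:seconds; the helper names `to_seconds` and `compare` are made up for illustration and are not part of any patch.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def compare(baseline: str, candidate: str) -> dict:
    """Compare two wall-clock times given as 'minutes:seconds' strings."""
    base = to_seconds(baseline)
    cand = to_seconds(candidate)
    return {
        "baseline_s": base,
        "candidate_s": cand,
        # Share of the baseline runtime that was saved.
        "time_reduction_pct": 100.0 * (base - cand) / base,
        # How many times faster the candidate run is.
        "speedup": base / cand,
    }


# Assumption: 1:50 (pglz) and 1:25 (lz4) are minutes:seconds.
result = compare("1:50", "1:25")
print(f"pglz: {result['baseline_s']} s, lz4: {result['candidate_s']} s")
print(f"runtime reduction: {result['time_reduction_pct']:.1f}%")
print(f"speedup: {result['speedup']:.2f}x")
```

Under that assumption the runs take 110 s and 85 s, so the runtime drops by about 23% and throughput rises by about 29%, which brackets the "~25-30%" figure depending on which of the two measures is meant.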