Re: [REVIEW] Re: Compression of full-page-writes
From | Michael Paquier |
---|---|
Subject | Re: [REVIEW] Re: Compression of full-page-writes |
Date | |
Msg-id | CAB7nPqTgC=9wzDpoxecitKwUcnDmNa8epMfGaC=U5Vz7b1ZUvw@mail.gmail.com |
In response to | Re: [REVIEW] Re: Compression of full-page-writes (Michael Paquier <michael.paquier@gmail.com>) |
List | pgsql-hackers |
On Mon, Mar 9, 2015 at 9:08 PM, Michael Paquier wrote:
> On Mon, Mar 9, 2015 at 4:29 PM, Fujii Masao wrote:
>> Thanks for updating the patch! Attached is the refactored version of the patch.

Fujii-san and I had a short chat about tuning the PGLZ strategy a bit, which is now PGLZ_strategy_default in the patch (at least 25% of compression, etc.). In particular, min_input_size, which is now set at 32B, is too low, and knowing that the minimum fillfactor of a relation page is 10%, this looks really too low.

For example, I used the extension attached to this email, which can compress and decompress bytea strings and which I developed after pglz was moved to libpgcommon (it also contains a function that returns a relation page without its hole, feel free to use it). With it I am seeing that we can gain quite a lot of space even with some incompressible data like UUIDs or random float data (pages are compressed without their hole):

1) Float table:

    =# create table float_tab (id float);
    CREATE TABLE
    =# insert into float_tab select random() from generate_series(1, 20);
    INSERT 0 20
    =# SELECT bytea_size(compress_data(page)) AS compress_size, bytea_size(page) AS raw_size_no_hole FROM get_raw_page('float_tab'::regclass, 0, false);
    -[ RECORD 1 ]----+----
    compress_size    | 329
    raw_size_no_hole | 744
    =# SELECT bytea_size(compress_data(page)) AS compress_size, bytea_size(page) AS raw_size_no_hole FROM get_raw_page('float_tab'::regclass, 0, false);
    -[ RECORD 1 ]----+-----
    compress_size    | 1753
    raw_size_no_hole | 4344

So that's more or less 60% saved...
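
The savings quoted for the float table can be checked directly from the reported sizes. The following is only a small arithmetic sketch; the helper name `saved_fraction` is made up for illustration and is not part of the patch or of the attached extension.

```python
def saved_fraction(compressed_size: int, raw_size: int) -> float:
    """Fraction of the raw page (without its hole) saved by compression."""
    return 1.0 - compressed_size / raw_size


# (compress_size, raw_size_no_hole) pairs as reported for float_tab.
float_pages = [(329, 744), (1753, 4344)]

for compressed, raw in float_pages:
    print(f"{raw} -> {compressed} bytes: {saved_fraction(compressed, raw):.1%} saved")
```

Under this calculation the smaller page saves a bit less than the larger one, which fits the rough "more or less 60%" figure.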
2) UUID table:

    =# SELECT bytea_size(compress_data(page)) AS compress_size, bytea_size(page) AS raw_size_no_hole FROM get_raw_page('uuid_tab'::regclass, 0, false);
    -[ RECORD 1 ]----+----
    compress_size    | 590
    raw_size_no_hole | 904
    =# insert into uuid_tab select gen_random_uuid() from generate_series(1, 100);
    INSERT 0 100
    =# SELECT bytea_size(compress_data(page)) AS compress_size, bytea_size(page) AS raw_size_no_hole FROM get_raw_page('uuid_tab'::regclass, 0, false);
    -[ RECORD 1 ]----+-----
    compress_size    | 3338
    raw_size_no_hole | 5304

And in this case we are close to 40% saved...

Knowing that the header alone uses at least 24B on a page, what about increasing min_input_size to something like 128B or 256B? I don't think that this is a blocker for this patch, as most relation pages are going to have far more data than that, so they will be unconditionally compressed. But there is definitely something we could do in this area later on; perhaps we could even improve the other parameters, like the compression rate. So that's something to keep in mind...

--
Michael
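
To get a feel for what a min_input_size of 128B or 256B would mean in practice, here is a rough size model of a heap page without its hole. Only the 24B page header comes from the message; the 4-byte line pointer and the 24-byte aligned tuple header are assumptions (typical for a 64-bit build, no null bitmap), and the function names are hypothetical. The model does reproduce the four raw sizes reported above, with 20 and 120 rows per page.

```python
import math

PAGE_HEADER = 24   # bytes used by the page header, as stated in the message
LINE_POINTER = 4   # assumption: one 4-byte line pointer per tuple
TUPLE_HEADER = 24  # assumption: heap tuple header padded to 8-byte alignment


def raw_size_no_hole(n_rows: int, data_width: int) -> int:
    """Modelled size of a page once the hole between pointers and tuples is removed."""
    return PAGE_HEADER + n_rows * (LINE_POINTER + TUPLE_HEADER + data_width)


def min_rows_to_reach(threshold: int, data_width: int) -> int:
    """Smallest row count whose modelled page size is at least `threshold` bytes."""
    per_row = LINE_POINTER + TUPLE_HEADER + data_width
    return max(0, math.ceil((threshold - PAGE_HEADER) / per_row))


# The model matches the sizes reported for float (8-byte) and uuid (16-byte) rows.
assert raw_size_no_hole(20, 8) == 744 and raw_size_no_hole(120, 8) == 4344
assert raw_size_no_hole(20, 16) == 904 and raw_size_no_hole(120, 16) == 5304

for threshold in (32, 128, 256):
    print(
        f"min_input_size={threshold}B: "
        f"{min_rows_to_reach(threshold, 8)} float rows, "
        f"{min_rows_to_reach(threshold, 16)} uuid rows"
    )
```

Under these assumptions a page crosses even the 256B threshold after only a handful of narrow rows, which is in line with the remark that most relation pages would still be compressed.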