[PATCH] Compression and on-disk sorting
From | Martijn van Oosterhout |
---|---|
Subject | [PATCH] Compression and on-disk sorting |
Date | |
Msg-id | 20060517161730.GI15180@svana.org |
Responses | Re: [PATCH] Compression and on-disk sorting |
List | pgsql-patches |
Pursuant to the discussions currently on -hackers, here's a patch that uses zlib to compress the tapes as they go to disk. I default to compression level 3 (think gzip -3).

Please speed test all you like. I *think* it's bug free, but you never know.

Outstanding questions:

- I use zlib because the builtin pg_lzcompress can't do what zlib does. Here we set up input and output buffers and zlib processes as much as it can (until the input is empty or the output is full). This means no marshalling is required. We can compress the whole file without having it in memory.

- zlib allocates memory for compression and decompression; I don't know how much. However, it allocates via the postgres mcxt system, so it shouldn't be too hard to find out. Simon pointed out that we'll need to track this because we might allow hundreds of tapes.

- Each tape is compressed as one long compressed stream. Currently no seeking is allowed, so only sorts, no joins! (As Tom said, quick and dirty numbers.) This should show this possibility in its best light, but if we want to support seeking we're going to need to change that. Maybe no compression on the last pass?

- It's probable that the benefits are strongly correlated with the speed of your disk subsystem. We need to measure this effect. I can't accurately measure this because my compiler doesn't inline any of the functions in tuplesort.c.

In my test, with a compression ratio of around 100-to-1, on 160MB of data with a tiny work_mem on my 5-year-old laptop, compression speeds the sort up by 60%, so it's obviously not a complete waste of time. Of course, YMMV :)

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
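
For readers who want to see the buffer-driven streaming described in the first question above, here is a minimal sketch using Python's standard zlib module. It is not the patch's C code: the function names, the CHUNK size, and the generator structure are assumptions made for illustration. Only the compression level (3) and the idea of one long compressed stream with no seeking come from the message.

```python
import zlib

CHUNK = 8192  # hypothetical buffer size, not taken from the patch


def compress_stream(chunks, level=3):
    """Compress an iterable of byte chunks as one continuous zlib stream."""
    comp = zlib.compressobj(level)
    for chunk in chunks:
        # zlib consumes the input and returns whatever output is ready;
        # it may return b"" while it is still buffering internally.
        out = comp.compress(chunk)
        if out:
            yield out
    # Finish the stream; after this the compressor cannot be reused.
    yield comp.flush()


def decompress_stream(pieces, out_size=CHUNK):
    """Decompress pieces sequentially, asking for out_size bytes at a time."""
    decomp = zlib.decompressobj()
    for piece in pieces:
        data = piece
        while data:
            # Cap the output per call; input that could not be processed
            # yet is kept in unconsumed_tail and fed back in.
            out = decomp.decompress(data, out_size)
            if out:
                yield out
            data = decomp.unconsumed_tail
    # Drain anything still pending (this final block is not size-capped).
    tail = decomp.flush()
    if tail:
        yield tail
```

Neither side ever holds the whole tape in memory, which is the property the message relies on. The stream can only be read front to back, which matches the "no seeking" limitation noted above.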