Re: [SQL] Re: lztext and compression ratios...
От | JanWieck@t-online.de (Jan Wieck) |
---|---|
Тема | Re: [SQL] Re: lztext and compression ratios... |
Дата | |
Msg-id | 200007062109.XAA19800@hot.jw.home обсуждение исходный текст |
Ответ на | Re: lztext and compression ratios... (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-general |
Tom Lane wrote: > JanWieck@t-online.de (Jan Wieck) writes: > >> As long as you brought it up: how sure are you that the method you've > >> used is not subject to any patents? > > > Now that you ask for it: I'm not sure. Could be. > > >> If you can show that this method uses no ideas not found in zlib, > >> then I'll feel reassured > > > To do so I don't know enough about the algorithms used in > > zlib. Is there someone out here who could verify that if I > > detailed enough describe what our compression code does? > > After a quick look at the code, I don't think there is anything > problematic about the data representation or the decompression > algorithm. The compression algorithm is another story, and it's > not real well commented :-(. The important issues are how you > search for matches in the past text and how you decide which match > is the best one to use. Please update the code comments to describe > that, and I'll take another look. Done. You'll find a new section in the top comments. While writing it I noticed that the algorithm is really expensive for big items. The history lookup table allocated is 8 times (on 32 bit architectures) the size of the input. So if you want to have 1MB compressed, it'll allocate 8MB for the history. It hit me when I was hunting a bug in the toaster earlier today. Doing an update to a toasted item of 5MB, resulting in a new value of 10MB, the backend blew up to 290MB of virtual memory - oh boy. I definitely need to make that smarter. When I wrote it I never thought about items that big. It was before we had the idea of TOAST. This all might open another discussion I'll start in a separate thread. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
В списке pgsql-general по дате отправления: