lztext and compression ratios...
From | Jeffery Collins |
---|---|
Subject | lztext and compression ratios... |
Date | |
Msg-id | 39636960.FEF2400C@onyx-technologies.com |
Responses |
Re: lztext and compression ratios...
Re: lztext and compression ratios... |
List | pgsql-general |
I have been looking at using the lztext type and I have some questions/observations. Most of my experience comes from attempting to compress text records in a different database (CTREE), but I think the experience is transferable.

My typical table consists of variable-length text records. The average record length is around 1K bytes. I would like to compress my records to save space and improve I/O performance (smaller records mean more records fit into the file system cache, which means less I/O, or so the theory goes). I am not too concerned about CPU as we are using a 4-way Sun Enterprise class server. So compression seems like a good idea to me.

My experience with attempting to compress such a relatively small (around 1K) text string is that the compression ratio is not very good. This is because the string is not long enough for the LZ compression algorithm to establish really good compression patterns, and because the de-compression table has to be built into each record. What I have done in the past to get around these problems is that I have "taught" the compression algorithm the patterns ahead of time and stored the de-compression patterns in an external table. Using this technique, I have achieved *much* better compression ratios.

So my questions/comments are:

- What are the typical compression ratios on relatively small (i.e. around 1K) strings seen with lztext?
- Does anyone see a need/use for a generalized string compression type that can be "trained" external to the individual records?
- Am I crazy in even attempting to compress strings of this relative size?

My largest table currently contains about 2 million entries of roughly 1K strings, or about 2 GB of data. If I could compress this to about 33% of its original size (not unreasonable with a trained LZ compression), I would save a lot of disk space (not really important) and a lot of file system cache space (very important) and be able to fit the entire table into memory (very, very important).

Thank you,
Jeff
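To make the "trained" compression idea above concrete, here is a minimal sketch in Python using zlib's preset-dictionary support (the `zdict` argument). This is an illustration under stated assumptions, not how lztext works and not the CTREE scheme described in the message: zlib's deflate stands in for the LZ compressor, the sample records are invented, and the helper names (`compress_plain`, `compress_trained`, `decompress_trained`) are hypothetical. The "training" step is simply concatenating a few representative records into a shared dictionary that is stored once, outside the individual records.

```python
import zlib

# Hypothetical sample records (invented for illustration); real records
# would come from the table being compressed.
records = [
    (
        f"customer_id={i}; status=active; region=north-america; "
        f"plan=enterprise; notes=renewal scheduled for next quarter"
    ).encode()
    for i in range(100)
]

# "Training" (an assumption of this sketch): build a shared dictionary from
# a few representative records. It is stored once, externally, and must be
# available unchanged whenever a record is decompressed.
zdict = b"".join(records[:10])


def compress_plain(data: bytes) -> bytes:
    """Compress one record on its own, with no shared dictionary."""
    return zlib.compress(data, 9)


def compress_trained(data: bytes, dictionary: bytes) -> bytes:
    """Compress one record against the shared preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=dictionary)
    return compressor.compress(data) + compressor.flush()


def decompress_trained(blob: bytes, dictionary: bytes) -> bytes:
    """Decompress a record that was compressed with the same dictionary."""
    decompressor = zlib.decompressobj(zdict=dictionary)
    return decompressor.decompress(blob) + decompressor.flush()


if __name__ == "__main__":
    sample = records[50]  # a record that is not part of the dictionary
    plain = compress_plain(sample)
    trained = compress_trained(sample, zdict)
    assert decompress_trained(trained, zdict) == sample
    print(f"original: {len(sample)} bytes")
    print(f"plain zlib: {len(plain)} bytes")
    print(f"with preset dictionary: {len(trained)} bytes")
```

The gain depends entirely on how much the records resemble the dictionary contents, so the numbers from these artificial records say nothing about real data. The trade-off is that the dictionary becomes part of the storage format: if it is lost or changed, the records compressed against it can no longer be read.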