Re: [HACKERS] compression in LO and other fields
| From | Tom Lane |
|---|---|
| Subject | Re: [HACKERS] compression in LO and other fields |
| Date | |
| Msg-id | 26760.942419535@sss.pgh.pa.us |
| In reply to | Re: [HACKERS] compression in LO and other fields (wieck@debis.com (Jan Wieck)) |
| Responses | Re: [HACKERS] compression in LO and other fields |
| List | pgsql-hackers |
wieck@debis.com (Jan Wieck) writes:
> Tom Lane wrote:
>> It occurred to me last night that applying compression to individual
>> fields might not be the best approach. Certainly a "bytez" data type
>> is the easiest thing to fit into the existing system, but it's leaving
>> some space savings on the table. What about compressing the *whole*
>> data contents of a tuple on-disk, as a single entity? That should save
>> more space than field-by-field compression.

> But it requires decompression of every tuple into palloc()'d
> memory during heap access. AFAIK, the heap access routines
> currently return a pointer to the tuple inside the shm
> buffer. Don't know what its performance impact would be.

Good point, but the same will be needed when a tuple is split across
multiple blocks. I would expect that (given a reasonably fast
decompressor) there will be a net performance *gain* due to having less
disk I/O to do. Also, this won't be happening for "every" tuple, just
those exceeding a size threshold --- we'd be able to tune the threshold
value to trade off speed and space.

One thing that does occur to me is that we need to store the
uncompressed as well as the compressed data size, so that the working
space can be palloc'd before starting the decompression.

Also, in case it wasn't clear, I was envisioning leaving the tuple
header uncompressed, so that time quals etc can be checked before
decompressing the tuple data.

			regards, tom lane
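[Editorial note: for illustration only, here is a minimal C sketch of the scheme described above. All names, types, and the threshold value are hypothetical, not the actual PostgreSQL code: the tuple header stays uncompressed, the compressed data area records both the compressed and uncompressed sizes so a working buffer can be palloc'd up front, and only tuples above a tunable size threshold are compressed at all.]

```c
/*
 * Hypothetical sketch of the layout discussed above -- NOT the real
 * PostgreSQL on-disk format.  The tuple header is assumed to remain
 * uncompressed elsewhere, so visibility (time qual) checks never require
 * decompression; only the data area is handled here.
 */
#include <stdint.h>
#include <stdlib.h>

#define COMPRESS_THRESHOLD 256      /* hypothetical tunable size cutoff */

typedef struct CompressedTupleData
{
    uint32_t comp_len;              /* bytes actually stored on disk */
    uint32_t raw_len;               /* bytes after decompression */
    char     data[1];               /* compressed payload follows */
} CompressedTupleData;

/* stand-ins for palloc() and whatever decompressor is chosen */
extern void *palloc(size_t size);
extern void  decompress(const char *src, size_t srclen,
                        char *dst, size_t dstlen);

/*
 * Return a pointer to the uncompressed tuple data.  Tuples below the
 * threshold are stored verbatim and returned as-is (a pointer into the
 * shared buffer); larger ones are expanded into freshly palloc'd memory,
 * whose size is known before decompression starts because raw_len is
 * stored alongside comp_len.
 */
static char *
tuple_data_get(CompressedTupleData *cd, int is_compressed)
{
    char *result;

    if (!is_compressed)
        return cd->data;            /* small tuple: stored uncompressed */

    result = palloc(cd->raw_len);
    decompress(cd->data, cd->comp_len, result, cd->raw_len);
    return result;
}
```

Keeping the header uncompressed means a tuple that fails its time quals never pays the decompression cost, which is part of why the per-tuple overhead only matters for rows that are actually returned.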
In the pgsql-hackers list, by date of posting: