Re: Significantly larger toast tables on 8.4?
From: Gregory Maxwell
Subject: Re: Significantly larger toast tables on 8.4?
Date:
Msg-id: e692861c0901070644y6f55f441gb39397ab4aca736b@mail.gmail.com
In reply to: Re: Significantly larger toast tables on 8.4? (Martijn van Oosterhout <kleptog@svana.org>)
List: pgsql-hackers
On Fri, Jan 2, 2009 at 5:48 PM, Martijn van Oosterhout <kleptog@svana.org> wrote:
> So you compromise. You split the data into say 1MB blobs and compress
> each individually. Then if someone does a substring at offset 3MB you
> can find it quickly. This barely costs you anything in the compression
> ratio mostly.
>
> Implementation though, that's harder. The size of the blobs is tunable
> also. I imagine the optimal value will probably be around 100KB. (12
> blocks uncompressed).

Or have the database do that internally:

With the available fast compression algorithms (zlib, lzo, lzf, etc.) the diminishing return from larger compression block sizes kicks in rather quickly. Other algorithms like LZMA or BZIP gain more from bigger block sizes, but I expect all of them are too slow to ever consider using in PostgreSQL. So I expect that the compression loss from compressing in chunks of 64 kbytes would be minimal.

The database could then store a list of offsets for the 64 kbyte chunks at the beginning of the field, or something like that. A short substring would then require decompressing just one or two chunks, far less overhead than decompressing everything.

It would probably be worthwhile to graph compression ratio vs. block size for some reasonable input. I'd offer to do it, but I doubt I have a reasonable test set for this.
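As a rough illustration of the idea (not PostgreSQL code, and not how the toast machinery actually stores anything), here is a minimal sketch of chunk-wise compression with an offset index, using Python's zlib as a stand-in for the database's compressor and 64 KB chunks; the function names, the synthetic test payload, and the ratio-vs-chunk-size loop at the end are all assumptions for illustration only:

import zlib

CHUNK_SIZE = 64 * 1024  # uncompressed bytes per chunk (assumed value)

def compress_chunked(data: bytes):
    """Compress data in fixed-size chunks; return (offsets, blob).

    offsets[i] is the byte position of compressed chunk i within blob,
    playing the role of the offset list stored at the start of the field.
    """
    offsets, parts, pos = [], [], 0
    for start in range(0, len(data), CHUNK_SIZE):
        comp = zlib.compress(data[start:start + CHUNK_SIZE])
        offsets.append(pos)
        parts.append(comp)
        pos += len(comp)
    return offsets, b"".join(parts)

def substring_chunked(offsets, blob, start, length):
    """Return data[start:start+length] by decompressing only the chunks
    that overlap the requested range."""
    first = start // CHUNK_SIZE
    last = (start + length - 1) // CHUNK_SIZE
    out = []
    for i in range(first, last + 1):
        lo = offsets[i]
        hi = offsets[i + 1] if i + 1 < len(offsets) else len(blob)
        out.append(zlib.decompress(blob[lo:hi]))
    joined = b"".join(out)
    skip = start - first * CHUNK_SIZE
    return joined[skip:skip + length]

if __name__ == "__main__":
    data = b"some moderately repetitive payload " * 200000  # ~7 MB, synthetic
    offsets, blob = compress_chunked(data)
    # A substring at a ~3 MB offset touches only one or two chunks.
    assert substring_chunked(offsets, blob, 3_000_000, 100) == data[3_000_000:3_000_100]

    # Rough look at compression ratio vs. chunk size, as suggested above;
    # a synthetic payload like this is not a representative test set.
    for size in (8 * 1024, 64 * 1024, 1024 * 1024):
        comp = sum(len(zlib.compress(data[i:i + size]))
                   for i in range(0, len(data), size))
        print(f"chunk {size // 1024:>5} KB: ratio {comp / len(data):.3f}")

A real implementation would of course keep the offset list in the stored datum itself and use the server's own compressor, but the access pattern is the same: seek into the offset list, decompress the overlapping chunk or two, and skip the rest.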