Re: Compressed binary field

From: Jeff Janes
Subject: Re: Compressed binary field
Msg-id: CAMkU=1yENbW+cjA-L8Najitv=E-7Bqa4re1Uuamujcgd+OTpNg@mail.gmail.com
In reply to: Re: Compressed binary field  (Edson Richter <edsonrichter@hotmail.com>)
Responses: Re: Compressed binary field
List: pgsql-general
On Tue, Sep 11, 2012 at 9:34 AM, Edson Richter <edsonrichter@hotmail.com> wrote:
>
> No, there is no problem. Just trying to reduce database size by forcing
> these fields to compress.
> Actual database size = 8Gb
> Backup size = 1.6Gb (5x smaller)
>
> Seems to me (IMHO) that there is room for improvement in database storage
> (we don't have many indexes, and the biggest tables are just the ones with
> bytea fields). That's why I've asked for expert counsel.

There are two things to keep in mind.  One is that each datum is
compressed separately, so redundancy that occurs between fields of
different tuples, but not within any given tuple, is invisible to
TOAST compression, yet fully available to the compression of a dump
file.
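
A quick way to see the per-datum behavior is to compare each value's
raw length with the size it actually occupies on disk.  A minimal
sketch (the table "docs" and bytea column "payload" here are
hypothetical stand-ins for yours):

    -- pg_column_size() reports the stored (possibly TOAST-compressed)
    -- size of each datum; octet_length() reports its raw length.
    SELECT octet_length(payload)   AS raw_bytes,
           pg_column_size(payload) AS stored_bytes
    FROM   docs
    LIMIT  10;

If stored_bytes stays close to raw_bytes, TOAST is getting little or
no traction on your data.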

Another thing is that PG's TOAST compression was designed to be
simple, fast, and patent-free, and often it is not all that good.  It
is quite good if you have long stretches of repeats of a single
character, or exact, densely spaced repeats of a sequence of
characters ("123123123123123..."), but when the redundancy is less
simple it does a much worse job than, for example, gzip does.
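
A minimal illustration of that difference, using a throwaway temp
table (exact numbers will vary by version and platform):

    -- Simple repeats shrink dramatically under pglz; low-redundancy
    -- data (concatenated md5 hex) compresses poorly, if at all.
    CREATE TEMP TABLE t (v text);
    INSERT INTO t VALUES (repeat('a', 100000));
    INSERT INTO t VALUES (repeat('123', 33333));
    INSERT INTO t
      SELECT string_agg(md5(i::text), '') FROM generate_series(1, 3125) i;
    SELECT octet_length(v) AS raw_bytes,
           pg_column_size(v) AS stored_bytes
    FROM t;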

It is possible, though unlikely, that there is a bug somewhere; most
likely your documents just aren't very compressible with
pglz_compress.
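
If you want to confirm where the space is actually going, you could
compare the main heap against the size including the TOAST relation
(again assuming a hypothetical table "docs"):

    -- pg_table_size() includes the TOAST relation and auxiliary forks;
    -- pg_relation_size() is the main heap alone, so the gap is roughly
    -- the TOASTed data.
    SELECT pg_size_pretty(pg_relation_size('docs'))       AS heap_only,
           pg_size_pretty(pg_table_size('docs'))          AS heap_plus_toast,
           pg_size_pretty(pg_total_relation_size('docs')) AS with_indexes;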

Cheers,

Jeff

