Thread: pg_basebackup compression TODO item


pg_basebackup compression TODO item

From
Jeff Janes
Date:
Since SSL compression seems to be a busted flush, I would like to see
pg_basebackup be able to do compression on the server end, not just
the client end, in order to spare network bandwidth.

Any comments on how hard this would be, or why we don't want it?

Cheers,

Jeff



Re: pg_basebackup compression TODO item

From
Magnus Hagander
Date:
On Thu, Mar 3, 2016 at 6:23 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Since SSL compression seems to be a busted flush, I would like to see
> pg_basebackup be able to do compression on the server end, not just
> the client end, in order to spare network bandwidth.
>
> Any comments on how hard this would be, or why we don't want it?

I think we want it at protocol level rather than pg_basebackup level. If SSL compression is busted on base backups, it's equally busted on regular connection and replication streams. People do ask for compression on that (in particular I've had a lot of requests when it comes to replication), and our response there has traditionally been "ssl compression"... 


Re: pg_basebackup compression TODO item

From
Andres Freund
Date:
On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
> I think we want it at protocol level rather than pg_basebackup level.

I think we may want both eventually, but I do agree that protocol level
has a lot higher "priority" than that. Something like protocol level
compression has a bit of different tradeoffs than compressing base
backups, and it's nice not to compress, uncompress, compress again.


> If SSL compression is busted on base backups, it's equally busted on
> regular connection and replication streams. People do ask for
> compression on that (in particular I've had a lot of requests when it
> comes to replication), and our response there has traditionally been
> "ssl compression"...

Agreed. I think our answer there was always a bit of a cop out...


Andres



Re: pg_basebackup compression TODO item

From
"Joshua D. Drake"
Date:
On 03/03/2016 09:23 AM, Jeff Janes wrote:
> Since SSL compression seems to be a busted flush, I would like to see
> pg_basebackup be able to do compression on the server end, not just
> the client end, in order to spare network bandwidth.
>
> Any comments on how hard this would be, or why we don't want it?

We want it and we want it over multiple workers.

JD



-- 
Command Prompt, Inc.                  http://the.postgres.company/                        +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.



Re: pg_basebackup compression TODO item

From
Stephen Frost
Date:
* Andres Freund (andres@anarazel.de) wrote:
> On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
> > I think we want it at protocol level rather than pg_basebackup level.
>
> I think we may want both eventually, but I do agree that protocol level
> has a lot higher "priority" than that. Something like protocol level
> compression has a bit of different tradeoffs than compressing base
> backups, and it's nice not to compress, uncompress, compress again.

+1, the whole compress-uncompress-compress thing was why I was trying to
add support to COPY to do zlib compression, which could have then been
used to compress server-side and then just write the results out to a
file for -Fc/-Fd style dumps.  We ended up implementing the 'PROGRAM'
thing for COPY, which is nice, but isn't the same.
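
To make the compress-uncompress-compress point concrete, here is a minimal Python sketch, not PostgreSQL code. It assumes zlib on both paths and a synthetic tab-separated payload; `server_compress`, `store_passthrough`, and `store_via_transport` are hypothetical names used only for illustration.

```python
import zlib

def server_compress(payload: bytes) -> bytes:
    """Hypothetical server side: compress the payload once before sending."""
    return zlib.compress(payload, 6)

def store_passthrough(wire_bytes: bytes) -> bytes:
    """Client keeps the already-compressed bytes as they arrived."""
    return wire_bytes

def store_via_transport(wire_bytes: bytes) -> bytes:
    """Transport-level compression only: the stream is inflated on arrival,
    and the client deflates it again before writing the archive."""
    plain = zlib.decompress(wire_bytes)   # uncompress at the transport layer
    return zlib.compress(plain, 6)        # compress again for storage

# Synthetic rows standing in for COPY output (an assumption, not real data).
payload = b"".join(b"%d\tcustomer-%d\tactive\n" % (i, i % 50) for i in range(20000))

wire = server_compress(payload)
direct = store_passthrough(wire)          # one compression pass in total
recompressed = store_via_transport(wire)  # compress, uncompress, compress
```

Both paths end with an archive that inflates to the same bytes; the second one simply spends two extra passes over the data to get there.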

> > If SSL compression is busted on base backups, it's equally busted on
> > regular connection and replication streams. People do ask for
> > compression on that (in particular I've had a lot of requests when it
> > comes to replication), and our response there has traditionally been
> > "ssl compression"...
>
> Agreed. I think our answer there was always a bit of a cop out...

Agreed on this also.

Thanks!

Stephen

Re: pg_basebackup compression TODO item

From
Magnus Hagander
Date:
On Thu, Mar 3, 2016 at 6:34 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
> > I think we want it at protocol level rather than pg_basebackup level.
>
> I think we may want both eventually, but I do agree that protocol level
> has a lot higher "priority" than that. Something like protocol level
> compression has a bit of different tradeoffs than compressing base
> backups, and it's nice not to compress, uncompress, compress again.


Yeah, good point, we definitely want both. Based on the field experience I've had (which might differ from others'), having it at the protocol level would help more people though, so it should be higher priority.


Re: pg_basebackup compression TODO item

From
"Joshua D. Drake"
Date:
On 03/03/2016 09:34 AM, Andres Freund wrote:
> On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
>> I think we want it at protocol level rather than pg_basebackup level.
>
> I think we may want both eventually, but I do agree that protocol level
> has a lot higher "priority" than that. Something like protocol level
> compression has a bit of different tradeoffs than compressing base
> backups, and it's nice not to compress, uncompress, compress again.

Agreed. This is something that we have neglected and argued against for 
no good reason for over a decade. It is time to get it done.

Sincerely,

JD





Re: pg_basebackup compression TODO item

From
Andres Freund
Date:
On 2016-03-03 18:44:24 +0100, Magnus Hagander wrote:
> On Thu, Mar 3, 2016 at 6:34 PM, Andres Freund <andres@anarazel.de> wrote:
> 
> > On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
> > > I think we want it at protocol level rather than pg_basebackup level.
> >
> > I think we may want both eventually, but I do agree that protocol level
> > has a lot higher "priority" than that. Something like protocol level
> > compression has a bit of different tradeoffs than compressing base
> > backups, and it's nice not to compress, uncompress, compress again.

> Yeah, good point, we definitely want both. Based on the field experience
> I've had (which might differ from others), having it protocol level would
> help more people though, so should be higher prio.

Agreed. But then our priorities are not necessarily the implementers', and
I don't think there's strong enough architectural reasons to only accept
protocol level for now...

Andres



Re: pg_basebackup compression TODO item

From
Euler Taveira
Date:
On 03-03-2016 14:44, Magnus Hagander wrote:
> On Thu, Mar 3, 2016 at 6:34 PM, Andres Freund <andres@anarazel.de> wrote:
> 
>     On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
>     > I think we want it at protocol level rather than pg_basebackup level.
> 
>     I think we may want both eventually, but I do agree that protocol level
>     has a lot higher "priority" than that. Something like protocol level
>     compression has a bit of different tradeoffs than compressing base
>     backups, and it's nice not to compress, uncompress, compress again.
> 
> 
> 
> Yeah, good point, we definitely want both. Based on the field experience
> I've had (which might differ from others), having it protocol level
> would help more people though, so should be higher prio.
> 
Some time ago, I started a thread [1] to implement compression at
protocol level. The use cases are data loads over slow links and reduced
bandwidth consumption during replication.

At that time, there wasn't a consensus about which compression algorithm
to choose. After the WAL compression feature, I think we can do a POC
with LZ compression (which is already available in src/common).

I'll try to update the code and do some benchmarks.


[1] http://www.postgresql.org/message-id/4FD9698F.2090407@timbira.com


--
   Euler Taveira                   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consulting, Development, 24x7 Support and Training



Re: pg_basebackup compression TODO item

From
Benedikt Grundmann
Date:


On Sun, Mar 6, 2016 at 7:36 PM, Euler Taveira <euler@timbira.com.br> wrote:
> On 03-03-2016 14:44, Magnus Hagander wrote:
> > On Thu, Mar 3, 2016 at 6:34 PM, Andres Freund <andres@anarazel.de> wrote:
> >
> > > On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
> > > > I think we want it at protocol level rather than pg_basebackup level.
> > >
> > > I think we may want both eventually, but I do agree that protocol level
> > > has a lot higher "priority" than that. Something like protocol level
> > > compression has a bit of different tradeoffs than compressing base
> > > backups, and it's nice not to compress, uncompress, compress again.
> >
> > Yeah, good point, we definitely want both. Based on the field experience
> > I've had (which might differ from others), having it protocol level
> > would help more people though, so should be higher prio.
> >
> Some time ago, I started a thread [1] to implement compression at
> protocol level. The use cases are data loads over slow links and reduced
> bandwidth consumption during replication.
>
> At that time, there wasn't a consensus about which compression algorithm
> to choose. After the WAL compression feature, I think we can do a POC
> with LZ compression (which is already available in src/common).
>
> I'll try to update the code and do some benchmarks.


+1 to protocol-level compression. In our case, the primary reason we use third-party magic networking appliances as a middleman between our offices is to compress Postgres network traffic (which is very compressible; a reduction of more than 95% is normal). The presence of those devices introduces all kinds of weird additional error cases and administrative overhead (plus, of course, cost). So I would personally consider protocol-level compression to be a bigger killer feature than any other feature that has made it into Postgres since the 9.2 release. But of course YMMV ;-)
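
For anyone who wants a rough feel for how compressible a row-oriented stream can be, a small Python sketch along these lines may help. It is only an illustration under stated assumptions: the rows are synthetic and highly repetitive, zlib stands in for whatever algorithm a protocol-level implementation would pick, and `reduction` is a hypothetical helper. Real Postgres traffic will give different numbers.

```python
import zlib

def reduction(raw: bytes, level: int = 6) -> float:
    """Return the fraction of bytes saved by zlib at the given level."""
    compressed = zlib.compress(raw, level)
    return 1.0 - len(compressed) / len(raw)

# Synthetic tab-separated rows (an assumption, not captured traffic).
rows = b"".join(
    b"%d\t2016-03-06\tOPEN\tUSD\t100.00\n" % i for i in range(50000)
)

saving = reduction(rows)
print(f"bytes saved on the synthetic stream: {saving:.1%}")
```

Running the same helper over a captured sample of your own traffic would give a number that can be compared with what the appliances report.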



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers