Discussion: pg_basebackup compression TODO item
Since SSL compression seems to be a busted flush, I would like to see pg_basebackup be able to do compression on the server end, not just the client end, in order to spare network bandwidth. Any comments on how hard this would be, or why we don't want it? Cheers, Jeff
On Thu, Mar 3, 2016 at 6:23 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> Since SSL compression seems to be a busted flush, I would like to see
> pg_basebackup be able to do compression on the server end, not just
> the client end, in order to spare network bandwidth.
> Any comments on how hard this would be, or why we don't want it?
I think we want it at protocol level rather than pg_basebackup level. If SSL compression is busted on base backups, it's equally busted on regular connection and replication streams. People do ask for compression on that (in particular I've had a lot of requests when it comes to replication), and our response there has traditionally been "ssl compression"...
On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
> I think we want it at protocol level rather than pg_basebackup level.
I think we may want both eventually, but I do agree that protocol level has a lot higher "priority" than that. Something like protocol level compression has a bit of different tradeoffs than compressing base backups, and it's nice not to compress, uncompress, compress again.
> If SSL compression is busted on base backups, it's equally busted on
> regular connection and replication streams. People do ask for
> compression on that (in particular I've had a lot of requests when it
> comes to replication), and our response there has traditionally been
> "ssl compression"...
Agreed. I think our answer there was always a bit of a cop out...
Andres
On 03/03/2016 09:23 AM, Jeff Janes wrote:
> Since SSL compression seems to be a busted flush, I would like to see
> pg_basebackup be able to do compression on the server end, not just
> the client end, in order to spare network bandwidth.
>
> Any comments on how hard this would be, or why we don't want it?
We want it and we want it over multiple workers.
JD
>
> Cheers,
>
> Jeff
--
Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
* Andres Freund (andres@anarazel.de) wrote:
> On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
> > I think we want it at protocol level rather than pg_basebackup level.
>
> I think we may want both eventually, but I do agree that protocol level
> has a lot higher "priority" than that. Something like protocol level
> compression has a bit of different tradeoffs than compressing base
> backups, and it's nice not to compress, uncompress, compress again.
+1, the whole compress-uncompress-compress thing was why I was trying to add support to COPY to do zlib compression, which could have then been used to compress server-side and then just write the results out to a file for -Fc/-Fd style dumps. We ended up implementing the 'PROGRAM' thing for COPY, which is nice, but isn't the same.
> > If SSL compression is busted on base backups, it's equally busted on
> > regular connection and replication streams. People do ask for
> > compression on that (in particular I've had a lot of requests when it
> > comes to replication), and our response there has traditionally been
> > "ssl compression"...
>
> Agreed. I think our answer there was always a bit of a cop out...
Agreed on this also.
Thanks!
Stephen
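To make the "compress, uncompress, compress again" concern concrete, here is a minimal Python sketch. It is not PostgreSQL code: both function names are hypothetical, and zlib simply stands in for whatever algorithm the transport and the backup tool would use. It counts the compression passes needed when the wire and the stored file are compressed independently, versus compressing once at the source and writing the bytes straight to disk.

```python
import zlib
from typing import Tuple


def backup_with_wire_compression(data: bytes, level: int = 6) -> Tuple[bytes, int]:
    """Hypothetical path: transport compression plus client-side file compression."""
    passes = 0
    wire = zlib.compress(data, level)        # sender compresses for the network
    passes += 1
    received = zlib.decompress(wire)         # receiver undoes the transport compression
    passes += 1
    stored = zlib.compress(received, level)  # client compresses again for the file
    passes += 1
    return stored, passes


def backup_compressed_at_source(data: bytes, level: int = 6) -> Tuple[bytes, int]:
    """Hypothetical path: compress once at the source, store the bytes as received."""
    stored = zlib.compress(data, level)
    return stored, 1


# Synthetic sample data, chosen only for illustration.
sample = b"INSERT INTO t VALUES (1, 'abc');\n" * 10_000

stored_a, passes_a = backup_with_wire_compression(sample)
stored_b, passes_b = backup_compressed_at_source(sample)

print(f"wire + file compression: {passes_a} passes, {len(stored_a)} bytes stored")
print(f"compressed at source:    {passes_b} pass,   {len(stored_b)} bytes stored")
```

Both paths end with the same recoverable data; the first one just spends two extra passes of CPU to get there, which is the cost the replies above want to avoid.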
On Thu, Mar 3, 2016 at 6:34 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
> > I think we want it at protocol level rather than pg_basebackup level.
> I think we may want both eventually, but I do agree that protocol level
> has a lot higher "priority" than that. Something like protocol level
> compression has a bit of different tradeoffs than compressing base
> backups, and it's nice not to compress, uncompress, compress again.
Yeah, good point, we definitely want both. Based on the field experience I've had (which might differ from others), having it protocol level would help more people though, so should be higher prio.
On 03/03/2016 09:34 AM, Andres Freund wrote:
> On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
>> I think we want it at protocol level rather than pg_basebackup level.
>
> I think we may want both eventually, but I do agree that protocol level
> has a lot higher "priority" than that. Something like protocol level
> compression has a bit of different tradeoffs than compressing base
> backups, and it's nice not to compress, uncompress, compress again.
Agreed. This is something that we have neglected and argued against for no good reason for over a decade. It is time to get it done.
Sincerely,
JD
On 2016-03-03 18:44:24 +0100, Magnus Hagander wrote:
> On Thu, Mar 3, 2016 at 6:34 PM, Andres Freund <andres@anarazel.de> wrote:
> > On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
> > > I think we want it at protocol level rather than pg_basebackup level.
> >
> > I think we may want both eventually, but I do agree that protocol level
> > has a lot higher "priority" than that. Something like protocol level
> > compression has a bit of different tradeoffs than compressing base
> > backups, and it's nice not to compress, uncompress, compress again.
> Yeah, good point, we definitely want both. Based on the field experience
> I've had (which might differ from others), having it protocol level would
> help more people though, so should be higher prio.
Agreed. But then our priorities are not necessarily the implementers', and I don't think there are strong enough architectural reasons to only accept protocol level for now...
Andres
On 03-03-2016 14:44, Magnus Hagander wrote:
> On Thu, Mar 3, 2016 at 6:34 PM, Andres Freund <andres@anarazel.de> wrote:
> > On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
> > > I think we want it at protocol level rather than pg_basebackup level.
> >
> > I think we may want both eventually, but I do agree that protocol level
> > has a lot higher "priority" than that. Something like protocol level
> > compression has a bit of different tradeoffs than compressing base
> > backups, and it's nice not to compress, uncompress, compress again.
>
> Yeah, good point, we definitely want both. Based on the field experience
> I've had (which might differ from others), having it protocol level
> would help more people though, so should be higher prio.
Some time ago, I started a thread [1] to implement compression at protocol level. The use cases are data load over slow links and reducing bandwidth consumption during replication.
At that time, there wasn't a consensus about which compression algorithm to choose. After the WAL compression feature, I think we can do some POC with LZ compression (that is already available in common).
I'll try to update the code and do some benchmarks.
[1] http://www.postgresql.org/message-id/4FD9698F.2090407@timbira.com
--
Euler Taveira Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
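As a rough illustration of what compression at protocol level could look like, the sketch below keeps one long-lived deflate stream per connection and flushes it after every message so the peer can decode each message as it arrives. Everything in it is an assumption made for illustration: the class names are hypothetical, this is not the PostgreSQL wire protocol, and Python's zlib stands in for the LZ implementation mentioned above.

```python
import zlib


class CompressingSender:
    """Hypothetical sender that compresses every message on one shared stream."""

    def __init__(self, level: int = 6) -> None:
        self._deflater = zlib.compressobj(level)

    def send(self, message: bytes) -> bytes:
        # Z_SYNC_FLUSH pushes out everything buffered so far, so the peer can
        # decode this message immediately, while the compression history is
        # kept for later messages on the same connection.
        return self._deflater.compress(message) + self._deflater.flush(zlib.Z_SYNC_FLUSH)


class DecompressingReceiver:
    """Hypothetical receiver that mirrors CompressingSender."""

    def __init__(self) -> None:
        self._inflater = zlib.decompressobj()

    def receive(self, chunk: bytes) -> bytes:
        return self._inflater.decompress(chunk)


sender = CompressingSender()
receiver = DecompressingReceiver()

# Synthetic message, chosen only for illustration.
row = b"INSERT INTO accounts (id, owner, balance) VALUES (42, 'alice', 100.00);"

first_chunk = sender.send(row)
second_chunk = sender.send(row)  # same text again: can reference the earlier copy
decoded = [receiver.receive(first_chunk), receiver.receive(second_chunk)]

print(len(row), len(first_chunk), len(second_chunk))
```

Because the stream state survives between messages, repeated content in later messages costs far fewer bytes on the wire than the first occurrence, which is the property that makes this attractive for replication and bulk loads over slow links.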
On Sun, Mar 6, 2016 at 7:36 PM, Euler Taveira <euler@timbira.com.br> wrote:
> On 03-03-2016 14:44, Magnus Hagander wrote:
> > On Thu, Mar 3, 2016 at 6:34 PM, Andres Freund <andres@anarazel.de> wrote:
> > > On 2016-03-03 18:31:03 +0100, Magnus Hagander wrote:
> > > > I think we want it at protocol level rather than pg_basebackup level.
> > >
> > > I think we may want both eventually, but I do agree that protocol level
> > > has a lot higher "priority" than that. Something like protocol level
> > > compression has a bit of different tradeoffs than compressing base
> > > backups, and it's nice not to compress, uncompress, compress again.
> >
> > Yeah, good point, we definitely want both. Based on the field experience
> > I've had (which might differ from others), having it protocol level
> > would help more people though, so should be higher prio.
>
> Some time ago, I started a thread [1] to implement compression at
> protocol level. The use cases are data load over slow links and reducing
> bandwidth consumption during replication.
> At that time, there wasn't a consensus about which compression algorithm
> to choose. After the WAL compression feature, I think we can do some POC
> with LZ compression (that is already available in common).
> I'll try to update the code and do some benchmarks.
+1 to protocol level compression. In our case the primary reason we use third-party magic networking appliances as a middle man between our offices is to compress postgres network traffic (which is very compressible; a > 95% reduction is normal). And the presence of those devices introduces all kinds of weird additional error cases and administrative overhead (+ of course cost). So I would personally consider protocol level compression to be a bigger killer feature than any other feature that has made it into postgres since the 9.2 release. But of course YMMV ;-)
> [1] http://www.postgresql.org/message-id/4FD9698F.2090407@timbira.com
> --
> Euler Taveira Timbira - http://www.timbira.com.br/
> PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
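For anyone who wants to check how compressible their own traffic is before relying on a figure like the > 95% reduction quoted in the last reply, a small Python sketch follows. The helper name `reduction` and the synthetic rows are assumptions made for illustration; real traffic will give different numbers, so capture a sample of your own data and feed it in.

```python
import zlib


def reduction(payload: bytes, level: int = 6) -> float:
    """Fraction of bytes saved by zlib at the given level (0.0 means no saving)."""
    if not payload:
        raise ValueError("payload must not be empty")
    return 1.0 - len(zlib.compress(payload, level)) / len(payload)


# Synthetic, row-like text standing in for query results or COPY data.
rows = "".join(
    f"{i},user{i % 50},active,2016-03-06\n" for i in range(100_000)
).encode()

row_reduction = reduction(rows)
print(f"{len(rows)} bytes, {row_reduction:.1%} smaller after zlib level 6")
```

The result depends heavily on how repetitive the data is, so treat the printed figure as a property of the sample, not of PostgreSQL traffic in general.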