Re: Parallel pg_dump for 9.1
From | Stefan Kaltenbrunner
---|---
Subject | Re: Parallel pg_dump for 9.1
Date |
Msg-id | 4BB4A4AD.6080001@kaltenbrunner.cc
In reply to | Re: Parallel pg_dump for 9.1 (Jeff <threshar@torgo.978.org>)
List | pgsql-hackers
Jeff wrote:
>
> On Mar 30, 2010, at 8:15 AM, Stefan Kaltenbrunner wrote:
>
>> Peter Eisentraut wrote:
>>> On tis, 2010-03-30 at 08:39 +0200, Stefan Kaltenbrunner wrote:
>>>> on fast systems pg_dump is completely CPU bottlenecked
>>> Might be useful to profile why that is. I don't think pg_dump has
>>> historically been developed with CPU efficiency in mind.
>>
>> It's not pg_dump that is the problem - it is COPY that is the limit.
>> In my specific case the fact that a lot of the columns are bytea also
>> adds to the horrible CPU overhead (fixed in 9.0). Still, our bulk load
>> & unload performance is way slower on a per-core comparison than a lot
>> of other databases :(
>>
>
> Don't forget that the zlib compression used in -Fc (unless you use -Z0)
> takes a fair amount of CPU too.
> I did some tests and it turned out that -Z0 actually took longer than
> -Z1, simply because there was a lot more data to write out, so I became
> IO bound rather than CPU bound.
>
> There's a thing called pigz around that is a parallel gzip
> implementation - I wonder how much of it could be adapted to pg_dump's
> use, as compression does take a considerable amount of time (even at
> -Z1). The biggest problem I can immediately see is that it uses threads.

All my numbers are with -Z0, and it is the backend (COPY and/or index
creation) that is the limit. If you start using compression, you are
shifting the load to pg_dump.

Stefan
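
To make the -Z0 vs. -Z1 trade-off Jeff describes concrete, here is a minimal
standalone sketch (not part of pg_dump; the synthetic payload and the assumed
write rate of the output channel are illustrative guesses). It compresses the
same data at zlib levels 0, 1, and 6 - zlib being what pg_dump's custom format
uses - and reports CPU time spent versus the time needed just to write the
result out. When the output channel, not the CPU, is the bottleneck, level 1's
much smaller output can finish sooner overall despite the extra CPU burned.

    # compress_levels.py - rough illustration of why pg_dump -Z1 can beat -Z0
    # when the output channel, not the CPU, is the bottleneck.
    # The payload and the 100 MB/s write rate are assumptions, not measurements.
    import time
    import zlib

    # Fake dump payload: repetitive text that compresses well, loosely
    # resembling COPY output with a bytea column.
    payload = b"42\tsome text column\t\\\\x00ff00ff\n" * 1_000_000

    ASSUMED_WRITE_RATE = 100 * 1024 * 1024  # bytes/sec the disk or pipe can absorb

    for level in (0, 1, 6):
        start = time.process_time()
        compressed = zlib.compress(payload, level)   # level 0 = stored, no compression
        cpu = time.process_time() - start
        write_time = len(compressed) / ASSUMED_WRITE_RATE
        print(f"-Z{level}: {len(compressed) / 1e6:7.1f} MB out, "
              f"{cpu:5.2f}s CPU, ~{write_time:5.2f}s to write")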