Re: Assorted improvements in pg_dump
From | Hans Buschmann
---|---
Subject | Re: Assorted improvements in pg_dump
Date |
Msg-id | 7d7eb6128f40401d81b3b7a898b6b4de@W2012-02.nidsa.loc
In reply to | Assorted improvements in pg_dump (Tom Lane <tgl@sss.pgh.pa.us>)
Responses | Re: Assorted improvements in pg_dump
List | pgsql-hackers
Hello Tom!

I noticed you are improving pg_dump just now.

Some time ago I experimented with a customer database dump in parallel directory mode (-F directory -j 2-4) and noticed it took quite long to complete.

Further investigation showed that in this mode with multiple jobs the tables are processed in decreasing size order, which makes sense to avoid a long tail of one big table in one of the jobs prolonging the overall dump time.

Exactly one table took very long, but seemed to be of moderate size. The size determination, however, fails to consider the size of toast tables, and this table had a big associated toast table from bytea column(s). Even with an analyze at loading time there was no size information for the toast table in the catalog tables.

I thought of the following alternatives to ameliorate this:

1. Using the pg_table_size() function in the catalog query (see the sketch at the end of this mail)
   Pos: This reflects the correct size of every relation
   Neg: This goes out to disk and may have a huge impact on databases with very many tables

2. Teaching vacuum to set the toast-table size like it sets it on normal tables

3. Having a command/function for occasionally setting the (approximate) size of toast tables

I think with further work under way (not yet ready), pg_dump can really profit from parallel/non-compressing mode, especially considering the huge amount of bytea/blob/string data in many big customer scenarios.

Thoughts?

Hans Buschmann
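P.S. To illustrate, here is a rough, untested sketch of what the catalog query from alternative 1 could look like, plus a catalog-only variant that would only help once something keeps the toast table's relpages current (alternatives 2/3). The column names and exact shape are mine, not what pg_dump actually runs:

-- Alternative 1: pg_table_size() counts the main fork, FSM/VM and the
-- associated toast table, but has to touch the filesystem for every relation.
SELECT c.oid::regclass AS table_name,
       pg_table_size(c.oid) AS total_bytes
FROM pg_class c
WHERE c.relkind = 'r'
ORDER BY total_bytes DESC;

-- Catalog-only variant: add the toast table's relpages to the main table's
-- relpages.  Cheap, but only meaningful if vacuum (alternative 2) or a
-- dedicated command/function (alternative 3) actually maintains the toast
-- table's relpages.
SELECT c.oid::regclass AS table_name,
       (c.relpages + COALESCE(t.relpages, 0))::bigint
         * current_setting('block_size')::bigint AS approx_bytes
FROM pg_class c
LEFT JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relkind = 'r'
ORDER BY approx_bytes DESC;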