pg_dump with 1100 schemas being a bit slow

Поиск
Список
Период
Сортировка
От Loic d'Anterroches
Тема pg_dump with 1100 schemas being a bit slow
Дата
Msg-id 8e2f2cb20910070351l6150c49eh9c5a44c6f74aedc3@mail.gmail.com
обсуждение исходный текст
Ответы Re: pg_dump with 1100 schemas being a bit slow  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: pg_dump with 1100 schemas being a bit slow  ("Joshua D. Drake" <jd@commandprompt.com>)
Re: pg_dump with 1100 schemas being a bit slow  ("Joshua D. Drake" <jd@commandprompt.com>)
Список pgsql-general
Hello,

After a series of sessions to search the web for information, I am
asking the help of people having a bit more knowledge of the internals
of pg_dump to try to solve a performance problem I have. I am running
PostgreSQL version 8.3.8 both server and pg_dump,

The context is a farm hosting of a web application
(http://www.indefero.net) where each installation get its own schema.
Each table in the schema is a "real" table, not a view of the same
table in the public schema with "WHERE schema='currentschema'" clause.
This setup allows me to easily run the web application nearly
unmodified between the downloadable version and the "farm" version
(schemas rock!). Now you have the background.

Each night I am running:
pg_dump --blobs --schema=%s --no-acl -U postgres indefero | gzip >
/path/to/backups/%s/%s-%s.sql.gz
this for each installation, so 1100 times. Substitution strings are to
timestamp and get the right schema.

My problem is that the dump increased steadily with the number of
schemas (now about 20s from about 12s with 850 schemas) and pg_dump is
now ballooning at 120MB of memory usage when running the dump.

The thing is that my overage schema size is a bit more than 1MB, with
the schema just after installation being 850kB.
Max size: 2.8MB
Min size: 0.85MB
Avg: 1.0MB
Total size: ~1GB

To get the size I run the following command with sum off the "size":

SELECT relname, pg_total_relation_size(CAST(relname AS
TEXT)) AS size FROM pg_class AS pgc, pg_namespace AS pgn
     WHERE pg_table_is_visible(pgc.oid) IS TRUE AND relkind = 'r'
     AND pgc.relnamespace = pgn.oid
     AND pgn.nspname NOT IN ('information_schema', 'pg_catalog')';


I think that pg_dump, when looking at the objects to dump, also it is
limited to a given schema, is scanning the complete database in one
those calls:
http://doxygen.postgresql.org/pg__dump_8c-source.html#l00717

Is there an option: "I know what I am doing, do not look outside of
the schema" available which can help in my case? Because running
SELECT is snappy, vmstats shows no swapping and the complete system is
basically running very well with a load average below 1.

So, your help is very welcomed,
loïc

--
Loïc d'Anterroches - Céondo Ltd - http://www.ceondo.com

В списке pgsql-general по дате отправления:

Предыдущее
От: Karina Guardado
Дата:
Сообщение: problems with encoding
Следующее
От: mezgani ali
Дата:
Сообщение: current_query stat is idle