patch for parallel pg_dump
From | Joachim Wieland |
---|---|
Subject | patch for parallel pg_dump |
Date | |
Msg-id | CACw0+11NeCcdv2dp4OsEYRoG8rEhKuVui7A8HwbOkBWCSygPXQ@mail.gmail.com |
Responses |
Re: patch for parallel pg_dump
Re: patch for parallel pg_dump |
List | pgsql-hackers |
So this is the parallel pg_dump patch. It generalizes the existing parallel restore and allows parallel dumps for the directory archive format. The patch works on Windows and Unix.

In the first phase of a parallel pg_dump/pg_restore, the catalog backup/restore is done in a single process. It then forks off worker processes, which are connected to the master process by pipes (on Windows, the pg_pipe implementation is used). These pipes are only used for a few commands and status messages. The workers then process the items that the master assigns to them. In other words, the worker processes do not terminate after each item but stay alive until the end of the parallel part of the dump/restore. Once a worker finishes its current item and sends the status back to the master, it is assigned the next item, and so forth.

In parallel restore, the master closes its own connection to the database before forking off worker processes, just as it does now. In parallel dump, however, we need to hold the master's connection open so that we can detect deadlocks. The issue is that somebody could have requested an exclusive lock after the master has initially requested a shared lock on all tables. A worker that simply waited for its own shared lock would queue behind that exclusive request, which in turn waits for the master's lock, so the dump would never finish. Therefore, the worker process also requests a shared lock on the table with NOWAIT, and if this fails, we know that there is a conflicting lock in between and that we need to abort the dump.

Parallel pg_dump sorts the tables and indexes by size so that it can start with the largest items first.

The connections of the parallel dump use the synchronized snapshot feature. However, there is also an option, --no-synchronized-snapshots, which can be used to dump from an older PostgreSQL version.

I'm also attaching another use case for the parallel backup as a separate patch: a new archive format that I named "pg_backup_mirror". It is basically the parallel version of "pg_dump | psql", so it does a parallel dump and restore of a database from one host to another.
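
As a rough illustration of the scheduling described above (long-lived workers that receive the next item as soon as they report back, with the largest items handed out first), here is a minimal Python simulation. It is not code from the patch: the function names, the table names, and the use of item size as a stand-in for dump time are all assumptions made for the sketch, and the real implementation talks to workers over pipes rather than simulating them.

```python
import heapq


def schedule_largest_first(items, num_workers):
    """Simulate a master handing items to long-lived workers.

    items: dict mapping item name -> size (assumed proportional to dump time).
    Returns a list of (worker_id, item_name, start_time) in dispatch order.
    """
    # Largest items first; ties are broken by name so the result is deterministic.
    queue = sorted(items.items(), key=lambda kv: (-kv[1], kv[0]))

    # Min-heap of (time the worker becomes idle, worker_id); all idle at t=0.
    idle_at = [(0, w) for w in range(num_workers)]
    heapq.heapify(idle_at)

    dispatch = []
    for name, size in queue:
        # The worker that reports back first gets the next (largest remaining) item.
        t, w = heapq.heappop(idle_at)
        dispatch.append((w, name, t))
        heapq.heappush(idle_at, (t + size, w))
    return dispatch


def makespan(items, dispatch):
    """Time at which the last worker finishes its last item."""
    return max(start + items[name] for _, name, start in dispatch)


if __name__ == "__main__":
    # Hypothetical tables and relative sizes.
    tables = {"orders": 8, "events": 5, "users": 3, "audit": 2, "tags": 1}
    plan = schedule_largest_first(tables, num_workers=2)
    for worker, name, start in plan:
        print(f"worker {worker} starts {name} at t={start}")
    print("total time:", makespan(tables, plan))
```

Starting with the largest items keeps a big table from being picked up last, when the other workers would have nothing left to do and would sit idle until it finished.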
The pg_backup_mirror patch is fairly small, but it's still a bit rough and needs some more work and discussion. Depending on how quickly (or not) we get done with the review of the main patch, we can then include this one as well or postpone it.

Joachim