Re: parallelizing the archiver
From: Julien Rouhaud
Subject: Re: parallelizing the archiver
Date:
Msg-id: CAOBaU_ZFXHgZo=X6-vUscgKuwvXC1pTKeerDsZyEbbFbyjt0bg@mail.gmail.com
In reply to: Re: parallelizing the archiver (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: parallelizing the archiver, Re: parallelizing the archiver
List: pgsql-hackers
On Fri, Sep 10, 2021 at 9:13 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> To me, it seems way more beneficial to think about being able to
> invoke archive_command with many files at a time instead of just one.
> I think for most plausible archive commands that would be way more
> efficient than what you propose here. It's *possible* that if we had
> that, we'd still want this, but I'm not even convinced.

Those approaches don't really seem mutually exclusive, do they? In both cases you need to internally track the status of each WAL file and handle non-contiguous file sequences. In the case of parallel commands you only need the additional knowledge that some command is already working on a given file. Wouldn't it be even better to eventually be able to launch multiple batches of multiple files rather than a single batch? If we start with parallelism first, the whole ecosystem could immediately benefit from it as is.

To be able to handle multiple files in a single command, we would need some way to let the server know which files were successfully archived and which weren't, so it requires a different communication channel than the command's return code. But as I said, I'm not convinced that the archive_command approach is the best one for that. If I understand correctly, most backup solutions would prefer to have a daemon being launched and used as a queuing system. Wouldn't it be better to have a new archive_mode, e.g. "daemon", have postgres responsible for (re)starting it, and pass information through the daemon's stdin/stdout or something like that?
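Just to make that last idea concrete, here is a minimal sketch of what the daemon side of such a protocol could look like: the server would feed one WAL file name per line on stdin, and the daemon acknowledges each file individually on stdout, which gives the per-file status reporting that a single command return code cannot. The line format, the OK/FAIL replies and the destination directory are all illustrative assumptions, not any existing PostgreSQL interface.

#!/usr/bin/env python3
# Hypothetical archive daemon sketch: read WAL file names from stdin,
# copy each one to an archive directory, and report per-file status on
# stdout.  Everything about the protocol here is an assumption made up
# for illustration.
import shutil
import sys
from pathlib import Path

ARCHIVE_DIR = Path("/mnt/archive")      # hypothetical destination


def archive_one(wal_path: Path) -> bool:
    """Copy a single WAL segment to the archive; return True on success."""
    try:
        shutil.copy2(wal_path, ARCHIVE_DIR / wal_path.name)
        return True
    except OSError:
        return False


def main() -> None:
    for line in sys.stdin:
        name = line.strip()
        if not name:
            continue
        wal = Path(name)
        status = "OK" if archive_one(wal) else "FAIL"
        # Acknowledging each file separately lets the server track
        # non-contiguous successes, unlike one return code per batch.
        print(f"{status} {wal.name}", flush=True)


if __name__ == "__main__":
    main()

The point of the sketch is only the shape of the exchange: per-file acknowledgements over a long-lived channel, rather than one process invocation and one exit code per WAL segment.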