Re: BUG #14321: pg_basebackup --xlog-method=stream fails
From | Michael Paquier |
---|---|
Subject | Re: BUG #14321: pg_basebackup --xlog-method=stream fails |
Date | |
Msg-id | CAB7nPqR1bRAdE2SruRfpH39B5cO-sHO-_tqOrpTY0foJXvh-rw@mail.gmail.com |
In response to | BUG #14321: pg_basebackup --xlog-method=stream fails (Jürgen Strobel <juergen+postgresql@strobel.info>) |
Responses | Re: BUG #14321: pg_basebackup --xlog-method=stream fails |
List | pgsql-bugs |
On Sat, Sep 10, 2016 at 9:10 AM, Jürgen Strobel <juergen+postgresql@strobel.info> wrote:
> First, I do have another WAL archive (usually).
> But no I only see the first WAL segments up to the point when the problem
> occurs, then nothing more.
>
> The timeline as far as I can tell is:
>
> 1. pg_basebackup --xlog-method=stream starts and creates 2 connections for
> backup and WAL streaming.
> 2. The VM's crappy IO system hiccups and stalls the whole VM for a
> surprisingly long time.

I know that people can do fancy things here, believe me.

> 3. The server runs into wal_sender_timeout and closes the WAL streaming
> connection.
> 4. pg_basebackup prints the warning, and continues the filesystem copy, *but
> makes no effort to re-open the WAL streaming connection*. With ps I see
> zombie child of the pg_basebackup process, I assume that's the one doing the
> WAL streaming.
> 5. pg_basebackup finishes up with the second half of pg_xlog missing, and the
> DB fails to start.
>
> In contrast if the same problem occurs while running pg_receivexlog it waits
> for 5 seconds then reopens the connection. I think that pg_basebackup should
> show the same resilience.

You can blame your VM here to begin with :( Even with the default values of pg_basebackup --status-interval and wal_sender_timeout on the server there is enough margin to prevent things from getting killed, but if things get heavily constrained on I/O... Well, there is not much that any software could do...

Now I agree that there would be room for improvement to make pg_basebackup retry a stream instead of failing, and that may be something that people would be willing to have. But it's hard to think of improvements in this area as anything other than a new feature, rather than a bug fix.

Anyway, replication slots would not help here if you just rely on pg_basebackup to finish the job.
-- 
Michael
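
The reconnect behaviour described in the quoted report (wait five seconds, then reopen the stream) can be illustrated with a generic retry loop. This is only a sketch of the idea and not code from pg_basebackup or pg_receivexlog: `stream_with_retry`, `open_stream`, `retry_delay`, and `max_attempts` are hypothetical names, and the use of `ConnectionError` to signal a dropped stream is an assumption made for the example.

```python
import time
from typing import Callable


def stream_with_retry(
    open_stream: Callable[[], None],
    retry_delay: float = 5.0,   # assumed: mirrors the 5-second wait mentioned above
    max_attempts: int = 3,      # hypothetical cap so the loop always terminates
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call open_stream until it returns normally; return the attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            open_stream()       # hypothetical: streams until done or disconnected
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise           # give up and surface the last failure
            sleep(retry_delay)  # wait before reopening the connection
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the delay replaceable, so the loop can be exercised without actually waiting.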