Re: incorrect handling of the timeout in pg_receivexlog
От | Heikki Linnakangas |
---|---|
Тема | Re: incorrect handling of the timeout in pg_receivexlog |
Дата | |
Msg-id | 4F30EF67.4030405@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: incorrect handling of the timeout in pg_receivexlog (Fujii Masao <masao.fujii@gmail.com>) |
Ответы |
Re: incorrect handling of the timeout in pg_receivexlog
Re: incorrect handling of the timeout in pg_receivexlog |
Список | pgsql-hackers |
On 07.02.2012 09:03, Fujii Masao wrote: > On Tue, Feb 7, 2012 at 2:58 PM, Fujii Masao<masao.fujii@gmail.com> wrote: >> When I compiled HEAD with --disable-integer-datetimes and tested >> pg_receivexlog, I encountered unexpected replication timeout. As >> far as I read the pg_receivexlog code, the cause of this problem is >> that pg_receivexlog handles the standby message timeout incorrectly >> in --disable-integer-datetimes. The attached patch fixes this problem. >> Comments? > > receivelog.c > ------- > timeout.tv_sec = last_status + standby_message_timeout - now - 1; > if (timeout.tv_sec<= 0) > ------- > > Umm.. the above code also handles the timestamp incorrectly. ISTM that the > root cause of these problems is that receivelog.c uses TimestampTz. Yep. While localGetCurrentTimestamp() returns a TimestampTz and handles float timestamps correctly, the caller just assigns the result to a int64 variable, assuming --enable-integer-datetimes. > What about changing receivelog.c so that it uses time_t instead of > TimestampTz? Which would make the code simpler, I think. Hmm, that would reduce granularity to seconds. The --statusint option is given in seconds, but it would be good to have more precision in the calculations to avoid rounding errors. But actually, if the purpose of the --statusint option is to avoid disconnection because of exceeding the server's replication_timeout, one second granularity just isn't enough to be begin with. replication_timeout is given in milliseconds, so if you set replication_timeout=900ms in the server, there is no way to make pg_basebackup/pg_receivexlog to reply in time. So, --statusint needs to be in milliseconds. And while we're at it, how difficult would be to ask the server for the current value of replication_timeout, and set --statusint automatically based on that? Or perhaps mark replication_timeout as GUC_REPORT. It is rather fiddly that depending on a server setting, you need to pass an option in the client or it will just silently fail with no indication of what the problem is. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: