Re: Sync Rep: First Thoughts on Code

From Fujii Masao
Subject Re: Sync Rep: First Thoughts on Code
Date
Msg-id 3f0b79eb0812230253r51bd1195i5f7fc1df6b7d810@mail.gmail.com
In response to Re: Sync Rep: First Thoughts on Code  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Sync Rep: First Thoughts on Code  ("Pavan Deolasee" <pavan.deolasee@gmail.com>)
List pgsql-hackers
Hi,

On Tue, Dec 23, 2008 at 6:28 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Tue, 2008-12-23 at 18:00 +0900, Fujii Masao wrote:
>> > I don't get this argument. Why would we care what happens on the
>> > failed server?
>>
>> It's because, in the future, I'd like to use the data on the failed
>> server when making it catch up with new primary. This desire might be
>> violated by the inconsistency which I described.
>
> I don't really understand why you would put something in there that has
> no use at all. Why make every server in the world do extra
> synchronisation?
>
> Whatever you build in the future can include this, if that is still a
> required point at the time you add the new feature.

Right. But since it is difficult to change a specification once it is
fixed, I am thinking it through now with the future in mind.

But since I cannot reach consensus with the hackers, including you,
I will change course: in the asynchronous replication case, XLogFlush
will be forbidden from replicating xlog synchronously when it is
called from anywhere other than RecordTransactionCommit.

BTW, here are the callers other than RecordTransactionCommit:
- CreateCheckPoint()
- EndPrepare()
- FlushBuffer()
- RecordTransactionAbortPrepared()
- RecordTransactionCommitPrepared()
- RelationTruncate()
- SlruPhysicalWritePage()
- WriteTruncateXlogRec()
- XLogAsyncCommitFlush()
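
As a rough illustration of the rule above, here is a toy Python model, not
PostgreSQL code. The names ReplicationMode and sync_replication_forbidden
are hypothetical and exist only for this sketch. It encodes just the one
case described here (asynchronous mode, caller other than
RecordTransactionCommit) and makes no claim about what any other case does.

```python
from enum import Enum


class ReplicationMode(Enum):
    # Hypothetical names for the two replication cases discussed here.
    SYNC = "sync"
    ASYNC = "async"


COMMIT_CALLER = "RecordTransactionCommit"

# The other XLogFlush callers listed in this message.
OTHER_CALLERS = (
    "CreateCheckPoint",
    "EndPrepare",
    "FlushBuffer",
    "RecordTransactionAbortPrepared",
    "RecordTransactionCommitPrepared",
    "RelationTruncate",
    "SlruPhysicalWritePage",
    "WriteTruncateXlogRec",
    "XLogAsyncCommitFlush",
)


def sync_replication_forbidden(caller: str, mode: ReplicationMode) -> bool:
    """Return True when a flush from `caller` must not replicate synchronously.

    Only the rule stated in the message is modeled: asynchronous mode
    combined with a caller other than RecordTransactionCommit.
    """
    return mode is ReplicationMode.ASYNC and caller != COMMIT_CALLER


if __name__ == "__main__":
    for name in (COMMIT_CALLER,) + OTHER_CALLERS:
        print(name, sync_replication_forbidden(name, ReplicationMode.ASYNC))
```

In this model every caller in the list above is forbidden in asynchronous
mode, and RecordTransactionCommit is the only caller the rule does not touch.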

>
> Are you thinking about switchover rather than failover? I'm sure a
> graceful switchover doesn't need this.

Yes, switchover is one of the cases I care about. More typically, I
care about restarting the failed server (original primary) after failover:

-------------
1. a dirty buffer page is chosen as the victim for buffer replacement
2. flush xlog up to the buffer's LSN on the primary only
3. write out the dirty buffer page
4. primary fails   (xlog up to the buffer's LSN has not been replicated)

The above case produces an inconsistency between the data on the
original primary (failed server) and the xlog on the original standby
(new primary after failover). Isn't this right?

5. restart the failed server and make it catch up with the new primary

We cannot reuse the existing data on the failed server because
of that inconsistency. I think this restriction should be removed.
-------------
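
The scenario above can be sketched with a toy Python model, not PostgreSQL
code. All names here (Server, evict_dirty_page,
pages_ahead_of_new_primary) and the LSN values are made up for
illustration. LSNs are plain integers, and a page counts as inconsistent
when the LSN stamped on its on-disk copy is beyond the xlog the new
primary ever received.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Server:
    # Highest LSN of xlog that is durable on this server.
    wal_flushed_lsn: int = 0
    # page id -> LSN of the page image written to this server's disk.
    page_lsns_on_disk: Dict[int, int] = field(default_factory=dict)


def evict_dirty_page(primary: Server, standby: Server, page_id: int,
                     page_lsn: int, replicate_on_flush: bool) -> None:
    """Steps 1-3: pick a victim page, flush xlog up to its LSN, write it out."""
    # Step 2: xlog must be durable on the primary before the page is written.
    primary.wal_flushed_lsn = max(primary.wal_flushed_lsn, page_lsn)
    if replicate_on_flush:
        # Alternative behaviour: the flush also ships xlog to the standby.
        standby.wal_flushed_lsn = max(standby.wal_flushed_lsn, page_lsn)
    # Step 3: the dirty page reaches the primary's disk.
    primary.page_lsns_on_disk[page_id] = page_lsn


def pages_ahead_of_new_primary(failed: Server, new_primary: Server) -> List[int]:
    """After step 4: pages on the failed server newer than the new primary's xlog."""
    return sorted(page for page, lsn in failed.page_lsns_on_disk.items()
                  if lsn > new_primary.wal_flushed_lsn)


if __name__ == "__main__":
    primary, standby = Server(wal_flushed_lsn=100), Server(wal_flushed_lsn=100)
    evict_dirty_page(primary, standby, page_id=7, page_lsn=150,
                     replicate_on_flush=False)
    # Step 4: the primary fails here and the standby becomes the new primary.
    print(pages_ahead_of_new_primary(primary, standby))  # prints [7]
```

With replicate_on_flush=True the same run returns an empty list in this
model, which is the property that would let step 5 reuse the data on the
failed server.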

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

