Re: Inconsistent DB data in Streaming Replication
From | Samrat Revagade |
---|---|
Subject | Re: Inconsistent DB data in Streaming Replication |
Date | |
Msg-id | CAF8Q-GyF=vrm+WLHhCLtLtg0skb_LkZwFEWjhRvcG=iybFyzwg@mail.gmail.com |
In response to | Re: Inconsistent DB data in Streaming Replication (Hannu Krosing <hannu@2ndQuadrant.com>) |
Responses |
Re: Inconsistent DB data in Streaming Replication
(Samrat Revagade <revagade.samrat@gmail.com>)
|
List | pgsql-hackers |
>> it's one of the reasons why a fresh base backup is required when starting old master as new standby?
>> If yes, I agree with you. I've often heard the complaints about a backup when restarting new standby.
>> That's really big problem.

I think Fujii Masao is on the same page.

> In case of syncrep the master just waits for confirmation from standby before returning to client on commit.
> Not just commit, you must stop any *writing* of the wal records effectively killing any parallelism.
> Main issue is that it will make *all* backends dependant on each sync commit, essentially serialising all backends commits, with the serialisation *including* the latency of roundtrip to client. With current sync streaming the other backends can continue to write wal, with proposed approach you cannot write any records after the one waiting an ACK from standby.

Let me rephrase the proposal in a more accurate manner.

Consider the following scenario:

(1) A client sends the "COMMIT" command to the master server.
(2) The master writes the WAL record to disk.
(3) The master writes the data page related to this transaction, i.e. via checkpoint or bgwriter.
(4) The master sends WAL records continuously to the standby, up to the commit WAL record.
(5) The standby receives the WAL records, writes them to disk, and then replies with an ACK.
(6) The master returns a success indication to the client after it receives the ACK.

If failover happens between (3) and (4), the WAL and DB data on the old master are ahead of those on the new master. After failover, the new master continues running new transactions independently of the old master. The WAL records and DB data then become inconsistent between the two servers. To resolve these inconsistencies, a backup of the new master needs to be taken onto the new standby.

But taking a backup is not feasible for a large database of several TB over a slow WAN.

So, to avoid this type of inconsistency without taking a fresh backup, we are thinking of doing the following:

>> I think that you can introduce GUC specifying whether this extra check is required to avoid a backup when failback.

Approach:

Introduce a new GUC option specifying whether to prevent PostgreSQL from writing DB data before the corresponding WAL records have been replicated to the standby. That is, if this GUC option is enabled, PostgreSQL waits for the corresponding WAL records to be not only written to disk but also replicated to the standby before writing DB data.

So the process becomes as follows:

(1) A client sends the "COMMIT" command to the master server.
(2) The master writes the commit WAL record to disk.
(3) The master sends WAL records continuously to the standby, up to the commit WAL record.
(4) The standby receives the WAL records, writes them to disk, and then replies with an ACK.
(5) *The master then forces a write of the data page related to this transaction.*
(6) The master returns a success indication to the client after it receives the ACK.

While the master is waiting to force a write (point 5) for this data page, streaming replication continues. Also, other data page writes do not depend on this particular page write. So the writes of data pages are not serialized.

Regards,
Samrat
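
The ordering rule in the proposal can be illustrated with a small toy model. This is only a sketch under stated assumptions, not PostgreSQL code: the class `Master`, its methods, and the flag `gate_page_writes_on_standby` (standing in for the proposed GUC, which the thread does not name) are all hypothetical, and WAL is treated as flushed locally the moment it is generated.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    # Hypothetical stand-in for the proposed GUC; the thread does not name it.
    gate_page_writes_on_standby: bool = False
    flushed_lsn: int = 0     # last WAL position written to the master's disk
    replicated_lsn: int = 0  # last WAL position the standby has ACKed
    dirty_pages: dict = field(default_factory=dict)    # page id -> LSN of last change
    written_pages: dict = field(default_factory=dict)  # page id -> LSN now on disk

    def modify_page(self, page_id: str) -> int:
        """Generate one WAL record for the change and mark the page dirty."""
        self.flushed_lsn += 1  # simplification: WAL is flushed immediately
        self.dirty_pages[page_id] = self.flushed_lsn
        return self.flushed_lsn

    def standby_ack(self, lsn: int) -> None:
        """Record that the standby has written WAL up to `lsn`."""
        self.replicated_lsn = max(self.replicated_lsn, lsn)

    def can_write_page(self, page_id: str) -> bool:
        page_lsn = self.dirty_pages[page_id]
        # Ordinary WAL rule: the page's WAL must be on local disk first.
        if page_lsn > self.flushed_lsn:
            return False
        # Proposed extra check: the page's WAL must also be on the standby.
        if self.gate_page_writes_on_standby and page_lsn > self.replicated_lsn:
            return False
        return True

    def write_dirty_pages(self) -> list:
        """Write every page that is allowed out; each page is checked on its own."""
        writable = [p for p in self.dirty_pages if self.can_write_page(p)]
        for page_id in writable:
            self.written_pages[page_id] = self.dirty_pages.pop(page_id)
        return writable

    def pages_ahead_of_standby(self) -> list:
        """Pages on disk whose changes the standby has no WAL for."""
        return sorted(p for p, lsn in self.written_pages.items()
                      if lsn > self.replicated_lsn)


# Current behaviour: a page can reach disk before the standby has its WAL.
old = Master(gate_page_writes_on_standby=False)
old.modify_page("A")
old.write_dirty_pages()
print("ahead of standby without the check:", old.pages_ahead_of_standby())

# Proposed behaviour: page "A" is written once its WAL is ACKed; "B" still waits.
new = Master(gate_page_writes_on_standby=True)
lsn_a = new.modify_page("A")
new.modify_page("B")
new.standby_ack(lsn_a)
print("written with the check:", new.write_dirty_pages())
```

In this model the check is applied per page against its own LSN, which is the sense in which one page waiting for an ACK does not hold back pages whose WAL has already been replicated.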