Re: BUG #15346: Replica fails to start after the crash
From | Michael Paquier |
---|---|
Subject | Re: BUG #15346: Replica fails to start after the crash |
Date | |
Msg-id | 20180828024409.GB29157@paquier.xyz |
In response to | Re: BUG #15346: Replica fails to start after the crash (Alexander Kukushkin <cyberdemn@gmail.com>) |
Responses |
Re: BUG #15346: Replica fails to start after the crash
Re: BUG #15346: Replica fails to start after the crash Re: BUG #15346: Replica fails to start after the crash |
List | pgsql-bugs |
On Sat, Aug 25, 2018 at 09:54:39AM +0200, Alexander Kukushkin wrote:
> Why the number of tuples in the xlog is greater than the number of
> tuples on the index page?
> Because this page was already overwritten and its LSN is HIGHER than
> the current LSN!

That's annoying. Because that means that the control file of your
server maps to a consistent point which is older than some of the
relation pages. How was the base backup of this node created? Please
remember that when taking a base backup from a standby, you should
backup the control file last, as there is no control of end backup with
records available. So it seems to me that the origin of your problem
comes from an incorrect base backup expectation?

> Is there a way to recover from such a situation? Should the postgres
> in such case do comparison of LSNs and if the LSN on the page is
> higher than the current LSN simply return InvalidTransactionId?
> Apparently, if there are no connections open postgres simply is not
> running this code and it seems ok.

One idea I have would be to copy all the WAL segments up to the point
where the pages to-be-updated are, and let Postgres replay all the
local WALs first. However it is hard to say if that would be enough, as
you could have more references to pages even newer than the btree one
you just found.

--
Michael
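
For readers following the LSN comparison discussed in this message, here is a minimal sketch of how two LSNs in PostgreSQL's textual `XXXXXXXX/YYYYYYYY` form can be compared. It is not code from PostgreSQL or from this thread: the helper names `parse_lsn` and `page_is_ahead` and the sample values are hypothetical and chosen only for illustration.

```python
def parse_lsn(text: str) -> int:
    """Convert a textual LSN such as '1/FF000000' to a 64-bit integer.

    The part before the slash is the high 32 bits and the part after
    it is the low 32 bits, both written in hexadecimal.
    """
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def page_is_ahead(page_lsn: str, replay_lsn: str) -> bool:
    """Return True when the page carries an LSN beyond the replay position."""
    return parse_lsn(page_lsn) > parse_lsn(replay_lsn)


# Hypothetical values, for illustration only.
print(page_is_ahead("2/5A000028", "1/FF000000"))  # page is newer than the replay point
print(page_is_ahead("1/0A000000", "1/FF000000"))  # page is older than the replay point
```

A page for which this check is true is in the situation described above: it was written by a WAL record that replay has not reached yet, so the record currently being replayed may not match the page contents.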