BUG #13473: VACUUM FREEZE mistakenly cancel standby sessions
From | marco.nenciarini@2ndquadrant.it |
---|---|
Subject | BUG #13473: VACUUM FREEZE mistakenly cancel standby sessions |
Date | |
Msg-id | 20150626134310.3876.82768@wrigleys.postgresql.org |
Responses | Re: BUG #13473: VACUUM FREEZE mistakenly cancel standby sessions |
List | pgsql-bugs |
The following bug has been logged on the website:

Bug reference: 13473
Logged by: Marco Nenciarini
Email address: marco.nenciarini@2ndquadrant.it
PostgreSQL version: 9.4.4
Operating system: all
Description:

= Symptoms

Take a simple master -> standby setup with hot_standby_feedback activated. If a backend on the standby is holding the cluster xmin and the master runs a VACUUM FREEZE on the same database that the standby's backend is connected to, the vacuum generates a recovery conflict and the query running on the standby is canceled.

= How to reproduce it

Run the following operations on an idle cluster.

1) Connect to the standby and simulate a long-running query:

    select pg_sleep(3600);

2) Connect to the master and run the following script:

    create table t(id int primary key);
    insert into t select generate_series(1, 10000);
    vacuum freeze verbose t;
    drop table t;

3) After 30 seconds the pg_sleep query on the standby is canceled.

= Expected output

The hot standby feedback should have prevented the query cancellation.

= Analysis

I've run postgres at the DEBUG2 logging level, and I can confirm that the vacuum correctly sees the OldestXmin propagated by the standby through the hot standby feedback.

The issue is in the heap_xlog_freeze function, which calls ResolveRecoveryConflictWithSnapshot as its first step, passing the cutoff_xid value as the first argument. The cutoff_xid is the OldestXmin that was active when the vacuum ran, so it represents a running xid. However, ResolveRecoveryConflictWithSnapshot expects its first argument to be latestRemovedXid, which represents the highest xid that has actually been removed, so there is an off-by-one error.

I've been able to reproduce this issue on every version of postgres since 9.0 (9.0, 9.1, 9.2, 9.3, 9.4 and current master).

= Proposed solution

In heap_xlog_freeze we need to subtract one from the value of cutoff_xid before passing it to ResolveRecoveryConflictWithSnapshot.
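The off-by-one described in the analysis can be illustrated with a small standalone model. This is only a sketch, not PostgreSQL source: the function names (xid_retreat, xid_follows, snapshot_conflicts, latest_removed_from_cutoff) are hypothetical, and the conflict rule (a standby snapshot conflicts when its xmin does not follow latestRemovedXid) is an assumption about how the standby decides which sessions to cancel. The retreat helper is modeled on the TransactionIdRetreat macro so that stepping back from the first normal xid wraps around instead of landing on a special xid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standalone stand-ins for PostgreSQL's xid type and first normal xid. */
typedef uint32_t TransactionId;
#define FirstNormalTransactionId ((TransactionId) 3)

/*
 * Step back one xid, skipping the special xids 0..2 on wraparound
 * (modeled on the TransactionIdRetreat macro).
 */
TransactionId
xid_retreat(TransactionId xid)
{
    do
    {
        xid--;
    } while (xid < FirstNormalTransactionId);
    return xid;
}

/* Modulo-2^32 "a is newer than b" comparison for normal xids. */
bool
xid_follows(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) > 0;
}

/*
 * Assumed conflict rule: a standby snapshot conflicts when its xmin
 * is less than or equal to the latest removed xid.
 */
bool
snapshot_conflicts(TransactionId snapshot_xmin,
                   TransactionId latest_removed_xid)
{
    return !xid_follows(snapshot_xmin, latest_removed_xid);
}

/*
 * Proposed fix, sketched: cutoff_xid is the oldest xid still running,
 * so the newest xid that can have been removed is the one before it.
 */
TransactionId
latest_removed_from_cutoff(TransactionId cutoff_xid)
{
    assert(cutoff_xid >= FirstNormalTransactionId);
    return xid_retreat(cutoff_xid);
}
```

In this model, a standby snapshot whose xmin equals the cutoff_xid conflicts when cutoff_xid is passed unchanged, and does not conflict when latest_removed_from_cutoff(cutoff_xid) is passed instead, which matches the behavior reported above.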