Re: [HACKERS] Race condition in GetOldestActiveTransactionId()
От | Heikki Linnakangas |
---|---|
Тема | Re: [HACKERS] Race condition in GetOldestActiveTransactionId() |
Дата | |
Msg-id | e4c4e884-0c00-2f44-2a20-1b29a031349d@iki.fi обсуждение исходный текст |
Ответ на | Race condition in GetOldestActiveTransactionId() (Heikki Linnakangas <hlinnaka@iki.fi>) |
Список | pgsql-hackers |
On 08/22/2016 01:46 PM, Heikki Linnakangas wrote: > While hacking on the CSN patch, I spotted a race condition between > GetOldestActiveTransactionId() and GetNewTransactionId(). > GetOldestActiveTransactionId() calculates the oldest XID that's still > running, by doing: > > 1. Read nextXid, without a lock. This is the upper bound, if there are > no active XIDs in the proc array. > 2. Loop through the proc array, making note of the oldest XID. > > While GetNewTransactionId() does: > > 1. Read and Increment nextXid > 2. Set MyPgXact->xid. > > It seems possible that if you call GetNewTransactionId() concurrently > with GetOldestActiveTransactionId(), GetOldestActiveTransactionId() sees > the new nextXid value that the concurrent GetNewTransactionId() set, but > doesn't see the old XID in the proc array. It will return a value that > doesn't cover the old XID, i.e. it won't consider the just-assigned XID > as in-progress. > > Am I missing something? Commit c6d76d7c added a comment to > GetOldestActiveTransactionId() explaining why it's not necessary to > acquire XidGenLock there, but I think it missed the above race condition. > > GetOldestActiveTransactionId() is not performance-critical, it's only > called when performing a checkpoint, so I think we should just bite the > bullet and grab the lock. Per attached patch. I did some testing, and was able to indeed construct a case where oldestActiveXid was off-by-one in an online checkpoint record. However, looking at how it's used, I think it happened to not have any user-visible effect. The oldestActiveXid value of an online checkpoint is only used to determine the point where pg_subtrans is initialized, and the XID missed in the computation could not be a subtransaction. Nevertheless, it's clearly a bug and the fix is simple, so I committed a fix. - Heikki
В списке pgsql-hackers по дате отправления: