Duplicate-key-detection failure case found in btree
From | Tom Lane
Subject | Duplicate-key-detection failure case found in btree
Date |
Msg-id | 29336.1009914384@sss.pgh.pa.us
Responses | Re: Duplicate-key-detection failure case found in btree
List | pgsql-hackers
I have found a case in nbtree where it is possible for two concurrent transactions to insert the same key value in a unique index :-(

The reason is that _bt_check_unique can (of course) only detect keys that are already in the index. To prevent concurrent insertion of duplicate keys, we have to rely on locking: two would-be inserters of the same key must lock the same target page in _bt_doinsert, so they cannot run the check at the same time. The second to obtain the lock will see the first one's already-inserted key.

However, suppose that the index contains many existing instances of the same key (presumably pointing at deleted-but-not-yet-vacuumed tuples). If the existing instances span several index pages, it is likely that _bt_insertonpg will decide to insert the new key on one of the pages to the right of the original target page. As presently coded, _bt_insertonpg drops the write lock it holds before acquiring the write lock on the next page to the right. This creates a window wherein another process could perform _bt_check_unique, fail to find any non-dead instances of the key, and decide that it's okay to insert the same key.

The fix is to acquire the next page's write lock before we drop the current page's. This is deadlock-free since no writer ever tries to chain write-locks to the left (in fact, the same thing is done in the page-split logic).

I am not convinced that this explains any of the field reports of duplicate rows that we've seen, but it's clearly a bug. Will commit the fix shortly.

regards, tom lane
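Below is a minimal, self-contained sketch of the two locking orders described above. It is not the actual nbtree code: the Page struct, the move_right_* helpers, and the use of POSIX rwlocks are illustrative stand-ins for Postgres's buffer locks.

#include <pthread.h>
#include <stdio.h>

typedef struct Page
{
    pthread_rwlock_t lock;      /* stand-in for the per-page write lock */
    struct Page     *right;     /* right sibling, or NULL */
    int              pageno;    /* stand-in for real page contents */
} Page;

/*
 * Buggy ordering: release the current page before locking its right
 * sibling.  Between the unlock and the wrlock, a concurrent inserter
 * can run its uniqueness check across both pages, find no live
 * duplicate, and decide it is safe to insert the same key.
 */
static Page *
move_right_buggy(Page *cur)
{
    Page *next = cur->right;
    pthread_rwlock_unlock(&cur->lock);    /* window opens here */
    pthread_rwlock_wrlock(&next->lock);   /* too late */
    return next;
}

/*
 * Fixed ordering ("lock coupling"): acquire the right sibling's lock
 * before releasing the current page's, so at every instant at least
 * one page in the chain is write-locked against concurrent checkers.
 * Deadlock-free because writers chain locks only left-to-right.
 */
static Page *
move_right_fixed(Page *cur)
{
    Page *next = cur->right;
    pthread_rwlock_wrlock(&next->lock);   /* couple first ... */
    pthread_rwlock_unlock(&cur->lock);    /* ... then release */
    return next;
}

int
main(void)
{
    Page a, b;

    pthread_rwlock_init(&a.lock, NULL);
    pthread_rwlock_init(&b.lock, NULL);
    a.right = &b;   a.pageno = 1;
    b.right = NULL; b.pageno = 2;

    pthread_rwlock_wrlock(&a.lock);       /* start holding page 1 */
    Page *cur = move_right_fixed(&a);     /* step right with no window */
    printf("now holding page %d\n", cur->pageno);
    pthread_rwlock_unlock(&cur->lock);

    (void) move_right_buggy;              /* shown only for contrast */
    return 0;
}

The same left-to-right ordering is what makes the page-split path safe as well: since no writer ever couples locks to the left, no cycle of lock waits can form.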