Another bug introduced by fastpath patch
From: Tom Lane
Subject: Another bug introduced by fastpath patch
Date:
Msg-id: 4040.1385591144@sss.pgh.pa.us
Responses: Re: Another bug introduced by fastpath patch
List: pgsql-hackers
In LockReleaseAll, we have this coding:

    for (partition = 0; partition < NUM_LOCK_PARTITIONS; partition++)
    {
        LWLockId    partitionLock = FirstLockMgrLock + partition;
        SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);

        proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
                                             offsetof(PROCLOCK, procLink));

        if (!proclock)
            continue;           /* needn't examine this partition */

        LWLockAcquire(partitionLock, LW_EXCLUSIVE);

        ... process the proclock list ...

        LWLockRelease(partitionLock);
    }                           /* loop over partitions */

That is, we check the proclock list head before acquiring the matching partitionLock, and assume we can skip scanning this partition if the list is empty.  Once upon a time that was safe, but since the fast-path patch it's not very safe, because some other backend could be in the process of promoting one of our fast-path locks (and hence changing the proclock list).  If we could be sure that we had removed *all* our fastpath locks in the earlier loop then it'd be okay, but because LockReleaseAll's name is a bit of a misnomer, I don't think we can assume that.

The simplest fix is just to move the fetch of the list header down past the LWLockAcquire.  This is pretty annoying, though, because in typical small transactions that will require a lot of additional partition lock acquire/releases.

I think it might be safe to check the list header, skip the partition if it's null, but if it isn't, then acquire the lock *and re-fetch the list header*.  The idea here is that if there is someone else busy promoting one of our fastpath locks, it must be a lock we'd decided not to delete in the previous loop, and so we would again decide not to delete it in the main loop.  Therefore, it doesn't matter if we decide to skip a partition microseconds before somebody changes the list header from null to not-null.  However, once we've decided to process a partition, we'd better get an up-to-date view of the list state.

This does do some damage to the original concept that allLocks mode would guarantee to clean up all locks of the target lockmethod even if the locallock table was damaged.  I wonder whether we shouldn't get rid of the existing logic about fastpath in the loop over locallocks entirely, and replace it with code that just zeroes the fastpath array when lockmethodid == DEFAULT_LOCKMETHOD and allLocks.  That would likely be faster than what you've got here, as well as more robust.

Or we could add a restriction to EligibleForRelationFastPath that restricts the fast-path mechanism to non-session locks, in which case we'd not need to make the zeroing contingent on allLocks either.  I don't think we take any fast-path-eligible locks in session mode anyway, so this wouldn't be giving up any functionality on that end.

Comments?

			regards, tom lane
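For concreteness, here is roughly what the check-then-re-fetch version of the partition loop described above could look like; this is an untested sketch reusing the names from the existing LockReleaseAll code, not a finished patch:

    for (partition = 0; partition < NUM_LOCK_PARTITIONS; partition++)
    {
        LWLockId    partitionLock = FirstLockMgrLock + partition;
        SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);

        /* unlocked peek at the list header; cheap skip for empty partitions */
        proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
                                             offsetof(PROCLOCK, procLink));
        if (!proclock)
            continue;           /* needn't examine this partition */

        LWLockAcquire(partitionLock, LW_EXCLUSIVE);

        /*
         * Re-fetch the list header now that we hold the partition lock,
         * in case someone promoted one of our fast-path locks meanwhile.
         */
        proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
                                             offsetof(PROCLOCK, procLink));

        ... process the proclock list ...

        LWLockRelease(partitionLock);
    }                           /* loop over partitions */

The unlocked peek keeps the common small-transaction case cheap, while the second SHMQueueNext, done under the partition lock, ensures we see any proclock another backend added while promoting one of our fast-path locks.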
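And a minimal sketch of the zero-the-fastpath-array idea, assuming the PGPROC field names of that era (backendLock, fpLockBits); again an illustration rather than a tested patch:

    if (lockmethodid == DEFAULT_LOCKMETHOD && allLocks)
    {
        /*
         * Discard every fast-path relation lock we hold, without consulting
         * the (possibly damaged) locallock table.  backendLock must be held
         * so that a backend concurrently transferring our fast-path entries
         * sees a consistent state.
         */
        LWLockAcquire(MyProc->backendLock, LW_EXCLUSIVE);
        MyProc->fpLockBits = 0;     /* all fpRelId[] slots now count as unused */
        LWLockRelease(MyProc->backendLock);
    }

If EligibleForRelationFastPath were additionally restricted to non-session locks, the allLocks test here could presumably be dropped as well.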