Re: BUG #5443: Undetected deadlock situation
От | Claudio Freire |
---|---|
Тема | Re: BUG #5443: Undetected deadlock situation |
Дата | |
Msg-id | 1275586454.24950.8.camel@klauss.livra.local обсуждение исходный текст |
Ответ на | Re: BUG #5443: Undetected deadlock situation (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-bugs |
On Fri, 2010-04-30 at 11:50 -0400, Tom Lane wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > > Eliminating null columns and mangling column headers for length, I > > get this: > > > locktype | tranid | virtualx | pid | mode | gr > > transactionid | 39773877 | 63/15761 | 11157 | ShareLock | f > > transactionid | 39773877 | 4/10902 | 6421 | ExclusiveLock | t > > > So it looks like two locks on the same transaction ID by different > > transactions. How does that happen? > > That's perfectly normal --- it indicates that pid 11157 is waiting for > a row-level lock that's currently held by pid 6421. We translate > row-level locking delays into waits for XID locks in order to limit > the number of distinct locks that have to be remembered in the shared > lock table. (We'd soon blow out shared memory if per-row lock data had > to be kept there.) > > As Peter stated, there's no evidence of an actual problem in this > bug report. I'd go looking for clients sitting with open > transactions... It may be that indeed there isn't a deadlock, but an abnormal performance drop. I couldn't get a gdb trace before I just had to work around the issue since it happens in a production server, and when it does it means trouble for our app. The workaround is to break the transaction into a SELECT for the rows to be updated, followed by individual updates batched in transactions of a couple hundred. With that pattern, everything works as expected. The SELECT takes a sweet time (but doesn't block any other transaction), and the updates go pretty fast thanks to HOT. I didn't try a SELECT FOR UPDATE followed by all the updates in a single transaction yet, that would result in the same behavior as the massive update, but perhaps without the "deadlock" (or performance drop). What I did do is analyze server load during the events, and as I suspected, disk activity during the "deadlocks" seems to suggest a vacuuming taking place. Although there was no autovacuum entry in pg_stat_activity every time I checked, disk activity precisely matches the case when autovacuum decides to vacuum a big table. That's about as much information I can give. We've worked around the issue successfully and it hasn't happened since. Even if it is not a proper deadlock, the performance drop is unacceptable. I've done massive updates before, and that performance drop was not expected (more than one day for updating 30k rows on a table with a couple indices).
В списке pgsql-bugs по дате отправления: