Re: [HACKERS] Vacuum analyze bug CAUGHT
От | Tom Lane |
---|---|
Тема | Re: [HACKERS] Vacuum analyze bug CAUGHT |
Дата | |
Msg-id | 13452.937004740@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | RE: [HACKERS] Vacuum analyze bug CAUGHT ("Hiroshi Inoue" <Inoue@tpf.co.jp>) |
Ответы |
Re: [HACKERS] Vacuum analyze bug CAUGHT
|
Список | pgsql-hackers |
"Hiroshi Inoue" <Inoue@tpf.co.jp> writes: >>>> Moreover CREATE TABLE doesn't acquire any lock for pg_attribute >>>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>>> while tuples are inserted into pg_attribute. >>>> Concurrent vacuum may corrupt pg_attribute. >> >> Should be fixed! >> > Seems CREATE TABLE don't acquire any lock for pg_relcheck and > pg_attrdef as well as pg_attribute. There may be other pg_....... > Here is a patch. Hmm, do we really need to grab AccessExclusiveLock on the pg_ tables while creating or deleting tables? That will mean no concurrency at all for these operations. Seems to me we want AccessExclusiveLock on the table being created or deleted, but something less strong on the system tables. RowExclusiveLock might be appropriate --- Vadim, what do you think? Also, rather than running around and adding locks to every single place that calls heap_open or heap_close, I wonder whether we shouldn't have heap_open/heap_close themselves automatically grab or release at least a minimal lock (AccessShareLock, I suppose). Or maybe better: change heap_open/heap_openr/heap_close to take an additional parameter specifying the kind of lock to grab. That'd still mean having to visit all the call sites, but it would force people to think about the issue in future rather than forgetting to lock a table they're accessing. Comments? BTW, while I still haven't been able to reproduce Michael Simms' crash reliably, I did see one coredump caused by an assert failure in heap_delete(): #6 0x16181c in ExceptionalCondition ( conditionName=0x283d4 "!(( lp)->lp_flags & 0x01)", exceptionP=0x40009a80, detail=0x0,fileName=0x7ae4 "\003", lineNumber=1121) at assert.c:72 #7 0x7cc18 in heap_delete (relation=0x400891c0, tid=0x402d962c, ctid=0x0) at heapam.c:1121 #8 0x9c208 in DeleteAttributeTuples (rel=0x40535260) at heap.c:1118 #9 0x9c4dc in heap_destroy_with_catalog ( relname=0x4e4ed7 <Address 0x4e4ed7 out of bounds>) at heap.c:1310 #10 0xa5168 in RemoveRelation ( name=0x80db9380 <Address 0x80db9380 out of bounds>) at creatinh.c:157 #11 0x129760 in ProcessUtility (parsetree=0x402d8d28, dest=Remote) at utility.c:215 This was in the process doing table creates/drops, and I surmise that the problem was a tuple move executed concurrently by the process doing VACUUM. In other words, it looks like this problem of missing lock operations might be the cause, or one cause, of Michael's symptoms. regards, tom lane
В списке pgsql-hackers по дате отправления: