RE: [HACKERS] Re: Concurrent VACUUM: first results
From | Hiroshi Inoue |
---|---|
Subject | RE: [HACKERS] Re: Concurrent VACUUM: first results |
Date | |
Msg-id | 001601bf3a01$47d0ae60$2801007e@cadzone.tpf.co.jp |
In reply to | Re: Concurrent VACUUM: first results (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: [HACKERS] Re: Concurrent VACUUM: first results |
List | pgsql-hackers |
> I have committed the code change to remove pg_vlock locking from VACUUM.
> It turns out the problems I was seeing initially were all due to minor
> bugs in the lock manager and vacuum itself.
>
> > 1. You can run concurrent "VACUUM" this way, but concurrent "VACUUM
> > ANALYZE" blows up. The problem seems to be that "VACUUM ANALYZE"'s
> > first move is to delete all available rows in pg_statistic.
>
> The real problem was that VACUUM ANALYZE tried to delete those rows
> *while it was outside of any transaction*. If there was a concurrent
> VACUUM inserting tuples into pg_statistic, the new VACUUM would end up
> calling XactLockTableWait() with an invalid XID, which caused a failure

Hmm, what I saw here was always LockRelation(.., RowExclusiveLock).
But the cause may be the same.

We cannot get the xid of a not-running *transaction* because its
proc->xid is set to 0 (InvalidTransactionId). So when the blocking
transaction calls LockResolveConflicts() in LockReleaseAll(), it cannot
find an xidLookupEnt in xidTable corresponding to the not-running
*transaction*, and therefore cannot GrantLock() to the XidLookupEnt
corresponding to the not-running *transaction*.
In the end, LockAcquire() from a not-running *transaction* always fails
once it is blocked.

> I have fixed the simpler aspects of the problem by adding missing
> SpinRelease() calls to lock.c, making lmgr.c test for failure, and
> altering VACUUM to not do the bogus row deletion. But I suspect that
> there is more to this that I don't understand. Why does calling
> XactLockTableWait() with an already-committed XID cause the following

It seems strange. Isn't it waiting for a tuple that is being deleted by
vc_updstats() in vc_vacone()?

> code in lock.c to trigger? Is this evidence of a logic bug in lock.c,
> or at least of inadequate checks for bogus input?
>
>     /*
>      * Check the xid entry status, in case something in the ipc
>      * communication doesn't work correctly.
>      */
>     if (!((result->nHolding > 0) && (result->holders[lockmode] > 0)))
>     {
>         XID_PRINT_AUX("LockAcquire: INCONSISTENT ", result);
>         LOCK_PRINT_AUX("LockAcquire: INCONSISTENT ", lock, lockmode);
>         /* Should we retry ? */
>         SpinRelease(masterLock); <<<<<<<<<<<< just added by me
>         return FALSE;
>     }
>

This is the third time I have ended up here, and each time it was caused
by other bugs.

Regards,

Hiroshi Inoue
Inoue@tpf.co.jp
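
For readers following the quoted lock.c fragment, here is a minimal sketch of why the added SpinRelease() on the error path matters. It is not PostgreSQL code: `ToyLock`, `ToyHolder`, `toy_acquire`, `toy_release`, and `toy_check_holder` are hypothetical names made up for illustration, and only the shape of the consistency test is borrowed from the quoted check. The point is simply that an early `return` taken while a lock is held must release that lock first, or every later caller will find it still held.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a spinlock. NOT PostgreSQL's SpinAcquire/SpinRelease. */
typedef struct
{
    bool held;
} ToyLock;

static void toy_acquire(ToyLock *l)
{
    assert(!l->held);           /* a leaked lock would trip this on the next call */
    l->held = true;
}

static void toy_release(ToyLock *l)
{
    assert(l->held);
    l->held = false;
}

/* Hypothetical holder record, loosely modeled on the fields in the quoted check. */
typedef struct
{
    int nHolding;
    int holders[8];
} ToyHolder;

/*
 * Returns true when the holder entry looks consistent for lockmode.
 * The lock is released on every path, including the failure path.
 */
static bool toy_check_holder(ToyLock *master, const ToyHolder *h, int lockmode)
{
    toy_acquire(master);

    if (!((h->nHolding > 0) && (h->holders[lockmode] > 0)))
    {
        toy_release(master);    /* the release the error path was missing */
        return false;
    }

    toy_release(master);
    return true;
}
```

Without the `toy_release()` call inside the `if` block, a second call to `toy_check_holder()` after a failed one would hit the assertion in `toy_acquire()`. In this toy model that stands in for a process getting stuck on a lock nobody will ever release.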