Re: [HACKERS] [TRAP: FailedAssertion] causing server to crash
От | Thomas Munro |
---|---|
Тема | Re: [HACKERS] [TRAP: FailedAssertion] causing server to crash |
Дата | |
Msg-id | CAEepm=0ZsP2HRZ9W-9pPftcwjz1BJZma2nuiFuPhACDMGSiprA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] [TRAP: FailedAssertion] causing server to crash (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: [HACKERS] [TRAP: FailedAssertion] causing server to crash
|
Список | pgsql-hackers |
On Thu, Aug 3, 2017 at 3:03 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Jul 21, 2017 at 1:31 AM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: >> Thanks Neha. It's be best to post the back trace and if possible >> print oldestXact and ShmemVariableCache->oldestXid from the stack >> frame for TruncateCLOG. >> >> The failing assertion in TruncateCLOG() has a comment that says >> "vac_truncate_clog already advanced oldestXid", but vac_truncate_clog >> calls SetTransactionIdLimit() to write ShmemVariableCache->oldestXid >> *after* it calls TruncateCLOG(). What am I missing here? > > This problem was introduced by commit > ea42cc18c35381f639d45628d792e790ff39e271, so this should be added to > the PostgreSQL 10 open items list. That commit intended to introduce a > distinction between (1) the oldest XID that can be safely examined and > (2) the oldest XID that can't yet be safely reused. These are the > same except when we're in the middle of truncating CLOG: (1) advances > before the truncation, and (2) advances afterwards. That's why > AdvanceOldestClogXid() happens before truncation proper and > SetTransactionIdLimit() happens afterwards, and changing the order > would, I think, be quite wrong. Added to open items. > AFAICS, that assertion is simply a holdover from an earlier version of > the patch that escaped review. There's just no reason to suppose that > it's true. Craig, are you planning to post a patch to remove the assertion, or make it hold? >> What actually prevents ShmemVariableCache->oldestXid from going >> backwards anyway? Suppose there are two or more autovacuum processes >> that reach vac_truncate_clog() concurrently. They do a scan of >> pg_database whose tuples they access without locking through a >> pointer-to-volatile because they expect concurrent in-place writers, >> come up with a value for frozenXID, and then arrive at >> SetTransactionIdLimit() in whatever order and clobber >> ShmemVariableCache->oldestXid. What am I missing here? > > Hmm, there could be a bug there, but I don't think it's *this* bug. I'm not sure that it wrong per se, as long as no one asserts that the number can't go backwards... -- Thomas Munro http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: