Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)
From | Thomas Munro |
---|---|
Subject | Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated) |
Date | |
Msg-id | CAEepm=1JnUCc1J0Rz5vuD=RjoCRvA7-KwzG7gr75H8knHxwyFg@mail.gmail.com |
In response to | Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated) (Thomas Munro <thomas.munro@enterprisedb.com>) |
Responses | Re: Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated) |
List | pgsql-bugs |
On Mon, May 11, 2015 at 2:45 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> On Sun, May 10, 2015 at 9:41 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Sun, May 10, 2015 at 12:43 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> OK. So the next question is: if you then apply the other patch, does that prevent step 4 and thereby avoid catastrophe?
>>
>> Yes, in a quick test, at step 4 I couldn't proceed. I need to prod
>> this some more on Monday, and also see how it interacts with
>> autovacuum's view of what work needs to be done.
>
> The code in master which handles regular autovacuums seems correct
> with this patch, because it measures member space usage by calling
> find_multixact_start itself with the oldest multixact ID (it's not
> dependent on anything that is updated at checkpoint time).
>
> The code in the patch at
> http://www.postgresql.org/message-id/CA+TgmobbaQpE6sNqT30+rz4UMH5aPraq20gko5xd2ZGajz1-Jg@mail.gmail.com
> would become wrong though, because it would use the (new) variable
> MultiXactState->oldestOffset (set at checkpoint) to measure the used
> member space. That means it would repeatedly launch autovacuums, even
> after clearing away the problem and advancing the oldest multixact ID,
> until the next checkpoint updates that value. In other words, it
> can't see its own progress immediately (which is the right approach
> for blocking new multixact generation, ie defer until
> checkpoint/truncation, but the wrong approach for triggering
> autovacuums).
>
> I think vacuum (SetMultiXactIdLimit?) needs to update oldestOffset,
> not checkpoint (DetermineSafeOldestOffset). (The reason for wanting
> this new value in shared memory is because GetNextMultiXactId needs to
> be able to check it cheaply for every call, so calling
> find_multixact_start every time would presumably not fly).

Here's a new version of the patch to do that.
As before, it tracks the oldest offset in shared memory, but now that is updated in SetMultiXactIdLimit, so it is always updated at the same time as MultiXactState->oldestMultiXactId (at startup and after full scan vacuums).

The value is used in the following places:

1. GetNewMultiXactId uses it to see if it needs to send PMSIGNAL_START_AUTOVAC_LAUNCHER to request autovacuums even if autovacuum is set to off. That is the main purpose of this patch. (GetNewMultiXactId *doesn't* use it for member wraparound prevention: that is based on offsetStopLimit, set by checkpoint code after truncation of physical storage.)

2. SetMultiXactIdLimit itself also uses it to send a PMSIGNAL_START_AUTOVAC_LAUNCHER signal to the postmaster (according to comments, this allows immediately doing some more vacuuming upon completion if necessary).

3. ReadMultiXactCounts, called by regular vacuum and autovacuum, now reads it from shared memory rather than doing its own call to find_multixact_start. (Incidentally, the code this replaces suffers from the problem fixed elsewhere: it can call find_multixact_start for a multixact that doesn't exist yet.)

Vacuum runs as expected with autovacuum off.

Do you think we should be using MULTIXACT_MEMBER_DANGER_THRESHOLD as the trigger level for forced vacuums instead of MULTIXACT_MEMBER_SAFE_THRESHOLD, or something else?

--
Thomas Munro
http://www.enterprisedb.com
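
For readers following the threshold question above, here is a minimal sketch of the kind of check being discussed: measure member space in use as the distance from the oldest offset to the next offset, then compare it against a trigger level. This is not the code from the patch. Every name prefixed with `sketch_` or `SKETCH_` is hypothetical, and the two threshold values are illustrative fractions of a 32-bit offset space, assumed here only to make the comparison concrete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit member offset; not a PostgreSQL type. */
typedef uint32_t SketchOffset;

/* Illustrative trigger levels (assumptions, not values taken from the patch). */
#define SKETCH_MAX_OFFSET        UINT32_MAX
#define SKETCH_SAFE_THRESHOLD    (SKETCH_MAX_OFFSET / 2)
#define SKETCH_DANGER_THRESHOLD  (SKETCH_MAX_OFFSET - SKETCH_MAX_OFFSET / 4)

/*
 * Number of member slots in use between the oldest and the next offset.
 * Unsigned 32-bit subtraction is modular, so the result stays correct
 * when the next offset has wrapped past zero.
 */
uint32_t
sketch_members_in_use(SketchOffset oldest, SketchOffset next)
{
    return next - oldest;
}

/*
 * True when usage exceeds the chosen trigger level, i.e. when a forced
 * vacuum would be requested in this sketch.
 */
bool
sketch_needs_forced_vacuum(SketchOffset oldest, SketchOffset next,
                           uint32_t threshold)
{
    /* A zero threshold would fire on any usage; treat it as a caller bug. */
    assert(threshold != 0);
    return sketch_members_in_use(oldest, next) > threshold;
}
```

With these illustrative values, usage just over half of the offset space trips the "safe" level but not the "danger" level, which is the trade-off the question raises: the lower level forces vacuums earlier and more often, the higher one leaves less headroom before new multixact generation has to be blocked.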
Attachments