Re: MultiXact truncation, startup et al.
From | Andres Freund |
---|---|
Subject | Re: MultiXact truncation, startup et al. |
Date | |
Msg-id | 20131127224213.GK31748@awork2.anarazel.de |
In reply to | MultiXact truncation, startup et al. (Andres Freund <andres@2ndquadrant.com>) |
Replies | Re: MultiXact truncation, startup et al. |
List | pgsql-hackers |
Hi,

On 2013-11-21 20:38:47 +0100, Andres Freund wrote:
> Turns out, we don't ever truncate pg_multixact during recovery since
> 9dc842f0832fd71eda826349a0c17ecf8ae93b84 because multixact truncations,
> in contrast to clog, aren't WAL logged themselves. Disabling probably
> was fair game back then since it wasn't too likely to remain in crash
> recovery forever.
> But at the very least since the addition of Hot Standby that's really
> not the case anymore. If I calculate correctly currently you'd end up
> with ~34GB(<9.3)/38GB of pg_multixact which seems a bit much.
>
> I am not 100% sure, but it looks like things could actually continue to
> work despite having an slru wraparound into existing data. But that's
> certainly nothing I'd want to rely on and looks mostly like lucky
> happenstance, especially in 9.3.
>
> If this were a master only issue, I'd say WAL-logging mxact truncation
> would be the way to go, but we can't really do that in the back branches
> since multixact_redo() would throw a fit if we were to introduce a new
> type of wal record and somebody would upgrade a primary first.
>
> So, what I think we need to do is to split StartupMultiXact() into two
> parts, StartupMultiXact() which only sets the offset's, members's
> shared->latest_page_number and TrimMultiXact() which does the remainder
> of the work, executed when finishing crash recovery at the current
> location of StartupMultiXact().

So, I've done this for 9.3+ for now. Testing around that turned up that
our current way of scheduling anti-wraparound vacuums for mxids doesn't
really work:

1) autovacuum.c knows about such vacuums, but vacuum.c doesn't. That
   leads to a long cycle of partial vacuums that don't increase
   relminmxid.
2) Parts of the code used 200 million as a hardcoded constant, others
   used autovacuum_freeze_max_age.

0001 fixes the vacuum scheduling and is applicable to 9.3+; 0002 re-adds
pg_multixact truncation during crash recovery.
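To illustrate point 1) above, here is a minimal standalone sketch of the kind of check vacuum.c needs: deciding from a table's relminmxid whether a vacuum has to scan all pages. This is not the code from 0001. The helper names (mxid_full_scan_limit, vacuum_needs_full_scan) and the freeze_table_age parameter are hypothetical stand-ins, and MultiXactId is simply modeled as a 32-bit counter compared modulo 2^32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's MultiXactId, modeled as a 32-bit counter. */
typedef uint32_t MultiXactId;

#define FIRST_MULTIXACT_ID ((MultiXactId) 1)	/* 0 is the invalid mxid */

/* Wraparound-aware "a is older than or equal to b" (modulo 2^32). */
static inline bool
mxid_precedes_or_equals(MultiXactId a, MultiXactId b)
{
	return (int32_t) (a - b) <= 0;
}

/*
 * Hypothetical helper: the newest relminmxid that still forces a full
 * scan. freeze_table_age is the configured age in mxids, passed in
 * rather than hardcoded as 200 million.
 */
static inline MultiXactId
mxid_full_scan_limit(MultiXactId next_mxid, uint32_t freeze_table_age)
{
	MultiXactId limit = next_mxid - freeze_table_age;

	/* If the subtraction landed on the invalid value, step past it. */
	if (limit < FIRST_MULTIXACT_ID)
		limit = FIRST_MULTIXACT_ID;

	/* Sanity check: never hand out the invalid mxid as a limit. */
	assert(limit >= FIRST_MULTIXACT_ID);
	return limit;
}

/* Hypothetical helper: must this vacuum ignore the visibility map? */
static inline bool
vacuum_needs_full_scan(MultiXactId relminmxid, MultiXactId next_mxid,
					   uint32_t freeze_table_age)
{
	return mxid_precedes_or_equals(relminmxid,
								   mxid_full_scan_limit(next_mxid,
														freeze_table_age));
}
```

With next_mxid = 1000 and freeze_table_age = 500, a table whose relminmxid is 100 would get a full scan, while one at 900 would not. Without such a check on the vacuum.c side, every vacuum stays partial and relminmxid never advances.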
The current code will only work on 9.3+, but if it's deemed acceptable I
can backport it to earlier versions. I am not sure if it's worth
backporting it 9.0 given it has neither HS nor SR?

0003 is a debugging-only patch that adds the useful
pg_burn_multixact(num) function to pageinspect (plus some core changes
to make that fast) and allows low autovacuum_freeze_max_age settings.

Not sure if it's really worth adding MultiXactIdPrecedesOrEquals in
0002, but I didn't want the scan_all logic to differ between normal xids
and mxids. I think it'd also be fine to change the logic for xids to use
TransactionIdPrecedes(), but I didn't want to touch that logic
unnecessarily.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services