Re: Assertion failure when streaming logical changes
От | Heikki Linnakangas |
---|---|
Тема | Re: Assertion failure when streaming logical changes |
Дата | |
Msg-id | 54DAA7A8.5000705@vmware.com обсуждение исходный текст |
Ответ на | Re: Assertion failure when streaming logical changes (Andres Freund <andres@2ndquadrant.com>) |
Ответы |
Re: Assertion failure when streaming logical changes
|
Список | pgsql-hackers |
On 02/10/2015 02:46 PM, Andres Freund wrote: > On 2015-02-10 21:20:37 +0900, Michael Paquier wrote: >> Using test_decoding on HEAD (cc761b1) I am seeing the following assertion >> failure: >> TRAP: FailedAssertion("!(!((&RegisteredSnapshots)->ph_root == >> ((void*)0)))", File: "snapmgr.c", Line: 677) > > Ick. I guess a revert of 94028691609f8e148bd4ce72c46163f018832a5b fixes > it? > ... > > Yea, it really looks like the above commit is to "blame". The new xmin > tracking infrastructure doesn't know about the historical snapshot... The new xmin tracking code assumes that if a snapshots's regd_count > 0, it has already been pushed to the RegisteredSnapshots heap. That assumption doesn't hold because the logical decoding stuff creates its own snapshots and sets regd_count=1 to prevent snapmgr.c from freeing them, even though they haven't been registered with RegisterSnapshot. The second paragraph in this comment in snapmgr.c needs fixing: > * Likewise, any snapshots that have been exported by pg_export_snapshot > * have regd_count = 1 and are counted in RegisteredSnapshots, but are not > * tracked by any resource owner. > * > * The same is true for historic snapshots used during logical decoding, > * their lifetime is managed separately (as they life longer as one xact.c > * transaction). Besides the spelling, that's incorrect: historic snapshots are *not* counted in RegisteredSnapshots. That was harmless before commit 94028691, but it was always wrong. I think setting regd_count=1 outside snapmgr.c is a pretty ugly hack. snapbuild.c also abuses active_count as a reference counter, which is similarly ugly. I think regd_count and active_count should both be treated as private to snapmgr.c, and initialized to zero elsewhere. As a minimal fix, we could change the logical decoding code to not use regd_count to prevent snapmgr.c from freeing its snapshots, but use active_count for that too. Setting regd_count to 1 in SnapBuildBuildSnapshot() seems unnecessary in the first place, as the callers also call SnapBuildSnapIncRefcount(). Patch attached. But could we get rid of those active_count manipulations too? Could you replace the SnapBuildSnap[Inc|Dec]Refcount calls with [Un]RegisterSnapshot()? - Heikki
Вложения
В списке pgsql-hackers по дате отправления: