Re: pg15b4: FailedAssertion("TransactionIdIsValid(xmax)
From | Justin Pryzby |
---|---|
Subject | Re: pg15b4: FailedAssertion("TransactionIdIsValid(xmax) |
Date | |
Msg-id | 20220912022758.GD31833@telsasoft.com |
In response to | Re: pg15b4: FailedAssertion("TransactionIdIsValid(xmax) (Thomas Munro <thomas.munro@gmail.com>) |
Responses | Re: pg15b4: FailedAssertion("TransactionIdIsValid(xmax) |
List | pgsql-hackers |
On Mon, Sep 12, 2022 at 02:25:48PM +1200, Thomas Munro wrote:
> On Mon, Sep 12, 2022 at 1:42 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > On Mon, Sep 12, 2022 at 10:44:38AM +1200, Thomas Munro wrote:
> > > On Sat, Sep 10, 2022 at 5:44 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > > > < 2022-09-09 19:37:25.835 CDT telsasoft >ERROR: MultiXactId 133553154 has not been created yet -- apparent wraparound
> > >
> > > I guess what happened here is that after one of your (apparently
> > > several?) OOM crashes, crash recovery didn't run all the way to the
> > > true end of the WAL due to the maintenance_io_concurrency=0 bug. In
> > > the case you reported, it couldn't complete an end-of-recovery
> > > checkpoint until you disabled recovery_prefetch, but that's only
> > > because of the somewhat unusual way that vismap pages work. In
> > > another case it might have been able to (bogusly) complete a
> > > checkpoint, leaving things in an inconsistent state.
> >
> > I think what you're saying is that this can be explained by the
> > io_concurrency bug in recovery_prefetch, if run under 15b3.
>
> Well I don't know, but it's one way I could think of that you could
> have a data page referring to a multixact that isn't on disk after
> recovery (because the data page happens to have been flushed, but we
> didn't replay the WAL that would create the multixact).
>
> > But yesterday I started from initdb and restored this cluster from backup, and
> > started up sqlsmith, and sent some kill -9, and now got more corruption.
> > Looks like it took ~10 induced crashes before this happened.
>
> $SUBJECT says 15b4, which doesn't have the fix. Are you still using
> maintenance_io_concurrency=0?

Yeah ... I just realized that I've already forgotten the relevant chronology.

The io_concurrency bugfix wasn't included in 15b4, so (if I understood you
correctly), that might explain these symptoms, right?

--
Justin