Re: Segfault while creating logical replication slots on active DB 14.6-1 + 15.1-1
From | Masahiko Sawada |
---|---|
Subject | Re: Segfault while creating logical replication slots on active DB 14.6-1 + 15.1-1 |
Date | |
Msg-id | CAD21AoDXJd1Co9hC665CFUbj47_HGA0k4HdadOXGoPKyYK6ixQ@mail.gmail.com |
In response to | Re: Segfault while creating logical replication slots on active DB 14.6-1 + 15.1-1 (Alex Richman <alexrichman@onesignal.com>) |
Responses | Re: Segfault while creating logical replication slots on active DB 14.6-1 + 15.1-1 |
List | pgsql-bugs |
Hi,

On Tue, Jan 3, 2023 at 9:57 PM Alex Richman <alexrichman@onesignal.com> wrote:
>
> Apologies for the delay (and happy christmas/new years).
>
> Please find included a full backtrace[1] of a sample of this crash, replicated on postgres 15.1-1 in the same environment described in my original email. Included as a gist due to the length but lmk if it should be pasted in full for posterity. I've also added the python script[2] used to replicate, if that's relevant.
>
> Unfortunately we have not been able to reproduce this in a clean room environment, however we can note a few additional things:
> - This has occurred over multiple distinct servers with different data sets, though similar write loads. Suggesting it's not a specific server with data corruption.
> - Disabling pg_repack, autovacuum, automatic reindexing, has no effect, the bug can still occur
> - Running the same script on a read-only logical replica does not hit the bug
> - As above, if the server is idle (no write traffic), then it does not hit the bug
> - The bug occurs roughly 1 in every 10 executions of the create replication slot, so is not 100% consistent.
> - We're fairly confident that this did not occur pre 14.5-1, and started occurring in 14.6-1 & 15.1-1.
> So we would assume that there is some concurrent write traffic from our web tier that sometimes causes a segfault in the logical replication slot creation.
>
> Please let me know if you need any more information.

Thank you for providing more information.

One possibility is that you encountered the bug in snapbuild.c that is already fixed by commit 898ef41bf6f4 and will be included in 14.7 and 15.2. I've attached patches of this fix for PG14 and PG15. Could you please try the same scenario again with these patches and see if the issue happens?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com