Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop
From | Amit Kapila |
---|---|
Subject | Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop |
Date | |
Msg-id | CAA4eK1JmDGpr+Ouvj8x7u9RWVSLWyfsDiz2TRbrMNmdAqDpJ0g@mail.gmail.com |
In response to | Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop (Peter Smith <smithpb2250@gmail.com>) |
Responses |
Re: BUG #16643: PG13 - Logical replication - initial startup never finishes and gets stuck in startup loop
|
List | pgsql-bugs |
On Wed, Nov 18, 2020 at 8:18 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> On Wed, Nov 18, 2020 at 1:29 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2020-Nov-04, Amit Kapila wrote:
> >
> > > On Thu, Oct 15, 2020 at 8:20 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > > > * STREAM COMMIT bug?
> > > > In apply_handle_stream_commit, we do CommitTransactionCommand, but
> > > > apparently in a tablesync worker we shouldn't do it.
> > >
> > > In the tablesync stage, we don't allow streaming. See pgoutput_startup
> > > where we disable streaming for the init phase. As far as I understand,
> > > for tablesync we create the initial slot during which streaming will
> > > be disabled then we will copy the table (here logical decoding won't
> > > be used) and then allow the apply worker to get any other data which
> > > is inserted in the meantime. Now, I might be missing something here
> > > but if you can explain it a bit more or share some test to show how we
> > > can reach here via tablesync worker then we can discuss the possible
> > > solution.
> >
> > Hmm, okay, that sounds like there would be no bug then. Maybe what we
> > need is just an assert in apply_handle_stream_commit that
> > !am_tablesync_worker(), as in the attached patch. Passes tests.
>
> Hi.
>
> Using the same debugging technique described in a previous mail [1], I
> have tested again but this time using a SUBSCRIPTION capable of
> streaming.
>
> While paused in the debugger (to force an unusual timing situation) I
> can publish INSERTs en masse and cause streaming replication to occur.
>
> To cut a long story short, a tablesync worker CAN in fact end up
> processing (e.g. apply_dispatch) streaming messages.
> So the tablesync worker CAN get into the apply_handle_stream_commit.
> And this scenario, albeit rare, will crash.
>

Thank you for reproducing this issue. Dilip, Peter, is anyone of you
interested in writing a fix for this?

--
With Regards,
Amit Kapila.