Re: BUG #15290: Stuck Parallel Index Scan query
From | Thomas Munro |
---|---|
Subject | Re: BUG #15290: Stuck Parallel Index Scan query |
Date | |
Msg-id | CAEepm=2fYdJ5hsrEb8OH=MCb1-adn8c0_rnTafdhKFcumL1vug@mail.gmail.com |
In reply to | Re: BUG #15290: Stuck Parallel Index Scan query (Victor Yegorov <vyegorov@gmail.com>) |
Responses |
Re: BUG #15290: Stuck Parallel Index Scan query
|
List | pgsql-bugs |
On Mon, Jul 23, 2018 at 7:57 PM, Victor Yegorov <vyegorov@gmail.com> wrote:
> - `ERROR: canceling statement due to conflict with recovery`, happened
> right when our problematic query started, same user

Ok, so that would explain how the master was cancelled. In 2877's
stack we see that it was aborting here:

#11 0x00007f539697ba5e in PostgresMain (argc=1,
argv=argv@entry=0x7f5398d1bbc8, dbname=0x7f5398d1bb98 "coub",
username=0x7f5398d1bbb0 "app") at
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/backend/tcop/postgres.c:3879

That line calls AbortCurrentTransaction(), just after the call to
EmitErrorReport() that wrote something in your log.

Andres's theory (interrupts 'held') seems promising... perhaps there
could be a bug where parallel index scans leak a share-locked page or
something like that. I tried to reproduce this a bit, but no cigar so
far. I wonder if there could be something about your bloated index
that reaches buggy behaviour...

If you happen to have a core file for a worker that is waiting in
ConditionVariableSleep(), or it happens again, you'd be able to see if
an LWLock is causing this by printing num_held_lwlocks.

-- 
Thomas Munro
http://www.enterprisedb.com
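
For anyone following along, the check suggested above might look roughly like the gdb session below. This is only a sketch: the binary and core file paths are placeholders, and it assumes debug symbols for the server are installed so that gdb can resolve the static variable num_held_lwlocks. Printing InterruptHoldoffCount as well is an extra step not mentioned in the message; it is included here because a non-zero value would be consistent with the "interrupts held" theory.

```
# Placeholder paths: substitute the real postgres binary and core file.
$ gdb /path/to/postgres /path/to/core

# Confirm the worker really is sitting in ConditionVariableSleep().
(gdb) bt

# Number of LWLocks this backend currently holds; non-zero while
# sleeping on a condition variable would point at a leaked lock.
(gdb) print num_held_lwlocks

# Extra check (assumption, not from the message): interrupts are
# held off while this counter is greater than zero.
(gdb) print InterruptHoldoffCount
```

The same commands can be run against a live stuck worker by attaching with `gdb -p <pid>` instead of loading a core file.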