Re: BUG #13657: Some kind of undetected deadlock between query and "startup process" on replica.
От | Maxim Boguk |
---|---|
Тема | Re: BUG #13657: Some kind of undetected deadlock between query and "startup process" on replica. |
Дата | |
Msg-id | CAK-MWwR9q1EKh5=R7oSPqHgqu-uOcVRfjpxm4eFM_Jao7N9s4Q@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #13657: Some kind of undetected deadlock between query and "startup process" on replica. (Michael Paquier <michael.paquier@gmail.com>) |
Ответы |
Re: BUG #13657: Some kind of undetected deadlock between query
and "startup process" on replica.
|
Список | pgsql-bugs |
On Fri, Oct 2, 2015 at 4:58 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > > > On Fri, Oct 2, 2015 at 2:14 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote= : > >> =E2=80=8B>=E2=80=8B >> This backtrace is not indicating that this process is waiting on a >> relation lock, it is resolving a recovery conflict while removing tuples= , >> killing the virtual transaction depending on if max_standby_streaming_de= lay >> or max_standby_archive_delay are set if the conflict gets longer. Did yo= u >> change the default of those parameters, which is 30s, to -1? This would >> mean that the standby waits indefinitely. >> >> >> =E2=80=8BProblem that startup process have confict with a query, which b= locked >> (waiting for) on the startup process itself (query could not process >> because it waiting for lock which held by startup process, and startup >> process waiting for finishing this query). So it's an undetected deadloc= k >> condtion here (as I understand situation). =E2=80=8B >> >> PS: there are no other activity on the database during that problem >> except blocked query. >> > > Don't you have other queries running in parallel of the one you are > defining as "stuck" on the standby that prevent replay to move on? Like a > long-running transaction working on the relation involved? Are you sure > that you did not set up > =E2=80=8B=E2=80=8B > max_standby_streaming_delay to -1? > -- > Michael > During the problem period on the database had runned only one query (listed in intial report) and nothing more (and this query had beed in waiting state according to pg_stat_activity). The pg_locks show that the query waiting for AccessShareLock on relation 17987, in the same time the startup process have AccessExclusiveLock on the same relation and waiting for something. No other activity on the replica going on. And yes, the=E2=80=8B max_standby_streaming_delay to -1, as a result the replication process had been stuck on query from external monitoring tool forever until I killed that query, but situation repeated in few hours again. --=20 Maxim Boguk Senior Postgresql DBA http://www.postgresql-consulting.ru/ <http://www.postgresql-consulting.com/= > Phone RU: +7 910 405 4718 Phone AU: +61 45 218 5678 LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b Skype: maxim.boguk Jabber: maxim.boguk@gmail.com =D0=9C=D0=BE=D0=B9=D0=9A=D1=80=D1=83=D0=B3: http://mboguk.moikrug.ru/ "People problems are solved with people. If people cannot solve the problem, try technology. People will then wish they'd listened at the first stage."
В списке pgsql-bugs по дате отправления: