Re: BUG #13657: Some kind of undetected deadlock between query and "startup process" on replica.
От | Maxim Boguk |
---|---|
Тема | Re: BUG #13657: Some kind of undetected deadlock between query and "startup process" on replica. |
Дата | |
Msg-id | CAK-MWwRJ7APazAfs86DEBm0AinB3Lb-EXBAzZZGQUq6wHL9+Ow@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #13657: Some kind of undetected deadlock between query and "startup process" on replica. (Maxim Boguk <maxim.boguk@gmail.com>) |
Список | pgsql-bugs |
On Fri, Oct 2, 2015 at 6:52 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote: > > > On Fri, Oct 2, 2015 at 4:58 PM, Michael Paquier <michael.paquier@gmail.co= m > > wrote: > >> >> >> On Fri, Oct 2, 2015 at 2:14 PM, Maxim Boguk <maxim.boguk@gmail.com> >> wrote: >> >>> =E2=80=8B>=E2=80=8B >>> This backtrace is not indicating that this process is waiting on a >>> relation lock, it is resolving a recovery conflict while removing tuple= s, >>> killing the virtual transaction depending on if max_standby_streaming_d= elay >>> or max_standby_archive_delay are set if the conflict gets longer. Did y= ou >>> change the default of those parameters, which is 30s, to -1? This would >>> mean that the standby waits indefinitely. >>> >>> >>> =E2=80=8BProblem that startup process have confict with a query, which = blocked >>> (waiting for) on the startup process itself (query could not process >>> because it waiting for lock which held by startup process, and startup >>> process waiting for finishing this query). So it's an undetected deadlo= ck >>> condtion here (as I understand situation). =E2=80=8B >>> >>> PS: there are no other activity on the database during that problem >>> except blocked query. >>> >> >> Don't you have other queries running in parallel of the one you are >> defining as "stuck" on the standby that prevent replay to move on? Like = a >> long-running transaction working on the relation involved? Are you sure >> that you did not set up >> =E2=80=8B=E2=80=8B >> max_standby_streaming_delay to -1? >> -- >> Michael >> > > During the problem period on the database had runned only one query > (listed in intial report) and nothing more (and this query had beed in > waiting state according to pg_stat_activity). > The pg_locks show that the query waiting for AccessShareLock on relation > 17987, in the same time the startup process have AccessExclusiveLock on t= he > same relation and waiting for something. No other activity on the replica > going on. > And yes, the=E2=80=8B max_standby_streaming_delay to -1, as a result the > replication process had been stuck on query from external monitoring tool > forever until I killed that query, but situation repeated in few hours > again. > > =E2=80=8BOne last addition to the report. The same database replica (server= ), the same monitoring queries, all the same but using PostgreSQL =E2=80=8B9.2.13 = doesn't have that problem. It been used for a few years without such issues, problem only arise only after whole cluster upgrade to version 9.4.4. > -- > Maxim Boguk >
В списке pgsql-bugs по дате отправления: