Discussion: canceling/terminating statement due to conflict with recovery in Replica/DR instances

canceling/terminating statement due to conflict with recovery in Replica/DR instances

From

Ishan joshi

Date:

30 September, 08:59:39

Hi Team,

We are running PostgreSQL 16.9 in production with a large database of about 25 TB. We have a Patroni setup with a replica instance, and a DR Patroni setup with Patroni streaming.

We have a high volume of frequent commits in the database. There are a few large tables for which we asked the client to run queries on the DR/replica instances, but these queries have started failing with the errors "canceling statement due to conflict with recovery" and "terminating statement due to conflict with recovery".

As I understand it, this behavior is correct, but we need to get rid of the issue.

I went through the old posts and some documentation and learned that the parameters below can help to reduce this error.

max_standby_streaming_delay

max_standby_archive_delay

hot_standby_feedback = off

Our queries run for a long time, so I would have to set these values to minutes or hours (say 900s), which is not feasible for production because it will start impacting the replication lag. Also, the queries will still fail once they reach the configured threshold.

If I set these parameters to "-1" (which disables the timeout), there will be a direct impact on replication lag, which will in turn affect other queries on the replica node and the DR cluster.

Can you please advise whether a better solution exists for such a scenario?

Thanks & Regards,

Ishan Joshi

Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances

From

Laurenz Albe

Date:

30 September, 09:16:53

On Tue, 2025-09-30 at 05:59 +0000, Ishan joshi wrote:
> We are using Postgresql 16.9 in production and with large database about 25TB
> of size. We have patroni setup with replica instance and DR patroni setup with
> patroni streaming.
>
> We have high volume and frequent commit in the database. There are few large
> tables for which we asked client to execute queries on DR/Replica instances but
> these queries are start getting failed with "canceling statement due to conflict
> with recovery" and "terminating statement due to conflict with recovery" error.
>
> As I understand the behavior is correct but we need to get rid of this issue.
>
> I gone through the old posts and some documentation and got to know that below
> parameters can help to reduce this error. 
>
> max_standby_streaming_delay 
> max_standby_archive_delay 
> hot_standby_feedback = off
>
> Our queries are running for long period that makes me to set this value to some
> minutes/hours (lets set 900s) which is not feasible for production as it will
> start impacting the replication lag. Also, the queries will fail if it reaches
> to mentioned thresholds.
>
> If I set these parameters to "-1" (disable) then there will be direct impact on
> replication lag which will impact further queries on replica node and DR cluster.
>
> Can you please guide If any other better solution present for such scenario?

No, there is no better solution.

You can reduce replication conflicts by turning on "hot_standby_feedback" and by
turning off "vacuum_truncate", but you probably won't be able to get rid of all
replication conflicts.
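
For readers following along, a minimal sketch of the two settings named above. It assumes PostgreSQL 16, where "vacuum_truncate" is a per-table storage parameter set on the primary, while "hot_standby_feedback" is set on the standby. The table name big_table is a placeholder, and in a Patroni-managed cluster such parameters are normally changed through Patroni's own configuration rather than with ALTER SYSTEM.

```sql
-- On the standby that serves queries: report the oldest snapshot in use
-- to the primary, so VACUUM there keeps row versions that standby
-- queries still need.
ALTER SYSTEM SET hot_standby_feedback = on;
SELECT pg_reload_conf();  -- the setting takes effect on reload

-- On the primary: stop VACUUM from truncating empty pages at the end of
-- the table. Truncation takes an ACCESS EXCLUSIVE lock, and replaying
-- that lock on the standby can conflict with running queries.
-- "big_table" is a hypothetical name for one of the large tables.
ALTER TABLE big_table SET (vacuum_truncate = off);
```

The price of "hot_standby_feedback" is possible table bloat on the primary while long queries are running on the standby.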

You can either have a small replay delay and canceled queries or no canceled
queries, but the occasional replay delay.

If you need both no delay and no canceled queries, the only clean solution is
to have two standby servers.

Yours,
Laurenz Albe

Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances

From

Peter Gram

Date:

30 September, 10:58:32

Hi Laurenz

Thanks for all the answers you give on this list.

Could you elaborate on why two or more standby servers would help in this case?

Kind regards

Peter Gram
Sæbyholmsvej 18

2500 Valby

Mobile: (+45) 5374 7107

Email: peter.m.gram@gmail.com

Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances

From

Laurenz Albe

Date:

30 September, 12:40:45

On Tue, 2025-09-30 at 09:58 +0200, Peter Gram wrote:
> On Tue, 30 Sept 2025 at 08:17, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > No, there is no better solution.
> >
> > If you need both no delay and no canceled queries, the only clean solution is
> > to have two standby servers.
>
> Could you elaborate on why two or more standby servers would help in this case ?

One of the standby servers would have "max_standby_streaming_delay = 0" or
"hot_standby = off"; that one would be for high availability.

The other one would have "max_standby_streaming_delay = -1" and would be used for
queries.
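
A minimal sketch of how the two standbys described above might be configured, assuming both receive WAL by streaming replication. In a Patroni-managed cluster these parameters would normally be set through Patroni's configuration rather than with ALTER SYSTEM.

```sql
-- Standby 1 (high availability): WAL replay always wins, so conflicting
-- queries are canceled immediately and replay is never held back.
ALTER SYSTEM SET max_standby_streaming_delay = 0;
SELECT pg_reload_conf();

-- Standby 2 (reporting queries): WAL replay waits for conflicting
-- queries for as long as they run, so queries are not canceled but
-- this standby can fall behind the primary.
ALTER SYSTEM SET max_standby_streaming_delay = -1;
SELECT pg_reload_conf();
```

If a standby also restores WAL from an archive, "max_standby_archive_delay" plays the same role for archived WAL. The alternative for the first standby, "hot_standby = off", needs a server restart rather than a reload.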

Yours,
Laurenz Albe

Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances

From

Imran Khan

Date:

30 September, 15:36:48

Hi Ishan,

I assume you have partitioning and the correct types of indexes in place for those tables. Also, has the database grown to 25 TB over many years, or is it only a few years old? Parameter tuning can help but won't be a permanent solution. I believe having multiple replicas makes sense at this point.

Thanks,

Imran

Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances

From

Ishan joshi

Date:

30 September, 18:28:48

Hi Laurenz,

Thanks for your explanation. Having another replica instance makes sense, but in our case it is not possible to add another replica given the huge database size.

We will evaluate the impact of allowing more replay lag and act accordingly.
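
One way to observe that impact is to run queries along these lines on the standby. This is a sketch built on standard PostgreSQL statistics views and functions, not something prescribed in this thread.

```sql
-- Run on the standby.
-- Queries canceled so far, per database and per conflict type.
SELECT datname,
       confl_snapshot,
       confl_lock,
       confl_bufferpin,
       confl_tablespace,
       confl_deadlock
FROM pg_stat_database_conflicts
ORDER BY datname;

-- Approximate replay delay: time since the last replayed transaction
-- was committed on the primary. This overstates the delay whenever the
-- primary is idle.
SELECT now() - pg_last_xact_replay_timestamp() AS replay_delay;

-- WAL that has been received but not yet replayed, in bytes.
SELECT pg_wal_lsn_diff(pg_last_wal_receive_lsn(),
                       pg_last_wal_replay_lsn()) AS replay_backlog_bytes;
```

Comparing the conflict counters and the replay delay before and after a parameter change shows which side of the trade-off has moved.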

Thanks & Regards,

Ishan Joshi

Re: canceling/terminating statement due to conflict with recovery in Replica/DR instances

From

Ishan joshi

Date:

30 September, 18:32:27

Hi Imran,

Thanks for your reply.

We migrated this 25 TB database from Oracle to Postgres. Because the storage requirement is huge, we are not in a position to create a new replica instance/cluster.

Yes, I also believe that tuning the parameters is not a long-term solution, but we will check the impact and validate it.

Thanks & Regards,

Ishan Joshi
