Thread: Re: [ADMIN] Replication slots and isolation levels
On 27 Oct 2015, at 19:45, Vladimir Borodin <root@simply.name> wrote:

Hi all.

I'm wondering why I get conflicts with recovery on a hot standby when using replication slots and the read committed isolation level, while a repeatable read transaction gives no errors. Below is some diagnostics.
+hackers@
Could anybody explain why this is happening?
I'm using 9.4.4 (an upgrade is planned) from yum.postgresql.org packages on both master and standby. Configs are the same on both master and standby:

rpopdb01d/postgres M # SELECT name, setting FROM pg_settings
WHERE category LIKE 'Replication%' or category LIKE 'Write-Ahead Log';
             name             | setting
------------------------------+---------
 hot_standby                  | on
 hot_standby_feedback         | on
 max_replication_slots        | 1
 max_standby_archive_delay    | 30000
 max_standby_streaming_delay  | 30000
 max_wal_senders              | 10
 synchronous_standby_names    |
 vacuum_defer_cleanup_age     | 200000
 wal_keep_segments            | 64
 wal_receiver_status_interval | 1
 wal_receiver_timeout         | 60000
 wal_sender_timeout           | 3000
(12 rows)

Time: 1.583 ms
rpopdb01d/postgres M #

On the master I've created a physical replication slot and attached the standby to it; I do see the xmin and restart_lsn fields changing in the pg_replication_slots view.

rpopdb01d/postgres M # select * from pg_replication_slots ;
      slot_name       | plugin | slot_type | datoid | database | active |    xmin    | catalog_xmin |  restart_lsn
----------------------+--------+-----------+--------+----------+--------+------------+--------------+---------------
 rpopdb01e_domain_com | [null] | physical  | [null] | [null]   | t      | 2127399287 | [null]       | 960B/415C79C8
(1 row)

Time: 0.463 ms
rpopdb01d/postgres M #

When I start a read committed transaction on the standby (or use autocommit mode, it doesn't matter), I still see the xmin in the pg_replication_slots view on the master increasing.
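[For reference, the slot setup described above can be sketched roughly like this. This is a hedged sketch, not a copy of the actual commands from the cluster; the slot name is taken from the output above, and the standby-side settings assume a 9.4-era recovery.conf.]

```sql
-- On the master: create the physical slot the standby will attach to.
SELECT pg_create_physical_replication_slot('rpopdb01e_domain_com');

-- On the standby (9.4), recovery.conf would then reference the slot:
--   standby_mode      = 'on'
--   primary_conninfo  = 'host=...'            -- connection details elided
--   primary_slot_name = 'rpopdb01e_domain_com'
```

Once the standby connects, the slot shows active = t and its xmin/restart_lsn start moving, as in the pg_replication_slots output above.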
If I run a heavy SELECT statement, at some point (presumably after vacuum_defer_cleanup_age expires) the standby starts to lag in replication apply, and when it hits max_standby_streaming_delay I get SQL code 40001, either as an ERROR or a FATAL:

rpopdb01e/rpopdb R # SHOW transaction_isolation ;
 transaction_isolation
-----------------------
 read committed
(1 row)

Time: 0.324 ms
rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
ERROR:  40001: canceling statement due to conflict with recovery
DETAIL:  User query might have needed to see row versions that must be removed.
LOCATION:  ProcessInterrupts, postgres.c:2990
Time: 199791.339 ms
rpopdb01e/rpopdb R #

rpopdb01e/rpopdb R # SHOW transaction_isolation ;
 transaction_isolation
-----------------------
 read committed
(1 row)

Time: 0.258 ms
rpopdb01e/rpopdb R # BEGIN;
BEGIN
Time: 0.067 ms
rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
FATAL:  40001: terminating connection due to conflict with recovery
DETAIL:  User was holding a relation lock for too long.
LOCATION:  ProcessInterrupts, postgres.c:2857
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
Time: 307864.830 ms
rpopdb01e/rpopdb R #

The behavior is the same as would be expected without replication slots. But when I start a repeatable read transaction, the xmin field in the pg_replication_slots view on the master freezes (while restart_lsn keeps increasing), and I get no replication lag and no conflicts with recovery. When I end this transaction, xmin starts increasing again.

rpopdb01e/rpopdb R # begin transaction isolation level repeatable read;
BEGIN
Time: 0.118 ms
rpopdb01e/rpopdb R # SELECT count(*) FROM big_table;
   count
------------
 3106222429
(1 row)

Time: 411944.889 ms
rpopdb01e/rpopdb R # ROLLBACK;
ROLLBACK
Time: 0.269 ms
rpopdb01e/rpopdb R #

And that is what I expect. Am I missing something, or is this the expected behavior in read committed mode?

Thanks in advance.
On Thu, Oct 29, 2015 at 9:42 AM, Vladimir Borodin wrote:
> I'm wondering why I get conflicts with recovery on hot standby using
> replication slots and the read committed isolation level. And if I start
> a repeatable read transaction I don't get any errors. Below is some
> diagnostics.

In the case of repeatable read the standby will wait before applying
the VACUUM WAL record cleaning up a relation page. Hence you won't get
conflicts in this case.
--
Michael
On 29 Oct 2015, at 13:12, Michael Paquier <michael.paquier@gmail.com> wrote:

>> I'm wondering why I get conflicts with recovery on hot standby using
>> replication slots and the read committed isolation level. And if I start
>> a repeatable read transaction I don't get any errors. Below is some
>> diagnostics.
>
> In the case of repeatable read the standby will wait before applying
> the VACUUM WAL record cleaning up a relation page. Hence you won't get
> conflicts in this case.
Will the standby receive the record but not apply it? Or will the master not vacuum the pages needed by the standby? It seems the latter is happening, because replication lag on the standby does not increase while such a repeatable read transaction is running.
--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
On Thu, Oct 29, 2015 at 11:33 AM, Vladimir Borodin wrote:
> On 29 Oct 2015, at 13:12, Michael Paquier wrote:
>> In the case of repeatable read the standby will wait before applying
>> the VACUUM WAL record cleaning up a relation page. Hence you won't get
>> conflicts in this case.
>
> Will the standby receive the record but not apply it? Or will the master
> not vacuum the pages needed by the standby? It seems the latter is
> happening, because replication lag on the standby does not increase while
> such a repeatable read transaction is running.

The standby will receive the record but not replay it until the REPEATABLE READ transaction that needs those rows commits on the standby. The WAL flush position on the standby continues to move on. This depends of course on max_standby_streaming_delay, which may decide whether or not to force the transaction to cancel if it takes too long. Someone feel free to correct me if I am missing something here.
--
Michael
On 29 Oct 2015, at 14:03, Michael Paquier <michael.paquier@gmail.com> wrote:

> The standby will receive the record but not replay it until the
> REPEATABLE READ transaction that needs those rows commits on the
> standby. The WAL flush position on the standby continues to move on.

By replication lag on the standby I mean exactly replay_location, not flush_location.

> This depends of course on max_standby_streaming_delay, which may decide
> whether or not to force the transaction to cancel if it takes too long.
> Someone feel free to correct me if I am missing something here.
Well, the initial problem is that in read committed mode a heavy SELECT statement hits max_standby_streaming_delay, but in repeatable read mode it doesn't. My question is whether this is expected behavior, and if yes, why is it so?
Thanks for your response!
On Thu, Oct 29, 2015 at 12:35 PM, Vladimir Borodin wrote:
> On 29 Oct 2015, at 14:03, Michael Paquier wrote:
>> The standby will receive the record but not replay it until the
>> REPEATABLE READ transaction that needs those rows commits on the
>> standby. The WAL flush position on the standby continues to move on.
>
> By replication lag on the standby I mean exactly replay_location, not
> flush_location.
>
> Well, the initial problem is that in read committed mode a heavy SELECT
> statement hits max_standby_streaming_delay, but in repeatable read mode
> it doesn't. My question is whether this is expected behavior, and if
> yes, why is it so?

Er, well. If on the master I perform a deletion and then a VACUUM cleanup of a page whose items a REPEATABLE READ transaction on the standby still expects to see until it commits, the startup process puts itself in a waiting state when trying to replay the cleanup record, and replay_location does not move on. Still, the WAL receiver gets WAL in parallel, so it continues to flush things and flush_location progresses. With a READ COMMITTED transaction running on the standby, the transaction considers as visible stuff that has been committed, so WAL replay can move on, and indeed there is a risk of facing a recovery conflict. So this behavior as-is is correct, based on how isolation levels should behave when a node performs recovery.
--
Michael
On 29 Oct 2015, at 15:29, Michael Paquier <michael.paquier@gmail.com> wrote:

> Er, well. If on the master I perform a deletion and then a VACUUM
> cleanup of a page whose items a REPEATABLE READ transaction on the
> standby still expects to see until it commits, the startup process puts
> itself in a waiting state when trying to replay the cleanup record, and
> replay_location does not move on. Still, the WAL receiver gets WAL in
> parallel, so it continues to flush things and flush_location
> progresses. With a READ COMMITTED transaction running on the standby,
> the transaction considers as visible stuff that has been committed, so
> WAL replay can move on, and indeed there is a risk of facing a recovery
> conflict. So this behavior as-is is correct, based on how isolation
> levels should behave when a node performs recovery.
Everything you describe is exactly true for setups without replication slots. And the ability to run heavy SELECT statements on a hot standby without replication lag and recovery conflicts was the reason I tried to use them. The documentation [0] directly says that "Replication slots provide an automated way to ensure ... that the master does not remove rows which could cause a recovery conflict even when the standby is disconnected". My question is why this holds for REPEATABLE READ transactions but doesn't work for READ COMMITTED queries. "Even when the standby is disconnected" seems a much stronger guarantee, so READ COMMITTED should work fine, but it doesn't.
If I understand correctly, with hot_standby_feedback = on the standby tells the master the xmin of the earliest transaction on the standby, and the autovacuum worker on the master takes it into account when doing vacuum cleanup (it can see it in the pg_replication_slots view), AFAIK via the GetOldestXmin function. But I don't understand why with READ COMMITTED transactions the xmin in the pg_replication_slots view on the master continues to increase, while with REPEATABLE READ the xmin freezes until the transaction finishes.
On 29 Oct 2015, at 14:39, Vladimir Borodin <root@simply.name> wrote:

> If I understand correctly, with hot_standby_feedback = on the standby
> tells the master the xmin of the earliest transaction on the standby,
> and the autovacuum worker on the master takes it into account when doing
> vacuum cleanup (it can see it in the pg_replication_slots view), AFAIK
> via the GetOldestXmin function. But I don't understand why with READ
> COMMITTED transactions the xmin in the pg_replication_slots view on the
> master continues to increase, while with REPEATABLE READ the xmin
> freezes until the transaction finishes.
Could it be a consequence of how REPEATABLE READ transactions handle snapshots? With REPEATABLE READ the snapshot is acquired only once at the beginning of a transaction; a READ COMMITTED transaction re-evaluates its snapshot with each new command.
--
Oleksii
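[One way to observe the snapshot difference Oleksii describes, as a sketch not taken from the thread: watch backend_xmin in pg_stat_activity, which was added in 9.4. The table name is the illustrative one used earlier.]

```sql
-- Session 1: a REPEATABLE READ transaction pins its snapshot at the
-- first statement, so the xmin it holds back stays constant.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM big_table;  -- snapshot taken here, held until commit/rollback

-- Session 2: inspect the horizon each backend is holding back.
SELECT pid, state, backend_xmin FROM pg_stat_activity;
-- Under READ COMMITTED, a backend's backend_xmin advances between
-- statements; under REPEATABLE READ it stays pinned for the whole
-- transaction, which is what the frozen slot xmin reflects.
```

With hot_standby_feedback the standby forwards its oldest backend_xmin to the master, which is why a standby-side REPEATABLE READ transaction freezes the slot's xmin there.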
On Thu, Oct 29, 2015 at 3:29 PM, Oleksii Kliukin <alexk@hintbits.com> wrote:
> Could it be a consequence of how REPEATABLE READ transactions handle
> snapshots? With REPEATABLE READ the snapshot is acquired only once at the
> beginning of a transaction; a READ COMMITTED transaction re-evaluates its
> snapshot with each new command.

I bet that's exactly it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 30 Oct 2015, at 14:30, Robert Haas <robertmhaas@gmail.com> wrote:

>> Could it be a consequence of how REPEATABLE READ transactions handle
>> snapshots? With REPEATABLE READ the snapshot is acquired only once at the
>> beginning of a transaction; a READ COMMITTED transaction re-evaluates its
>> snapshot with each new command.
>
> I bet that's exactly it.
I still don't fully understand why that is so (the problem occurs while running only one SELECT statement in READ COMMITTED, so only one snapshot is taken), but if this is expected behavior, shouldn't the documentation mention that with READ COMMITTED (which is the default) you may still get conflicts with recovery while using replication slots?
On Fri, Oct 30, 2015 at 12:40 PM, Vladimir Borodin wrote:
> On Thu, Oct 29, 2015 at 3:29 PM, Oleksii Kliukin wrote:
>> Could it be a consequence of how REPEATABLE READ transactions handle
>> snapshots? With REPEATABLE READ the snapshot is acquired only once at the
>> beginning of a transaction; a READ COMMITTED transaction re-evaluates its
>> snapshot with each new command.
>
> I still don't fully understand why that is so (the problem occurs while
> running only one SELECT statement in READ COMMITTED, so only one snapshot
> is taken), but if this is expected behavior, shouldn't the documentation
> mention that with READ COMMITTED (which is the default) you may still get
> conflicts with recovery while using replication slots?

Replication slots and hot_standby_feedback are two different unrelated concepts, slots being aimed at retaining WAL. I guess that's the origin of your confusion:
http://www.postgresql.org/message-id/20150616192141.GD2626@alap3.anarazel.de
--
Michael
On 2015-10-30 13:42:19 +0100, Michael Paquier wrote:
> Replication slots and hot_standby_feedback are two different unrelated
> concepts, slots being aimed at retaining WAL.

Uh. Slots also retain the xmin horizon if hot_standby_feedback is enabled on the standby?

> I guess that's the origin of your confusion:
> http://www.postgresql.org/message-id/20150616192141.GD2626@alap3.anarazel.de

That just says what I said above; I don't see how this makes replication slots and hs feedback unrelated?

Greetings,

Andres Freund
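[The interaction Andres points out can be checked directly on the master. A sketch, using the slot name from earlier in the thread:]

```sql
-- With hot_standby_feedback = on, the standby's oldest-transaction
-- horizon is reported back and shows up as the slot's xmin; with
-- feedback off, the xmin column stays NULL and only restart_lsn
-- (WAL retention) is maintained.
SELECT slot_name, active, xmin, restart_lsn
FROM pg_replication_slots
WHERE slot_name = 'rpopdb01e_domain_com';
```

So the slot itself retains WAL, while the xmin it carries comes from the feedback mechanism; the two features combine through the slot.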
On Fri, Oct 30, 2015 at 12:40 PM, Vladimir Borodin <root@simply.name> wrote:
> I still don't fully understand why that is so (the problem occurs while
> running only one SELECT statement in READ COMMITTED, so only one snapshot
> is taken), but if this is expected behavior, shouldn't the documentation
> mention that with READ COMMITTED (which is the default) you may still get
> conflicts with recovery while using replication slots?

Are you doing BEGIN / one or more SELECT statements / END? Or just a bare SELECT with no explicit transaction control?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 30 Oct 2015, at 16:04, Robert Haas <robertmhaas@gmail.com> wrote:

> Are you doing BEGIN / one or more SELECT statements / END?
> Or just a bare SELECT with no explicit transaction control?
I've tried it two ways: a bare SELECT in autocommit mode, and BEGIN; SELECT; ROLLBACK. I first described the problem in a thread on pgsql-admin@ [0]; there is a copy-paste from psql there, but the initial description was lost during the conversation.
On Fri, Oct 30, 2015 at 9:49 AM, Vladimir Borodin <root@simply.name> wrote:
> I've tried it two ways: a bare SELECT in autocommit mode, and BEGIN;
> SELECT; ROLLBACK. I first described the problem in a thread on
> pgsql-admin@ [0]; there is a copy-paste from psql there, but the initial
> description was lost during the conversation.
>
> [0]
> http://www.postgresql.org/message-id/7F74C5EA-6741-44FC-B6C6-E96F18D761FB@simply.name

Hmm. That behavior seems unexpected to me, but I might be missing something.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2 Nov 2015, at 23:37, Robert Haas <robertmhaas@gmail.com> wrote:

> Hmm. That behavior seems unexpected to me, but I might be missing something.
Me too; that's why I started the thread. One small detail that might matter is that the big table being queried is partitioned into 64 inherited tables. Now I'm trying to write a simple script to reproduce the problem, but that is not so easy, because AFAIK the VACUUM on the master should happen while a single query on the standby is running, and it should vacuum rows that have not yet been accessed by that query.
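[The reproduction described above could be sketched roughly as follows. This is a hypothetical sketch: the table and the id predicate are made up; the point is only that the cleanup on the master must hit rows the standby scan has not reached yet, while the scan is still running.]

```sql
-- Session A, on the standby: start a long READ COMMITTED scan.
SELECT count(*) FROM big_table;

-- Session B, on the master, while the scan above is still running:
-- remove rows the standby scan has not reached yet, then vacuum them
-- so the cleanup record ships to the standby mid-scan.
DELETE FROM big_table WHERE id > 1000000000;  -- hypothetical predicate
VACUUM big_table;
```

If the standby's startup process then cancels the scan after max_standby_streaming_delay, the conflict is reproduced.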
On 2015-11-02 15:37:57 -0500, Robert Haas wrote:
> Hmm. That behavior seems unexpected to me, but I might be missing something.

The conflict is because of a relation lock, not because of visibility. Hot-standby feedback changes nothing about that.

I presume all the other conflicts are also because of relation-level locks? Check pg_stat_database_conflicts and the server logs to verify.

Andres
On 3 Nov 2015, at 11:38, Andres Freund <andres@anarazel.de> wrote:

> The conflict is because of a relation lock, not because of visibility.
> Hot-standby feedback changes nothing about that.
>
> I presume all the other conflicts are also because of relation-level
> locks? Check pg_stat_database_conflicts and the server logs to verify.
Oh, good point, thank you, that gives the answer. Actually, I've already done a switchover in this cluster, so pg_stat_database_conflicts has started from scratch :( But the logs haven't been rotated yet:
root@rpopdb01e ~ # fgrep -e 562f9ef0.23df,6 -e 562fa107.451a -e 562fa1d9.5146 -e 562f9ef0.23df,10 -e 562fa259.56d1 /var/lib/pgsql/9.4/data/pg_log/postgresql-2015-10-27_185736.csv
2015-10-27 19:06:28.656 MSK,,,9183,,562f9ef0.23df,6,,2015-10-27 18:57:36 MSK,,0,LOG,00000,"parameter ""hot_standby_feedback"" changed to ""off""",,,,,,,,,""
2015-10-27 19:10:05.039 MSK,"postgres","rpopdb",17690,"[local]",562fa107.451a,1,"",2015-10-27 19:06:31 MSK,12/54563,0,ERROR,40001,"canceling statement due to conflict with recovery","User query might have needed to see row versions that must be removed.",,,,,"select count(*) from rpop.rpop_imap_uidls;",,,"psql"
2015-10-27 19:10:05.995 MSK,"monitor","rpopdb",20806,"localhost:51794",562fa1d9.5146,1,"",2015-10-27 19:10:01 MSK,15/24192,0,ERROR,40001,"canceling statement due to conflict with recovery","User was holding shared buffer pin for too long.",,,,"SQL function ""to_timestamp"" statement 1","select cnt from monitor.bad_rpop_total",,,""
2015-10-27 19:12:06.878 MSK,,,9183,,562f9ef0.23df,10,,2015-10-27 18:57:36 MSK,,0,LOG,00000,"parameter ""hot_standby_feedback"" changed to ""on""",,,,,,,,,""
2015-10-27 19:17:57.056 MSK,"postgres","rpopdb",22225,"[local]",562fa259.56d1,1,"",2015-10-27 19:12:09 MSK,3/35442,0,FATAL,40001,"terminating connection due to conflict with recovery","User was holding a relation lock for too long.",,,,,"select count(*) from rpop.rpop_imap_uidls;",,,"psql"
root@rpopdb01e ~ #
So the FATAL is due to a relation lock, and one ERROR is due to pinned buffers (that one is actually from another user), but there is also one ERROR due to old snapshots (the first ERROR line above). However, I had actually turned off hot_standby_feedback before that first ERROR and turned it on again after it, so the feedback seems to work as expected.
Does it actually mean that I could get such conflicts (due to relation locks, for example) even in repeatable read or serializable mode? In other words, is there any dependency between the transaction isolation level on the standby and conflicts with recovery?
And am I right that the only way to avoid confl_lock conflicts is to increase max_standby_streaming_delay?
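[For completeness, the per-type conflict counters discussed above can be read like this. A sketch; the database name comes from the logs earlier in the thread, and the counters are cumulative since the last stats reset, so a switchover starts them from scratch.]

```sql
-- Recovery-conflict counts per database on the standby.
SELECT datname, confl_snapshot, confl_lock, confl_bufferpin
FROM pg_stat_database_conflicts
WHERE datname = 'rpopdb';
```

confl_snapshot maps to the "row versions that must be removed" ERROR, confl_lock to the relation-lock FATAL, and confl_bufferpin to the buffer-pin ERROR in the log excerpt above.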
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers