Обсуждение: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
Hello, My customer reported a strange connection hang problem. He and I couldn't reproduce it. I haven't been able to understand the cause, but I can think of one hypothesis. Could you give me your opinions on whether my hypothesis is correct, and a direction on how to fix the problem? I'm willing to submit a patch if necessary. [Problem] The customer is using synchronous streaming replication with PostgreSQL 9.2.8. The cluster consists of two nodes. He performed archive recovery test like this: 1. Take a base backup. At that time, some notable settings in postgresql.conf are: synchronous_standby_names = 'node2' autovacuum = on # synchronous_commit is commented out, so it's on by default 2. Some update operations. I don't know what. 3. Shutdown the primary and promote the standby. 4. Shutdown the new primary. 5. Perform archive recovery. That is, restore the base backup, create recovery.conf, and do pg_ctl start. 6. Immediately after the archive recovery is complete, connect to the database server and perform some queries to check user data. The steps 5 and 6 are done in some recovery script. However, the connection attempt in step 6 got stuck for 12 hours, and the test was canceled. The stack trace was: #0 0x0000003f4badf258 in poll () from /lib64/libc.so.6 #1 0x0000000000619b94 in WaitLatchOrSocket () #2 0x0000000000640c4c in SyncRepWaitForLSN () #3 0x0000000000491c18 in RecordTransactionCommit () #4 0x0000000000491d98 in CommitTransaction () #5 0x0000000000493135 in CommitTransactionCommand () #6 0x000000000074938a in InitPostgres () #7 0x000000000066ddd7 in PostgresMain () #8 0x0000000000627d81 in PostmasterMain () #9 0x00000000005c4803 in main () The connection attempt is waiting for a reply from the standby. This is strange, because we didn't anticipate that the connection establishment (and subsequent SELECT queries) would update something and write some WAL. The doc says: http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION "When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the transaction log on disk of both the primary and standby server. ... Read only transactions and transaction rollbacks need not wait for replies from standby servers. Subtransaction commits do not wait for responses from standby servers, only top-level commits." [Hypothesis] Why does the connection processing emit WAL? Probably, it did page-at-a-time vacuum during access to pg_database and pg_authid for client authentication. src/backend/access/heap/README.HOT describes: "Effectively, space reclamation happens during tuple retrieval when the page is nearly full (<10% free) and a buffer cleanup lock can be acquired. This means that UPDATE, DELETE, and SELECT can trigger space reclamation, but often not during INSERT ... VALUES because it does not retrieve a row." But the customer could not reproduce the problem when he performed the same archive recovery from the same base backup again. Why? I guess the autovacuum daemon vacuumed the system catalogs before he attempted to connect to the database. Is this correct? [How to fix] Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c synchronous_standby_names='" to pg_ctl start in the recovery script would prevent the problem. But isn't there anything to fix in PostgreSQL? I think the doc needs improvement so that users won't misunderstand that only write transactions would block at commit. Do you think something else should be done? I guess pg_basebackup, pg_isready, and PQping() called in pg_ctl -w start/restart would block likewise, and I'm afraid users don't anticipate it. pg_upgrade appears to set synchronous_commit to local when starting the database server. Regards MauMau
On Fri, Jul 4, 2014 at 7:29 PM, MauMau <maumau307@gmail.com> wrote:
>
> Hello,
>
> "When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the transaction log on disk of both the primary and standby server.
> ...
> Read only transactions and transaction rollbacks need not wait for replies from standby servers. Subtransaction commits do not wait for responses from standby servers, only top-level commits."
>
>
> [Hypothesis]
> Why does the connection processing emit WAL?
>
> Probably, it did page-at-a-time vacuum during access to pg_database and pg_authid for client authentication. src/backend/access/heap/README.HOT describes:
I agree with your analysis that it can happen during connection
>
> Hello,
>
> "When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the transaction log on disk of both the primary and standby server.
> ...
> Read only transactions and transaction rollbacks need not wait for replies from standby servers. Subtransaction commits do not wait for responses from standby servers, only top-level commits."
>
>
> [Hypothesis]
> Why does the connection processing emit WAL?
>
> Probably, it did page-at-a-time vacuum during access to pg_database and pg_authid for client authentication. src/backend/access/heap/README.HOT describes:
I agree with your analysis that it can happen during connection
attempt.
> But the customer could not reproduce the problem when he performed the same archive recovery from the same base backup again. Why? I guess the autovacuum daemon vacuumed the system catalogs before he attempted to connect to the database.
>
> Is this correct?
One way to confirm could be to perform the archive recovery by
> But the customer could not reproduce the problem when he performed the same archive recovery from the same base backup again. Why? I guess the autovacuum daemon vacuumed the system catalogs before he attempted to connect to the database.
>
> Is this correct?
One way to confirm could be to perform the archive recovery by
disabling autovacuum.
>
> [How to fix]
> Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c synchronous_standby_names='" to pg_ctl start in the recovery script would prevent the problem.
>
> But isn't there anything to fix in PostgreSQL? I think the doc needs improvement so that users won't misunderstand that only write transactions would block at commit.
I also think at the very least we should update docs even if we
>
> [How to fix]
> Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c synchronous_standby_names='" to pg_ctl start in the recovery script would prevent the problem.
>
> But isn't there anything to fix in PostgreSQL? I think the doc needs improvement so that users won't misunderstand that only write transactions would block at commit.
I also think at the very least we should update docs even if we
don't have any solution for this case.
Another thing which I am wondering about is can't the same happen
even for Read Only transaction (incase someone does Select which
prunes the page).
Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
От
Rajeev rastogi
Дата:
On 04 July 2014 19:29, MauMau Wrote: > [How to fix] > Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c > synchronous_standby_names='" to pg_ctl start in the recovery script > would prevent the problem. > > But isn't there anything to fix in PostgreSQL? I think the doc needs > improvement so that users won't misunderstand that only write > transactions would block at commit. As of now there is no solution for this in PostgreSQL but I had submitted a patch "Standalone synchronous master" in 9.4 2014-01 CommitFest, which was rejected because of some issues. This patch was meant to degrade the synchronous level of master, if all synchronous standbys are down. I plan to resubmit this with better design sometime in 9.5. Thanks and Regards, Kumar Rajeev Rastogi
On 2014-07-07 04:20:12 +0000, Rajeev rastogi wrote: > > On 04 July 2014 19:29, MauMau Wrote: > > > [How to fix] > > Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c > > synchronous_standby_names='" to pg_ctl start in the recovery script > > would prevent the problem. > > > > But isn't there anything to fix in PostgreSQL? I think the doc needs > > improvement so that users won't misunderstand that only write > > transactions would block at commit. > > As of now there is no solution for this in PostgreSQL but I had submitted a patch "Standalone synchronous master" in > 9.4 2014-01 CommitFest, which was rejected because of some issues. This patch was meant to degrade the synchronous > level of master, if all synchronous standbys are down. > I plan to resubmit this with better design sometime in 9.5. That seems to be more less orthogonal to the issue at hand. The problem here is that a readonly command lead to a wait. And even worse it was a command the user had no influence over. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hi, On 2014-07-04 22:59:15 +0900, MauMau wrote: > My customer reported a strange connection hang problem. He and I couldn't > reproduce it. I haven't been able to understand the cause, but I can think > of one hypothesis. Could you give me your opinions on whether my hypothesis > is correct, and a direction on how to fix the problem? I'm willing to > submit a patch if necessary. > The connection attempt is waiting for a reply from the standby. This is > strange, because we didn't anticipate that the connection establishment (and > subsequent SELECT queries) would update something and write some WAL. The > doc says: > > http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION > > "When requesting synchronous replication, each commit of a write transaction > will wait until confirmation is received that the commit has been written to > the transaction log on disk of both the primary and standby server. > ... > Read only transactions and transaction rollbacks need not wait for replies > from standby servers. Subtransaction commits do not wait for responses from > standby servers, only top-level commits." > > > [Hypothesis] > Why does the connection processing emit WAL? > > Probably, it did page-at-a-time vacuum during access to pg_database and > pg_authid for client authentication. src/backend/access/heap/README.HOT > describes: > [How to fix] > Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c > synchronous_standby_names='" to pg_ctl start in the recovery script would > prevent the problem. > But isn't there anything to fix in PostgreSQL? I think the doc needs > improvement so that users won't misunderstand that only write transactions > would block at commit. I think we should rework RecordTransactionCommit() to only wait for the standby if `markXidCommitted' and not if `wrote_xlog'. There really isn't a reason to make a readonly transaction's commit wait just because it did some hot pruning. Greetings, Andres Freund
On Mon, Jul 7, 2014 at 4:14 PM, Andres Freund <andres@2ndquadrant.com> wrote: > Hi, > > On 2014-07-04 22:59:15 +0900, MauMau wrote: >> My customer reported a strange connection hang problem. He and I couldn't >> reproduce it. I haven't been able to understand the cause, but I can think >> of one hypothesis. Could you give me your opinions on whether my hypothesis >> is correct, and a direction on how to fix the problem? I'm willing to >> submit a patch if necessary. > >> The connection attempt is waiting for a reply from the standby. This is >> strange, because we didn't anticipate that the connection establishment (and >> subsequent SELECT queries) would update something and write some WAL. The >> doc says: >> >> http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION >> >> "When requesting synchronous replication, each commit of a write transaction >> will wait until confirmation is received that the commit has been written to >> the transaction log on disk of both the primary and standby server. >> ... >> Read only transactions and transaction rollbacks need not wait for replies >> from standby servers. Subtransaction commits do not wait for responses from >> standby servers, only top-level commits." >> >> >> [Hypothesis] >> Why does the connection processing emit WAL? >> >> Probably, it did page-at-a-time vacuum during access to pg_database and >> pg_authid for client authentication. src/backend/access/heap/README.HOT >> describes: > >> [How to fix] >> Of course, adding "-o '-c synchronous_commit=local'" or "-o '-c >> synchronous_standby_names='" to pg_ctl start in the recovery script would >> prevent the problem. > >> But isn't there anything to fix in PostgreSQL? I think the doc needs >> improvement so that users won't misunderstand that only write transactions >> would block at commit. > > I think we should rework RecordTransactionCommit() to only wait for the > standby if `markXidCommitted' and not if `wrote_xlog'. There really > isn't a reason to make a readonly transaction's commit wait just because > it did some hot pruning. Sounds good direction. One question is: Can RecordTransactionCommit() avoid waiting for not only replication but also local WAL flush safely in such read-only transaction case? Regards, -- Fujii Masao
Andres Freund <andres@2ndquadrant.com> writes: > I think we should rework RecordTransactionCommit() to only wait for the > standby if `markXidCommitted' and not if `wrote_xlog'. There really > isn't a reason to make a readonly transaction's commit wait just because > it did some hot pruning. Well, see the comment that explains why the logic is like this now: * If we didn't create XLOG entries, we're done here; otherwise we * should flush those entries the same asa commit record. (An * example of a possible record that wouldn't cause an XID to be * assigned is a sequenceadvance record due to nextval() --- we want * to flush that to disk before reporting commit.) I agree that HOT pruning isn't a reason to make a commit wait, but nextval() is. We could perhaps add more flags that would keep track of which sorts of xlog entries justify a wait at commit, but TBH I'm skeptical of the entire proposition. Having synchronous replication on with no live slave *will* result in arbitrary hangs, and the argument that this particular case should be exempt seems a bit thin to me. The sooner the user realizes he's got a problem, the better. If read-only transactions don't show a problem, the user might not realize he's got one until he starts to wonder why autovac/autoanalyze aren't working. I think a more useful line of thought would be to see if we can't complain more loudly when we have no synchronous standby. Perhaps a "WARNING: waiting forever for lack of a synchronous standby" could be emitted when a transaction starts to wait. regards, tom lane
On 2014-07-07 09:57:20 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > I think we should rework RecordTransactionCommit() to only wait for the > > standby if `markXidCommitted' and not if `wrote_xlog'. There really > > isn't a reason to make a readonly transaction's commit wait just because > > it did some hot pruning. > > Well, see the comment that explains why the logic is like this now: > > * If we didn't create XLOG entries, we're done here; otherwise we > * should flush those entries the same as a commit record. (An > * example of a possible record that wouldn't cause an XID to be > * assigned is a sequence advance record due to nextval() --- we want > * to flush that to disk before reporting commit.) I think we should 'simply' make sequences assign a toplevel xid - then we can get rid of that special case in RecordTransactionCommit(). And I think the performance benefit of not having to wait on XLogFlush() for readonly xacts due to hot prunes far outweighs the decrease due to the xid assignment/commit record. I don't think that nextval()s are called overly much without a later xid assigning statement. > I agree that HOT pruning isn't a reason to make a commit wait, but > nextval() is. Agreed. > We could perhaps add more flags that would keep track of which sorts of > xlog entries justify a wait at commit, but TBH I'm skeptical of the entire > proposition. Having synchronous replication on with no live slave *will* > result in arbitrary hangs, and the argument that this particular case > should be exempt seems a bit thin to me. The sooner the user realizes > he's got a problem, the better. If read-only transactions don't show a > problem, the user might not realize he's got one until he starts to wonder > why autovac/autoanalyze aren't working. Well, the user might just want to log in to diagnose the problem. If he can't even login to see pg_stat_replication it's a pretty screwed up situation. > I think a more useful line of thought would be to see if we can't complain > more loudly when we have no synchronous standby. Perhaps a "WARNING: > waiting forever for lack of a synchronous standby" could be emitted when > a transaction starts to wait. In the OP's case the session wasn't even started - so proper feedback isn't that easy... We could special case that by forcing s_c=off until the session started properly. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-07-07 09:57:20 -0400, Tom Lane wrote: >> Well, see the comment that explains why the logic is like this now: > I think we should 'simply' make sequences assign a toplevel xid - then > we can get rid of that special case in RecordTransactionCommit(). And I > think the performance benefit of not having to wait on XLogFlush() for > readonly xacts due to hot prunes far outweighs the decrease due to the > xid assignment/commit record. I don't think that nextval()s are called > overly much without a later xid assigning statement. Yeah, that could well be true. I'm not sure if there are any other cases where we have non-xid-assigning operations that are considered part of what has to be flushed before reporting commit; if there are not, I'd be okay with changing nextval() this way. >> I think a more useful line of thought would be to see if we can't complain >> more loudly when we have no synchronous standby. Perhaps a "WARNING: >> waiting forever for lack of a synchronous standby" could be emitted when >> a transaction starts to wait. > In the OP's case the session wasn't even started - so proper feedback > isn't that easy... Perhaps I'm wrong, but I think a WARNING emitted here would be seen in psql even though we're still in InitPostgres. If it isn't, we have a problem there anyhow, IMO. > We could special case that by forcing s_c=off until the session started properly. Ugh. regards, tom lane
On Tue, Jul 8, 2014 at 1:06 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andres Freund <andres@2ndquadrant.com> writes: >> On 2014-07-07 09:57:20 -0400, Tom Lane wrote: >>> Well, see the comment that explains why the logic is like this now: > >> I think we should 'simply' make sequences assign a toplevel xid - then >> we can get rid of that special case in RecordTransactionCommit(). And I >> think the performance benefit of not having to wait on XLogFlush() for >> readonly xacts due to hot prunes far outweighs the decrease due to the >> xid assignment/commit record. I don't think that nextval()s are called >> overly much without a later xid assigning statement. > > Yeah, that could well be true. I'm not sure if there are any other cases > where we have non-xid-assigning operations that are considered part of > what has to be flushed before reporting commit; Maybe pg_switch_xlog(). > if there are not, I'd > be okay with changing nextval() this way. +1 Regards, -- Fujii Masao
On 2014-07-07 12:06:14 -0400, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-07-07 09:57:20 -0400, Tom Lane wrote: > >> Well, see the comment that explains why the logic is like this now: > > > I think we should 'simply' make sequences assign a toplevel xid - then > > we can get rid of that special case in RecordTransactionCommit(). And I > > think the performance benefit of not having to wait on XLogFlush() for > > readonly xacts due to hot prunes far outweighs the decrease due to the > > xid assignment/commit record. I don't think that nextval()s are called > > overly much without a later xid assigning statement. > > Yeah, that could well be true. I'm not sure if there are any other cases > where we have non-xid-assigning operations that are considered part of > what has to be flushed before reporting commit; if there are not, I'd > be okay with changing nextval() this way. I'm not aware of any adhoc, but I think to actually change it someone would have to iterate over all wal record types to make sure. > >> I think a more useful line of thought would be to see if we can't complain > >> more loudly when we have no synchronous standby. Perhaps a "WARNING: > >> waiting forever for lack of a synchronous standby" could be emitted when > >> a transaction starts to wait. > > > In the OP's case the session wasn't even started - so proper feedback > > isn't that easy... > > Perhaps I'm wrong, but I think a WARNING emitted here would be seen in > psql even though we're still in InitPostgres. Yes, it is visible. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
From: "Amit Kapila" <amit.kapila16@gmail.com> > On Fri, Jul 4, 2014 at 7:29 PM, MauMau <maumau307@gmail.com> wrote: >> [Hypothesis] >> Why does the connection processing emit WAL? >> >> Probably, it did page-at-a-time vacuum during access to pg_database and > pg_authid for client authentication. src/backend/access/heap/README.HOT > describes: > > I agree with your analysis that it can happen during connection > attempt. Thank you. I'm relieved the cause seems correct. >> But the customer could not reproduce the problem when he performed the > same archive recovery from the same base backup again. Why? I guess the > autovacuum daemon vacuumed the system catalogs before he attempted to > connect to the database. >> > One way to confirm could be to perform the archive recovery by > disabling autovacuum. Yeah, I thought of that too. Unfortunately, the customer deleted the the base backup for testing. > Another thing which I am wondering about is can't the same happen > even for Read Only transaction (incase someone does Select which > prunes the page). I'm afraid about that, too. Regards MauMau
From: "Rajeev rastogi" <rajeev.rastogi@huawei.com> As of now there is no solution for this in PostgreSQL but I had submitted a patch "Standalone synchronous master" in 9.4 2014-01 CommitFest, which was rejected because of some issues. This patch was meant to degrade the synchronous level of master, if all synchronous standbys are down. I plan to resubmit this with better design sometime in 9.5. Although I only read some mails of that thread, I'm sure your proposal is what many people would appreciate. Your new operation mode is equivalent to the maximum availability mode of Oracle Data Guard, isn't it? I'm looking forward to it. Good luck. ================================================== Maximum availability This protection mode provides the highest level of data protection that is possible without compromising the availability of a primary database. Transactions do not commit until all redo data needed to recover those transactions has been written to the online redo log and to at least one standby database. If the primary database cannot write its redo stream to at least one standby database, it effectively switches to maximum performance mode to preserve primary database availability and operates in that mode until it is again able to write its redo stream to a standby database. This protection mode ensures zero data loss except in the case of certain double faults, such as failure of a primary database after failure of the standby database. Maximum performance This is the default protection mode. It provides the highest level of data protection that is possible without affecting the performance of a primary database. This is accomplished by allowing transactions to commit as soon as all redo data generated by those transactions has been written to the online log. Redo data is also written to one or more standby databases, but this is done asynchronously with respect to transaction commitment, so primary database performance is unaffected by delays in writing redo data to the standby database(s). This protection mode offers slightly less data protection than maximum availability mode and has minimal impact on primary database performance. ================================================== Regards MauMau
From: "Tom Lane" <tgl@sss.pgh.pa.us> > problem, the user might not realize he's got one until he starts to wonder > why autovac/autoanalyze aren't working. In autovacuum.c, autovacuum workers avoid waiting for the standby by doing: /* * Force synchronous replication off to allow regular maintenance even if * we are waiting for standbys to connect. Thisis important to ensure we * aren't blocked from performing anti-wraparound tasks. */if (synchronous_commit > SYNCHRONOUS_COMMIT_LOCAL_FLUSH)SetConfigOption("synchronous_commit", "local", PGC_SUSET, PGC_S_OVERRIDE); Regards MauMau
From: "Tom Lane" <tgl@sss.pgh.pa.us> > Andres Freund <andres@2ndquadrant.com> writes: >> On 2014-07-07 09:57:20 -0400, Tom Lane wrote: >>> Well, see the comment that explains why the logic is like this now: > >> I think we should 'simply' make sequences assign a toplevel xid - then >> we can get rid of that special case in RecordTransactionCommit(). And I >> think the performance benefit of not having to wait on XLogFlush() for >> readonly xacts due to hot prunes far outweighs the decrease due to the >> xid assignment/commit record. I don't think that nextval()s are called >> overly much without a later xid assigning statement. > > Yeah, that could well be true. I'm not sure if there are any other cases > where we have non-xid-assigning operations that are considered part of > what has to be flushed before reporting commit; if there are not, I'd > be okay with changing nextval() this way. Thank you all for letting me know your thoughts. I understood and agree that read-only transactions, including the connection establishment one, should not wait for the standby nor the XLOG flush at commit, and the current documentation/specification should not be changed. I'll consider how to fix this problem, learning the code, then I'll ask for review. I'd like to submit the patch for next CF if possible. From: "Fujii Masao" <masao.fujii@gmail.com> > Sounds good direction. One question is: Can RecordTransactionCommit() > avoid > waiting for not only replication but also local WAL flush safely in > such read-only > transaction case? I'd appreciate any opinion on this, too. Regards MauMau
Re: [RFC: bug fix?] Connection attempt block forever when the synchronous standby is not running
От
Alvaro Herrera
Дата:
FWIW a fix for this has been posted to all active branches: Author: Andres Freund <andres@anarazel.de> Branch: master [fd6a3f3ad] 2015-02-26 12:50:07 +0100 Branch: REL9_4_STABLE [d72115112] 2015-02-26 12:50:07 +0100 Branch: REL9_3_STABLE [abce8dc7d] 2015-02-26 12:50:07 +0100 Branch: REL9_2_STABLE [d67076529] 2015-02-26 12:50:07 +0100 Branch: REL9_1_STABLE [5c8dabecd] 2015-02-26 12:50:08 +0100 Branch: REL9_0_STABLE [82e0d6eb5] 2015-02-26 12:50:08 +0100 Reconsider when to wait for WAL flushes/syncrep during commit. Up to now RecordTransactionCommit() waited for WALto be flushed (if synchronous_commit != off) and to be synchronously replicated (if enabled), even if a transactiondid not have a xid assigned. The primary reason for that is that sequence's nextval() did not assign a xid,but are worthwhile to wait for on commit. This can be problematic because sometimes read only transactions do write WAL, e.g. HOT page prune records. That then could lead to read only transactions having to wait during commit.Not something people expect in a read only transaction. This lead to such strange symptoms as backends beingseemingly stuck during connection establishment when all synchronous replicas are down. Especially annoying whensaid stuck connection is the standby trying to reconnect to allow syncrep again... This behavior also is involvedin a rather complicated <= 9.4 bug where the transaction started by catchup interrupt processing waited for syncrepusing latches, but didn't get the wakeup because it was already running inside the same overloaded signal handler.Fix the issue here doesn't properly solve that issue, merely papers over the problems. In 9.5 catchup interruptsaren't processed out of signal handlers anymore. To fix all this, make nextval() acquire a top level xid,and only wait for transaction commit if a transaction both acquired a xid and emitted WAL records. If only a xidhas been assigned we don't uselessly want to wait just because of writes to temporary/unlogged tables; if only WAL has been written we don't want to wait just because of HOT prunes. The xid assignment in nextval() is unlikely to causeoverhead in real-world workloads. For one it only happens SEQ_LOG_VALS/32 values anyway, for another only usageof nextval() without using the result in an insert or similar is affected. Discussion: 20150223165359.GF30784@awork2.anarazel.de, 369698E947874884A77849D8FE3680C2@maumau, 5CF4ABBA67674088B3941894E22A0D25@maumau Per complaint from maumau and Thom Brown Backpatch all the way back; 9.0doesn't have syncrep, but it seems better to be consistent behavior across all maintained branches. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services