Re: BUG #13660: serializable snapshotting hangs
От | Kevin Grittner |
---|---|
Тема | Re: BUG #13660: serializable snapshotting hangs |
Дата | |
Msg-id | 1379192640.1267648.1444167495623.JavaMail.yahoo@mail.yahoo.com обсуждение исходный текст |
Ответ на | BUG #13660: serializable snapshotting hangs (cpacejo@clearskydata.com) |
Ответы |
Re: BUG #13660: serializable snapshotting hangs
|
Список | pgsql-bugs |
On Thursday, October 1, 2015 9:19 PM, "cpacejo@clearskydata.com" <cpacejo@c= learskydata.com> wrote: > PostgreSQL version: 9.4.4 > After running fine for weeks, we now find that serializable snapshots han= g: > > our_db=3D> START TRANSACTION ISOLATION LEVEL SERIALIZABLE, READ ONLY, DEF= ERRABLE; > START TRANSACTION > our_db=3D> SELECT pg_export_snapshot(); > (...hangs indefinitely...) > Killing all backends (i.e. including those accessing other databases) > unblocked serializable snapshotting. > > Is this expected behavior? What should be happening is that when the serializable read only deferrable transaction executes the first statement that needs a snapshot, it grabs a snapshot and looks for concurrent read write transactions.=C2=A0 If there are none, the snapshot is safe and it goes ahead with execution; otherwise it goes into a loop where it delays (to avoid sucking 100% CPU for a core) by calling ProcWaitForSignal().=C2=A0 Each time that function returns it checks whether any of the concurrent read write transactions developed a read-write conflict out (i.e., it read data which was modified by a transaction concurrent to *it*) to a transaction which committed before this read only transaction got its snapshot.=C2=A0 If so, this snapshot is unsafe and we need to discard it, get a new snapshot, and go back into the loop.=C2=A0 If all read write transactions concurrent to the read only transaction complete without developing such a conflict our snapshot is safe and we can proceed to run without taking predicate locks or risking a serialization failure. Now, that's pretty technical, and you definitely don't need to follow all of the above to safely use deferrable transactions.=C2=A0 The short version is that your initial statement that needs a snapshot cannot begin to execute *at least* until any concurrent transactions which are not READ ONLY complete.=C2=A0 One connection stuck "idle in transaction" can hold it up indefinitely.=C2=A0 A prepared transaction also counts as a transaction which has not yet completed. So if you had any long-running READ WRITE transactions (including one stuck "idle in transaction" or prepared but not committed), it is expected behavior.=C2=A0 If not, we should try to gather more information to figure out what's going on. One other possibility is that your workload has changed such that there are now so many read write transactions reading data modified by other transactions that it is just not finding a safe snapshot very quickly.=C2=A0 Even when running benchmarks at saturation levels on a 16 core machine we rarely saw a delay more than six seconds, but there is no upper bound on how long it might take.=C2=A0 In such a case you would have a few options, but let's not get ahead of ourselves -- the first thing is to check for "idle in transaction" or prepared transaction issues. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-bugs по дате отправления: