Обсуждение: A third lock method
I've been reviewing code to get a better handle on the scope of changes to support serializable transactions, in preparation for next month's meetings with our CIO. My posts should start getting progressively less hand-wavy. :-) I've come to a few conclusions: (1) The notions of having multiple serializable implementations (SSI, S2PL, OCC) which can be mapped as a configuration option is really not worth it. The cases where S2PL or OCC beat SSI are too narrow to be worth the effort, and the pluggable approach seems like it would be much more invasive and destabilizing than just picking one and doing it more directly. (2) If we're going with SSI, it appears that it would be a very good idea to define a third lock method (SIREAD_LOCKMETHOD perhaps) for the SIREAD locks. For one thing, that could keep them out of the way of normal conflict detection (they don't conflict with anything, per se) and out of the way of deadlock detection, including rearrangement of waiting transactions. For another, they have a different life-cycle -- they must stick around (along with some minimal transaction information) until all transactions with a snapshot prior to their commit have completed. That seems cleaner and easier with a separate lock method. Thoughts? -Kevin
Kevin Grittner wrote: > I've been reviewing code to get a better handle on the scope of > changes to support serializable transactions, in preparation for > next month's meetings with our CIO. My posts should start getting > progressively less hand-wavy. :-) > > I've come to a few conclusions: > > (1) The notions of having multiple serializable implementations > (SSI, S2PL, OCC) which can be mapped as a configuration option is > really not worth it. The cases where S2PL or OCC beat SSI are too > narrow to be worth the effort, and the pluggable approach seems like > it would be much more invasive and destabilizing than just picking > one and doing it more directly. Agreed. > (2) If we're going with SSI, it appears that it would be a very > good idea to define a third lock method (SIREAD_LOCKMETHOD perhaps) > for the SIREAD locks. For one thing, that could keep them out of > the way of normal conflict detection (they don't conflict with > anything, per se) and out of the way of deadlock detection, > including rearrangement of waiting transactions. For another, they > have a different life-cycle -- they must stick around (along with > some minimal transaction information) until all transactions with a > snapshot prior to their commit have completed. That seems cleaner > and easier with a separate lock method. I must be missing something but I thought the only problem with our existing snapshot system was that you could see a row updated after your snapshot was created, and that the solution to that was to abort the transaction that would see the new row. Can you tell me what I am missing? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > I must be missing something but I thought the only problem with our > existing snapshot system was that you could see a row updated after > your snapshot was created, and that the solution to that was to > abort the transaction that would see the new row. Can you tell me > what I am missing? Well, that's roughly on target as a 30,000 foot overview, although I think you're also working in an issue from the read committed implementation. The issue with read committed is that you start with one snapshot but may wind up combining views of data from that initial snapshot with one or more updated versions of rows from subsequent commits -- if you blocked on a conflicting write which was subsequently committed. I recently started a thread which drifted into a discussion of that issue, but it's almost completely orthogonal to the issue of implementing truly serializable transactions -- the overlap is that a fix for the read committed anomalies might provide groundwork for an optimization to the serializable implementation. After re-reading the thread on the read committed issue and pondering it a bit more, I'm inclined to think that we should worry about that if and when serializable is working and we actually see problems that the aforementioned optimization might fix. (Unless someone cares enough about the read committed anomalies to want to champion that as a separate issue.) What I'm trying to stay focused on is the serializable implementation. I'd settle for the traditional Strict 2 Phase Locking (S2PL) approach, but since there is a new technique which appears to perform much better than that for most workloads, and has most of the benefits of snapshot isolation (reads don't block writes and vice versa), I figure why not go for the gold. You are right in that the new technique will basically work like snapshot isolation but will roll back a transaction when there is one transaction which reads data which is modified by a concurrent transaction and writes data which is read by a concurrent transaction (assuming all are serializable). Years of research by Dr Alan Fekete et al has established that only when there is a cycle of such read-write dependencies among transactions, the graph contains a "pivot" transaction which has dependencies in both directions (to the same or different transactions), and a transaction which is on the write end of such an edge commits first, can snapshot isolation mode show behavior inconsistent with fully serializable behavior. The SSI technique doesn't attempt to build and analyze a complete graph of such dependencies because the overhead would be prohibitive; rather, it looks for the "pivot" which must be present in such a graph. The SSI technique requires that serializable transactions (but not transactions in other isolation levels) use a non-blocking SIREAD lock to track reads, so that the read-write dependencies among concurrent serializable transactions can be identified. The hard part is that this does require some implementation of predicate locking to prevent problems with phantom rows. Honestly, that is probably going to be over half the battle. I had been nibbling around the edges of the serializable issue trying to address optimizations which should help performance of serializable transactions by reducing the rollback rate; but came to the conclusion that that was all premature optimization and a distraction. I think the thing to do is start with the hard part, so I've been reading up on practical techniques for implementing predicate locking, and reading through the PostgreSQL locking code, trying to get my head around this. Once this hardest part is solved, I really think that a workable serializable mode is not too far past it, and *then* would be the time to look at optimizations. It's just a little bit of a stretch to call SILOCKs locks, because they don't actually block anything. They are used at various points to see where a transaction is reading data which has been modified by another transaction or vice versa. And they do need to be kept until all concurrent transactions have completed. Other than those quirks, they behave pretty much like read locks, though, so it seems to make sense to use the locking system for them. The differences are such that I thought a new lock method might be appropriate. This thread is to try to solicit opinions on whether that makes sense to anyone but me. :-) Once I sort out the subject issue, I'm about ready to try to start generating a very rough prototype of predicate locking. I don't want to start a discussion of those details on this thread, because it seems to me that a decision on the subject issue affects significant details about how I go about that. Consider this the 5,000 foot overview. Previous thread contain much more, and references to more authoritative documents. This is not a small effort. I'll be surprised if it comes in at less than 2,000 lines of code, plus documentation and tests. I'm not really expecting it to make the next release after 8.5. With some luck, maybe the next release after that. And that's all conditional on our CIO approving programming time for the effort. I'm currently trying to dig in far enough to provide more accurate estimates and time lines to facilitate approval. Hopefully I haven't just scared you off of your offer to help. ;-) -Kevin
Kevin Grittner wrote: > Once I sort out the subject issue, I'm about ready to try to start > generating a very rough prototype of predicate locking. I don't want > to start a discussion of those details on this thread, because it > seems to me that a decision on the subject issue affects significant > details about how I go about that. > > Consider this the 5,000 foot overview. Previous thread contain much > more, and references to more authoritative documents. > > This is not a small effort. I'll be surprised if it comes in at less > than 2,000 lines of code, plus documentation and tests. I'm not > really expecting it to make the next release after 8.5. With some > luck, maybe the next release after that. And that's all conditional > on our CIO approving programming time for the effort. I'm currently > trying to dig in far enough to provide more accurate estimates and > time lines to facilitate approval. > > Hopefully I haven't just scared you off of your offer to help. ;-) Thanks, I do understand it better now. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Bruce Momjian wrote: > I must be missing something but I thought the only problem with our > existing snapshot system was that you could see a row updated after your > snapshot was created, and that the solution to that was to abort the > transaction that would see the new row. Can you tell me what I am > missing? But with "snapshot isolation" (what our "serializable" corresponds to) you cannot see rows updated after snapshot creation, right? So phantom reads cannot occur, but we still are not truly serializable. See the example I concocted in http://archives.postgresql.org/pgsql-hackers/2009-05/msg00316.php for illustration. Yours, Laurenz Albe PS: Different from what Kevin claimed, Oracle also cannot grant you strictly serializable transactions, because they also use snapshot isolation. Seems that they get away with it. My feeling is that the cases where this would be a problem are pretty rare; my example referenced above feels artificial for a good reason. If we can do it better than Oracle, I'm not against it :^)
2009/12/31 Bruce Momjian <bruce@momjian.us>: > I must be missing something but I thought the only problem with our > existing snapshot system was that you could see a row updated after your > snapshot was created, and that the solution to that was to abort the > transaction that would see the new row. Can you tell me what I am > missing? The problem is rather the opposite. A minimal example of a situation that the current implementation allows, and which the new proposal tries to fix is: 1. The database contains rows X and Y having one column, and having different values for that column (i.e., X != Y). 2. "Serializable" (in the current PG sense) transactions A and B run concurrently (i.e., both take their snapshot before the other commits, so they don't see each other's changes). 3. Y := X; A reads X and updates Y to become the same as X. 4. X := Y; B reads Y and updates X to become the same as Y. Result: Sequentially executing A and B in either order leads to a result where X = Y. Still, after the above steps 1-4, the values of X and Y are switched around (and thus X != Y). Therefore, the execution was (by definition) not serializable. This is caused by the fact that in a serializable execution either A would have seen the update performed by B, or B would have seen the update performed by A. This problem is called "write skew" in the paper (their example is less theoretical, but also more complex because of the use of COUNT(..).) So instead of aborting transactions "because otherwise they would see too many changes", the goal is rather to abort transactions "because otherwise they wouldn't have seen enough changes". The SIREAD locks are used to mark "the versions that have been read by whom" (for all transactions that were concurrent with any of the active transactions), so that potentially problematic writes that occur after reads can be detected: "I wrote a new version of something that was already read by a concurrent transaction, so in any serialization, I must come after that other transaction". The other direction ("I read something that has a newer version than what I just read, so in any serialization, I must come before that other transaction") can be detected straightforwardly. Nicolas
"Albe Laurenz" wrote: > See the example I concocted in > http://archives.postgresql.org/pgsql-hackers/2009-05/msg00316.php Sure, let's look at that example. Of course, *any* transaction run by itself won't show differences from true serializable behavior *regardless* of the mode in which it runs -- because it actually was serialized. Let's see how your example might work if the function was being run on two different backends at the same time with different personid values. Connection 1: ========== [Currently no highlander; the function does this for personid = 1] SELECT ishighlander INTO b FROM scots WHERE id=personid; IF b THEN RETURN; /* no need to do anything */ END IF; UPDATE scots SET ishighlander=TRUE WHERE id=personid; SELECT count(*) INTO n FROM scots WHERE ishighlander; IF (n > 1) THEN RAISE EXCEPTION 'There can be only one'; END IF; [Connection 1 now sees a highlander; not yet committed] Connection 2: =========== [Currently no highlander according to this snapshot] [the function does exactly the same thing as on Connection 1,but for personid 2] [It doesn't see the work of Connection 1,so it's count shows the update is OK] Now they commit, in either order. You now have two highlanders in the database. You have just demonstrated another case of write skew, where snapshot isolation does not behave in a truly serializable fashion, allowing constraints enforced in application software or functions (including triggers) to be violated. With the changes I'm working on, one of these would be rolled back with a serialization error. > PS: Different from what Kevin claimed, Oracle also cannot grant > you strictly serializable transactions, because they also use > snapshot isolation. Apologies if that is still true. I don't use Oracle and one of the recent articles I recently read seemed to indicate otherwise. Thanks for the correction. -Kevin
"Albe Laurenz" wrote: > But with "snapshot isolation" (what our "serializable" corresponds > to) you cannot see rows updated after snapshot creation, right? > > So phantom reads cannot occur, but we still are not truly > serializable. My previous reply missed your point entirely, didn't it? Let me try again. You are absolutely right that the phantom phantom rows can't pop up during a transaction running at snapshot isolation level. So the "phantom read" problem, per se, is not an issue. The problem with phantoms rows in snapshot isolation is not that they pop up within a concurrent transaction, but that a concurrent transaction does not block on trying to read them (as they would with an S2PL serializable implementation) but will just miss them, even though they might later be (or might already be) committed ahead of the current transaction. They need to be considered in the SSI read-write dependency checks. Perhaps the affect of such inserts (or updates into a selected range) manifest is a different enough way that a new term is merited, but I'm inclined to think not. The conditions in which they become an issue are the same. The techniques for detecting them are the same. These phantoms just appear to the current connection upon commit of its transaction rather than in the middle of it, but either way they cause problems if the current transaction is modifying data based on its view of them. -Kevin
I wrote: > It's just a little bit of a stretch to call SILOCKs locks, because > they don't actually block anything. They are used at various points > to see where a transaction is reading data which has been modified > by another transaction or vice versa. And they do need to be kept > until all concurrent transactions have completed. Other than those > quirks, they behave pretty much like read locks, though, so it > seems to make sense to use the locking system for them. The > differences are such that I thought a new lock method might be > appropriate. This thread is to try to solicit opinions on whether > that makes sense to anyone but me. :-) > > Once I sort out the subject issue, I'm about ready to try to start > generating a very rough prototype of predicate locking. I don't > want to start a discussion of those details on this thread, because > it seems to me that a decision on the subject issue affects > significant details about how I go about that. Based on feedback from Robert Haas on another thread, I think this thread should be considered wrapped. It seems to me like SIREAD locks should be handled by a different lock method, but before I go there I will probably initially develop and test the predicate locking logic by using ACCESS EXCLUSIVE locks for all reads, just to confirm correct coverage of the predicates. Thanks, all. -Kevin