Обсуждение: Remove HeapTuple and Buffer dependency for predicate locking functions
Proposing following changes to make predicate locking and checking
functions generic and remove dependency on HeapTuple and Heap AM. Wemade these changes to help with Zedstore. I think the changes should
help Zheap and other AMs in general.
- Change PredicateLockTuple() to PredicateLockTID(). So, instead of
passing HeapTuple to it, just pass ItemPointer and tuple insert
transaction id if known. This was also discussed earlier in [1],
don't think was done in that context but would be helpful in future
if such requirement comes up as well.
- CheckForSerializableConflictIn() take blocknum instead of
buffer. Currently, the function anyways does nothing with the buffer
just needs blocknum. Also, helps to decouple dependency on buffer as
not all AMs may have one to one notion between blocknum and single
buffer. Like for zedstore, tuple is stored across individual column
buffers. So, wish to have way to lock not physical buffer but
logical blocknum.
- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could be
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after performing
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.
Attaching patch which makes these changes.
This way PredicateLockTID(), CheckForSerializableConflictIn() and
CheckForSerializableConflictOut() functions become usable by any AM.
1] https://www.postgresql.org/message-id/CAEepm%3D2QbqQ_%2BKQQCnhKukF6NEAeq4SqiO3Qxe%2BfHza5-H-jKA%40mail.gmail.com
Вложения
Hi, On 2019-06-24 10:41:06 -0700, Ashwin Agrawal wrote: > Proposing following changes to make predicate locking and checking > functions generic and remove dependency on HeapTuple and Heap AM. We > made these changes to help with Zedstore. I think the changes should > help Zheap and other AMs in general. Indeed. > - Change PredicateLockTuple() to PredicateLockTID(). So, instead of > passing HeapTuple to it, just pass ItemPointer and tuple insert > transaction id if known. This was also discussed earlier in [1], > don't think was done in that context but would be helpful in future > if such requirement comes up as well. Right. > - CheckForSerializableConflictIn() take blocknum instead of > buffer. Currently, the function anyways does nothing with the buffer > just needs blocknum. Also, helps to decouple dependency on buffer as > not all AMs may have one to one notion between blocknum and single > buffer. Like for zedstore, tuple is stored across individual column > buffers. So, wish to have way to lock not physical buffer but > logical blocknum. Hm. I wonder if we somehow ought to generalize the granularity scheme for predicate locks to not be tuple/page/relation. But even if, that's probably a separate patch. > - CheckForSerializableConflictOut() no more takes HeapTuple nor > buffer, instead just takes xid. Push heap specific parts from > CheckForSerializableConflictOut() into its own function > HeapCheckForSerializableConflictOut() which calls > CheckForSerializableConflictOut(). The alternative option could be > CheckForSerializableConflictOut() take callback function and > callback arguments, which gets called if required after performing > prechecks. Though currently I fell AM having its own wrapper to > perform AM specific task and then calling > CheckForSerializableConflictOut() is fine. I think it's right to move the xid handling out of CheckForSerializableConflictOut(). But I think we also ought to move the subtransaction handling out of the function - e.g. zheap doesn't want/need that. > Attaching patch which makes these changes. Please make sure that there's a CF entry for this (I'm in a plane with a super slow connection, otherwise I'd check). Greetings, Andres Freund
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote: > On 2019-06-24 10:41:06 -0700, Ashwin Agrawal wrote: > > Proposing following changes to make predicate locking and checking > > functions generic and remove dependency on HeapTuple and Heap AM. We > > made these changes to help with Zedstore. I think the changes should > > help Zheap and other AMs in general. > > Indeed. +1 > > - Change PredicateLockTuple() to PredicateLockTID(). So, instead of > > passing HeapTuple to it, just pass ItemPointer and tuple insert > > transaction id if known. This was also discussed earlier in [1], > > don't think was done in that context but would be helpful in future > > if such requirement comes up as well. > > Right. +1 > > - CheckForSerializableConflictIn() take blocknum instead of > > buffer. Currently, the function anyways does nothing with the buffer > > just needs blocknum. Also, helps to decouple dependency on buffer as > > not all AMs may have one to one notion between blocknum and single > > buffer. Like for zedstore, tuple is stored across individual column > > buffers. So, wish to have way to lock not physical buffer but > > logical blocknum. > > Hm. I wonder if we somehow ought to generalize the granularity scheme > for predicate locks to not be tuple/page/relation. But even if, that's > probably a separate patch. What do you have in mind? This is certainly the traditional way to do lock hierarchies (archeological fun: we used to have src/backend/storage/lock/multi.c, a "standard multi-level lock manager as per the Gray paper", before commits 3f7fbf85 and e6e9e18e decommissioned it; SSI brought the concept back again in a parallel lock manager), but admittedly it has assumptions about physical storage baked into the naming. Perhaps you just want to give those things different labels, "TID range" instead of page, for the benefit of "logical" TID users? Perhaps you want to permit more levels? That seems premature as long as TIDs are defined in terms of blocks and offsets, so this stuff is reflected all over the source tree... > > - CheckForSerializableConflictOut() no more takes HeapTuple nor > > buffer, instead just takes xid. Push heap specific parts from > > CheckForSerializableConflictOut() into its own function > > HeapCheckForSerializableConflictOut() which calls > > CheckForSerializableConflictOut(). The alternative option could be > > CheckForSerializableConflictOut() take callback function and > > callback arguments, which gets called if required after performing > > prechecks. Though currently I fell AM having its own wrapper to > > perform AM specific task and then calling > > CheckForSerializableConflictOut() is fine. > > I think it's right to move the xid handling out of > CheckForSerializableConflictOut(). But I think we also ought to move the > subtransaction handling out of the function - e.g. zheap doesn't > want/need that. Thoughts on this Ashwin? > > Attaching patch which makes these changes. > > Please make sure that there's a CF entry for this (I'm in a plane with a > super slow connection, otherwise I'd check). This all makes sense, and I'd like to be able to commit this soon. We come up with something nearly identical for zheap. I'm subscribing Kuntal Ghosh to see if he has comments, as he worked on that. The main point of difference seems to be the assumption about how subtransactions work. -- Thomas Munro https://enterprisedb.com
On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote:
> > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> > buffer, instead just takes xid. Push heap specific parts from
> > CheckForSerializableConflictOut() into its own function
> > HeapCheckForSerializableConflictOut() which calls
> > CheckForSerializableConflictOut(). The alternative option could be
> > CheckForSerializableConflictOut() take callback function and
> > callback arguments, which gets called if required after performing
> > prechecks. Though currently I fell AM having its own wrapper to
> > perform AM specific task and then calling
> > CheckForSerializableConflictOut() is fine.
>
> I think it's right to move the xid handling out of
> CheckForSerializableConflictOut(). But I think we also ought to move the
> subtransaction handling out of the function - e.g. zheap doesn't
> want/need that.
Thoughts on this Ashwin?
Hi, On 2019-07-31 09:57:58 +1200, Thomas Munro wrote: > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote: > > Hm. I wonder if we somehow ought to generalize the granularity scheme > > for predicate locks to not be tuple/page/relation. But even if, that's > > probably a separate patch. > > What do you have in mind? My concern is that continuing to inferring the granularity levels from the tid doesn't seem like a great path forward. An AMs use of tids might not necessarily be very amenable to that, if the mapping isn't actually block based. > Perhaps you just want to give those things different labels, "TID > range" instead of page, for the benefit of "logical" TID users? > Perhaps you want to permit more levels? That seems premature as long > as TIDs are defined in terms of blocks and offsets, so this stuff is > reflected all over the source tree... I'm mostly wondering if the different levels shouldn't be computed outside of predicate.c. Greetings, Andres Freund
Hi, On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote: > On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com> wrote: > > > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote: > > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor > > > > buffer, instead just takes xid. Push heap specific parts from > > > > CheckForSerializableConflictOut() into its own function > > > > HeapCheckForSerializableConflictOut() which calls > > > > CheckForSerializableConflictOut(). The alternative option could be > > > > CheckForSerializableConflictOut() take callback function and > > > > callback arguments, which gets called if required after performing > > > > prechecks. Though currently I fell AM having its own wrapper to > > > > perform AM specific task and then calling > > > > CheckForSerializableConflictOut() is fine. > > > > > > I think it's right to move the xid handling out of > > > CheckForSerializableConflictOut(). But I think we also ought to move the > > > subtransaction handling out of the function - e.g. zheap doesn't > > > want/need that. > > > > Thoughts on this Ashwin? > > > > I think the only part its doing for sub-transaction is > SubTransGetTopmostTransaction(xid). If xid passed to this function is > already top most transaction which is case for zheap and zedstore, then > there is no downside to keeping that code here in common place. Well, it's far from a cheap function. It'll do unnecessary on-disk lookups in many cases. I'd call that quite a downside. Greetings, Andres Freund
On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
> On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote:
> > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> > > > buffer, instead just takes xid. Push heap specific parts from
> > > > CheckForSerializableConflictOut() into its own function
> > > > HeapCheckForSerializableConflictOut() which calls
> > > > CheckForSerializableConflictOut(). The alternative option could be
> > > > CheckForSerializableConflictOut() take callback function and
> > > > callback arguments, which gets called if required after performing
> > > > prechecks. Though currently I fell AM having its own wrapper to
> > > > perform AM specific task and then calling
> > > > CheckForSerializableConflictOut() is fine.
> > >
> > > I think it's right to move the xid handling out of
> > > CheckForSerializableConflictOut(). But I think we also ought to move the
> > > subtransaction handling out of the function - e.g. zheap doesn't
> > > want/need that.
> >
> > Thoughts on this Ashwin?
> >
>
> I think the only part its doing for sub-transaction is
> SubTransGetTopmostTransaction(xid). If xid passed to this function is
> already top most transaction which is case for zheap and zedstore, then
> there is no downside to keeping that code here in common place.
Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.
Instead of moving the handling out of the function, how do feel about adding boolean isTopTransactionId argument to function CheckForSerializableConflictOut(). The AMs, which implicitly know, only pass top transaction Id to this function, can pass true and avoid the function call to SubTransGetTopmostTransaction(xid). With this subtransaction code remains in generic place and AMs intending to use it continue to leverage the common code, plus explicitly clarifies the behavior as well.
On Wed, Jul 31, 2019 at 12:37 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <andres@anarazel.de> wrote:Hi,
On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
> On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote:
> > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor
> > > > buffer, instead just takes xid. Push heap specific parts from
> > > > CheckForSerializableConflictOut() into its own function
> > > > HeapCheckForSerializableConflictOut() which calls
> > > > CheckForSerializableConflictOut(). The alternative option could be
> > > > CheckForSerializableConflictOut() take callback function and
> > > > callback arguments, which gets called if required after performing
> > > > prechecks. Though currently I fell AM having its own wrapper to
> > > > perform AM specific task and then calling
> > > > CheckForSerializableConflictOut() is fine.
> > >
> > > I think it's right to move the xid handling out of
> > > CheckForSerializableConflictOut(). But I think we also ought to move the
> > > subtransaction handling out of the function - e.g. zheap doesn't
> > > want/need that.
> >
> > Thoughts on this Ashwin?
> >
>
> I think the only part its doing for sub-transaction is
> SubTransGetTopmostTransaction(xid). If xid passed to this function is
> already top most transaction which is case for zheap and zedstore, then
> there is no downside to keeping that code here in common place.
Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.Okay, agree, its costly function and better to avoid the call if possible.Instead of moving the handling out of the function, how do feel about adding boolean isTopTransactionId argument to function CheckForSerializableConflictOut(). The AMs, which implicitly know, only pass top transaction Id to this function, can pass true and avoid the function call to SubTransGetTopmostTransaction(xid). With this subtransaction code remains in generic place and AMs intending to use it continue to leverage the common code, plus explicitly clarifies the behavior as well.
Added argument to function to make the subtransaction handling optional in attached v2 of patch.
Вложения
Hi, On 2019-07-31 12:37:58 -0700, Ashwin Agrawal wrote: > On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <andres@anarazel.de> wrote: > > > Hi, > > > > On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote: > > > On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com> > > wrote: > > > > > > > On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> > > wrote: > > > > > > - CheckForSerializableConflictOut() no more takes HeapTuple nor > > > > > > buffer, instead just takes xid. Push heap specific parts from > > > > > > CheckForSerializableConflictOut() into its own function > > > > > > HeapCheckForSerializableConflictOut() which calls > > > > > > CheckForSerializableConflictOut(). The alternative option could > > be > > > > > > CheckForSerializableConflictOut() take callback function and > > > > > > callback arguments, which gets called if required after > > performing > > > > > > prechecks. Though currently I fell AM having its own wrapper to > > > > > > perform AM specific task and then calling > > > > > > CheckForSerializableConflictOut() is fine. > > > > > > > > > > I think it's right to move the xid handling out of > > > > > CheckForSerializableConflictOut(). But I think we also ought to move > > the > > > > > subtransaction handling out of the function - e.g. zheap doesn't > > > > > want/need that. > > > > > > > > Thoughts on this Ashwin? > > > > > > > > > > I think the only part its doing for sub-transaction is > > > SubTransGetTopmostTransaction(xid). If xid passed to this function is > > > already top most transaction which is case for zheap and zedstore, then > > > there is no downside to keeping that code here in common place. > > > > Well, it's far from a cheap function. It'll do unnecessary on-disk > > lookups in many cases. I'd call that quite a downside. > > > > Okay, agree, its costly function and better to avoid the call if possible. > > Instead of moving the handling out of the function, how do feel about > adding boolean isTopTransactionId argument to function > CheckForSerializableConflictOut(). The AMs, which implicitly know, only > pass top transaction Id to this function, can pass true and avoid the > function call to SubTransGetTopmostTransaction(xid). With this > subtransaction code remains in generic place and AMs intending to use it > continue to leverage the common code, plus explicitly clarifies the > behavior as well. Looking at the code as of master, we currently have: - PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure out a whether the tuple has been locked by the current transaction. That check afaict just should be TransactionIdIsCurrentTransactionId(), without all the other stuff that's done today. TransactionIdIsCurrentTransactionId() imo ought to be optimized to always check for the top level transactionid first - that's a good bet today, but even moreso for the upcoming AMs that won't have separate xids for subtransactions. Alternatively we shouldn't make that a binary search for each subtrans level, but just have a small simplehash hashtable for xids. - CheckForSerializableConflictOut() wants to get the toplevel xid for the tuple, because that's the one the predicate hashtable stores. In your patch you've already moved the HTSV() call etc out of CheckForSerializableConflictOut(). I'm somewhat inclined to think that the SubTransGetTopmostTransaction() call ought to go along with that. I don't really think that belongs in predicate.c, especially if most/all new AMs don't use subtransaction ids. The only downside is that currently the TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check avoids the SubTransGetTopmostTransaction() check. But again, the better fix for that seems to be to improve the generic code. As written the check won't prevent a subtrans lookup for heap when subtransactions are in use, and it's IME pretty common for tuples to get looked at again in the transaction that has created them. So I'm somewhat inclined to think that SubTransGetTopmostTransaction() should have a fast-path for the current transaction - probably just employing TransactionIdIsCurrentTransactionId(). I don't really see what we gain by having the subtrans handling in the predicate code. Especially given that we've already moved the HTSV() handling out, it seems architecturally the wrong place to me - but I admit that that's a fuzzy argument. The relevant mapping should be one line in the caller. I wonder if it'd be wroth to combine the TransactionIdIsCurrentTransactionId() calls in the heap cases that currently do both, PredicateLockTuple() and HeapCheckForSerializableConflictOut(). The heap_fetch() case probably isn't commonly that hot a pathq, but heap_hot_search_buffer() is. Minor notes: - I don't think 'insert_xid' is necessarily great - it could also be the updating xid etc. And while you can argue that an update is an insert in the current heap, that's not the case for future AMs. - to me @@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer, if (valid) { ItemPointerSetOffsetNumber(tid, offnum); - PredicateLockTuple(relation, heapTuple, snapshot); + PredicateLockTID(relation, &(heapTuple)->t_self, snapshot, + HeapTupleHeaderGetXmin(heapTuple->t_data)); if (all_dead) *all_dead = false; return true; What are those parens - as placed they can't do anything. Did you intend to write &(heapTuple->t_self)? Even that is pretty superfluous, but it at least clarifies the precedence. I'm also a bit confused why we don't need to pass in the offset of the current tuple, rather than the HOT root tuple here. That's not related to this patch. But aren't we locking the wrong tuple here, in case of HOT? - I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a portion of it's code as a static inline. In particular, it's a shame that we currently perform external function calls at quite the frequency when serializable isn't even in use. Greetings, Andres Freund
On Thu, Aug 1, 2019 at 2:36 AM Andres Freund <andres@anarazel.de> wrote: > > > > > > > > I think the only part its doing for sub-transaction is > > > > SubTransGetTopmostTransaction(xid). If xid passed to this function is > > > > already top most transaction which is case for zheap and zedstore, then > > > > there is no downside to keeping that code here in common place. > > > > > > Well, it's far from a cheap function. It'll do unnecessary on-disk > > > lookups in many cases. I'd call that quite a downside. > > > > > > > Okay, agree, its costly function and better to avoid the call if possible. > > > > Instead of moving the handling out of the function, how do feel about > > adding boolean isTopTransactionId argument to function > > CheckForSerializableConflictOut(). The AMs, which implicitly know, only > > pass top transaction Id to this function, can pass true and avoid the > > function call to SubTransGetTopmostTransaction(xid). With this > > subtransaction code remains in generic place and AMs intending to use it > > continue to leverage the common code, plus explicitly clarifies the > > behavior as well. > > Looking at the code as of master, we currently have: > > - PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure > out a whether the tuple has been locked by the current > transaction. That check afaict just should be > TransactionIdIsCurrentTransactionId(), without all the other > stuff that's done today. > Yeah. this is the only part where predicate locking uses the subxids. Since, predicate locking always use the top xid, IMHO, it'll be good to make this api independent of subxids. > TransactionIdIsCurrentTransactionId() imo ought to be optimized to > always check for the top level transactionid first - that's a good bet > today, but even moreso for the upcoming AMs that won't have separate > xids for subtransactions. Alternatively we shouldn't make that a > binary search for each subtrans level, but just have a small > simplehash hashtable for xids. A check for top transaction id first and usage of simple sound like good optimizations. But, I'm not sure whether these changes should be part of this patch or a separate one. -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com
On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote:
Looking at the code as of master, we currently have:
Super awesome feedback and insights, thank you!
- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
out a whether the tuple has been locked by the current
transaction. That check afaict just should be
TransactionIdIsCurrentTransactionId(), without all the other
stuff that's done today.
Agree. v1-0002 patch attached does that now. Please let me know if that's what you meant.
TransactionIdIsCurrentTransactionId() imo ought to be optimized to
always check for the top level transactionid first - that's a good bet
today, but even moreso for the upcoming AMs that won't have separate
xids for subtransactions. Alternatively we shouldn't make that a
binary search for each subtrans level, but just have a small
simplehash hashtable for xids.
v1-0001 patch checks for GetTopTransactionIdIfAny() first in TransactionIdIsCurrentTransactionId() which seems yes better in general and more for future. That mostly meets the needs for current discussion.
The alternative of not using binary search seems bigger refactoring and should be handled as separate optimization exercise outside of this thread.
- CheckForSerializableConflictOut() wants to get the toplevel xid for
the tuple, because that's the one the predicate hashtable stores.
In your patch you've already moved the HTSV() call etc out of
CheckForSerializableConflictOut(). I'm somewhat inclined to think that
the SubTransGetTopmostTransaction() call ought to go along with that.
I don't really think that belongs in predicate.c, especially if
most/all new AMs don't use subtransaction ids.
The only downside is that currently the
TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
avoids the SubTransGetTopmostTransaction() check.
But again, the better fix for that seems to be to improve the generic
code. As written the check won't prevent a subtrans lookup for heap
when subtransactions are in use, and it's IME pretty common for tuples
to get looked at again in the transaction that has created them. So
I'm somewhat inclined to think that SubTransGetTopmostTransaction()
should have a fast-path for the current transaction - probably just
employing TransactionIdIsCurrentTransactionId().
That optimization, as Kuntal also mentioned, seems something which can be done on-top afterwards on current patch.
I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument. The relevant mapping should be one
line in the caller.
Okay, I moved the sub transaction handling out of CheckForSerializableConflictOut() and have it along side HTSV() now.
The reason I felt leaving subtransaction handling in generic place, was it might be premature to thing no future AM will need it. Plus, all serializable function api's having same expectations is easier. Like PredicateLockTuple() can be passed top or subtransaction id and it can handle it but with the change CheckForSerializableConflictOut() only be feed top transaction ID. But its fine and can see the point of AM needing it can easily get top transaction ID and feed it as heap.
I wonder if it'd be wroth to combine the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a pathq, but heap_hot_search_buffer() is.
Maybe, will give thought to it separate from the current patch.
Minor notes:
- I don't think 'insert_xid' is necessarily great - it could also be the
updating xid etc. And while you can argue that an update is an insert
in the current heap, that's not the case for future AMs.
- to me
@@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
if (valid)
{
ItemPointerSetOffsetNumber(tid, offnum);
- PredicateLockTuple(relation, heapTuple, snapshot);
+ PredicateLockTID(relation, &(heapTuple)->t_self, snapshot,
+ HeapTupleHeaderGetXmin(heapTuple->t_data));
if (all_dead)
*all_dead = false;
return true;
What are those parens - as placed they can't do anything. Did you
intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
but it at least clarifies the precedence.
Fixed. No idea what I was thinking there, mostly feel I intended to have it as like &(heapTuple->t_self).
I'm also a bit confused why we don't need to pass in the offset of the
current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?
Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though, can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self).
- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
portion of it's code as a static inline. In particular, it's a shame
that we currently perform external function calls at quite the
frequency when serializable isn't even in use.
I understand that with refactor, HeapCheckForSerializableConflictOut() is called which calls CheckForSerializableConflictOutNeeded(). If that's the problem, for addressing the same, I had proposed alternative way to refactor. CheckForSerializableConflictOut() can take callback function and void* callback argument for AM specific check instead. So, the flow would be AM calling CheckForSerializableConflictOut() as today and only if serializable in use will invoke the callback to check with AM if more work should be performed or not. Essentially HeapCheckForSerializableConflictOut() will become callback function instead. Due to void* callback argument aspect I didn't like that solution and felt AM performing checks and calling CheckForSerializableConflictOut() seems more straight forward.
Вложения
On Sat, Aug 3, 2019 at 11:56 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote: > On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote: >> I'm also a bit confused why we don't need to pass in the offset of the >> current tuple, rather than the HOT root tuple here. That's not related >> to this patch. But aren't we locking the wrong tuple here, in case of >> HOT? > > Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though,can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self). Here are three relevant commits: 1. Commit dafaa3efb75 "Implement genuine serializable isolation level." (2011) locked the root tuple, because it set t_self to *tid. Possibly due to confusion about the effect of the preceding line ItemPointerSetOffsetNumber(tid, offnum). 2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013) fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self, offnum). 3. Commit b89e151054a "Introduce logical decoding." (2014) also did ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer), offnum), for the benefit of historical MVCC snapshots (unnecessarily, considering the change in the commit #2), but then, intending to "reset to original, non-redirected, tid", clobbered it, reintroducing the bug fixed by #2. My first guess is that commit #3 above was developed before commit #2, and finished up clobbering it. In fact, both logical decoding and SSI want offnum, so we should be able to just remove the "reset" bit (perhaps like in the attached sketch, not really tested, though it passes). This must be in want of an isolation test, but I haven't yet tried to get my head around how to write a test that would show the difference. -- Thomas Munro https://enterprisedb.com
Вложения
Hi, On 2019-08-05 20:58:05 +1200, Thomas Munro wrote: > On Sat, Aug 3, 2019 at 11:56 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote: > > On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote: > >> I'm also a bit confused why we don't need to pass in the offset of the > >> current tuple, rather than the HOT root tuple here. That's not related > >> to this patch. But aren't we locking the wrong tuple here, in case of > >> HOT? > > > > Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though,can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self). > > Here are three relevant commits: Thanks for digging! > 1. Commit dafaa3efb75 "Implement genuine serializable isolation > level." (2011) locked the root tuple, because it set t_self to *tid. > Possibly due to confusion about the effect of the preceding line > ItemPointerSetOffsetNumber(tid, offnum). > > 2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013) > fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self, > offnum). Hm. It's not at all sure that it's ok to report the non-root tuple tid here. I.e. I'm fairly sure there was a reason to not just set it to the actual tid. I think I might have written that up on the list at some point. Let me dig in memory and list. Obviously possible that that was also obsoleted by parallel changes. > 3. Commit b89e151054a "Introduce logical decoding." (2014) also did > ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer), > offnum), for the benefit of historical MVCC snapshots (unnecessarily, > considering the change in the commit #2), but then, intending to > "reset to original, non-redirected, tid", clobbered it, reintroducing > the bug fixed by #2. > My first guess is that commit #3 above was developed before commit #2, > and finished up clobbering it. Yea, that sounds likely. > This must be in want of an isolation test, but I haven't yet tried to > get my head around how to write a test that would show the difference. Indeed. Greetings, Andres Freund
On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <andres@anarazel.de> wrote: > On 2019-08-05 20:58:05 +1200, Thomas Munro wrote: > > 1. Commit dafaa3efb75 "Implement genuine serializable isolation > > level." (2011) locked the root tuple, because it set t_self to *tid. > > Possibly due to confusion about the effect of the preceding line > > ItemPointerSetOffsetNumber(tid, offnum). > > > > 2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013) > > fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self, > > offnum). > > Hm. It's not at all sure that it's ok to report the non-root tuple tid > here. I.e. I'm fairly sure there was a reason to not just set it to the > actual tid. I think I might have written that up on the list at some > point. Let me dig in memory and list. Obviously possible that that was > also obsoleted by parallel changes. Adding Heikki and Kevin. I haven't found your earlier discussion about that yet, but would be keen to read it if you can find it. I wondered if your argument might have had something to do with heap pruning, but I can't find a problem there. It's not as though the TID of any visible tuples change, it's just that some dead stuff goes away and the root line pointer is changed to LP_REDIRECT if it wasn't already. As for the argument for locking the tuple we emit rather than the HOT root, I think the questions are: (1) How exactly do we get away with locking only one version in a regular non-HOT update chain? (2) Is it OK to do that with a HOT root? The answer to the first question is given in README-SSI[1]. Unfortunately it doesn't discuss HOT directly, but I suspect the answer is no, HOT is not special here. By my reading, it relies on the version you lock being the version visible to your snapshot, which is important because later updates have to touch that tuple to write the next version. That doesn't apply to some arbitrarily older tuple that happens to be a HOT root. Concretely, heap_update() does CheckForSerializableConflictIn(relation, &oldtup, buffer), which is only going to produce a rw conflict if T1 took an SIREAD on precisely the version T2 locks in that path, not some arbitrarily older version that happens to be a HOT root. A HOT root might never be considered again by concurrent writers, no? As a minor consequence, the optimisation in CheckTargetForConflictsIn() assumes that a tuple being updated has the same tag as we locked when reading the tuple, which isn't the case if we locked the root while reading but now have the TID for the version we actually read, so in master we leak a tuple lock unnecessarily until end-of-transaction when we update a HOT tuple. > > This must be in want of an isolation test, but I haven't yet tried to > > get my head around how to write a test that would show the difference. > > Indeed. One practical problem is that the only way to reach PredicateLockTuple() is from an index scan, and the index scan locks the index page (or the whole index, depending on rd_indam->ampredlocks). So I think if you want to see a serialization anomaly you'll need multiple indexes (so that index page locks don't hide the problem), a long enough HOT chain and then probably several transactions to be able to miss a cycle that should be picked up by the logic in [1]. I'm out of steam for this problem today though. The simple test from the report[3] that resulted in commit 81fbbfe3352 doesn't work for me (ie with permutation "r1" "r2" "w1" "w2" "c1" "c2" twice in a row). The best I've come up with so far is an assertion that we predicate-lock the same row version that we emitted to the user, when reached via an index lookup that visits a HOT row. The test outputs 'f' for master, but 't' with the change to heapam.c. [1] Excerpt from README-SSI: === * PostgreSQL does not use "update in place" with a rollback log for its MVCC implementation. Where possible it uses "HOT" updates on the same page (if there is room and no indexed value is changed). For non-HOT updates the old tuple is expired in place and a new tuple is inserted at a new location. Because of this difference, a tuple lock in PostgreSQL doesn't automatically lock any other versions of a row. We don't try to copy or expand a tuple lock to any other versions of the row, based on the following proof that any additional serialization failures we would get from that would be false positives: o If transaction T1 reads a row version (thus acquiring a predicate lock on it) and a second transaction T2 updates that row version (thus creating a rw-conflict graph edge from T1 to T2), must a third transaction T3 which re-updates the new version of the row also have a rw-conflict in from T1 to prevent anomalies? In other words, does it matter whether we recognize the edge T1 -> T3? o If T1 has a conflict in, it certainly doesn't. Adding the edge T1 -> T3 would create a dangerous structure, but we already had one from the edge T1 -> T2, so we would have aborted something anyway. (T2 has already committed, else T3 could not have updated its output; but we would have aborted either T1 or T1's predecessor(s). Hence no cycle involving T1 and T3 can survive.) o Now let's consider the case where T1 doesn't have a rw-conflict in. If that's the case, for this edge T1 -> T3 to make a difference, T3 must have a rw-conflict out that induces a cycle in the dependency graph, i.e. a conflict out to some transaction preceding T1 in the graph. (A conflict out to T1 itself would be problematic too, but that would mean T1 has a conflict in, the case we already eliminated.) o So now we're trying to figure out if there can be an rw-conflict edge T3 -> T0, where T0 is some transaction that precedes T1. For T0 to precede T1, there has to be some edge, or sequence of edges, from T0 to T1. At least the last edge has to be a wr-dependency or ww-dependency rather than a rw-conflict, because T1 doesn't have a rw-conflict in. And that gives us enough information about the order of transactions to see that T3 can't have a rw-conflict to T0: - T0 committed before T1 started (the wr/ww-dependency implies this) - T1 started before T2 committed (the T1->T2 rw-conflict implies this) - T2 committed before T3 started (otherwise, T3 would get aborted because of an update conflict) o That means T0 committed before T3 started, and therefore there can't be a rw-conflict from T3 to T0. o So in all cases, we don't need the T1 -> T3 edge to recognize cycles. Therefore it's not necessary for T1's SIREAD lock on the original tuple version to cover later versions as well. === [2] https://www.postgresql.org/message-id/52527E4D.4060302%40vmware.com [3] https://www.postgresql.org/message-id/flat/523C29A8.20904%40vmware.com -- Thomas Munro https://enterprisedb.com
Вложения
Hello Thomas, On Tue, Aug 6, 2019 at 9:50 AM Thomas Munro <thomas.munro@gmail.com> wrote: > > On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <andres@anarazel.de> wrote: > > On 2019-08-05 20:58:05 +1200, Thomas Munro wrote: > > > 1. Commit dafaa3efb75 "Implement genuine serializable isolation > > > level." (2011) locked the root tuple, because it set t_self to *tid. > > > Possibly due to confusion about the effect of the preceding line > > > ItemPointerSetOffsetNumber(tid, offnum). > > > > > > 2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013) > > > fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self, > > > offnum). > > > > Hm. It's not at all sure that it's ok to report the non-root tuple tid > > here. I.e. I'm fairly sure there was a reason to not just set it to the > > actual tid. I think I might have written that up on the list at some > > point. Let me dig in memory and list. Obviously possible that that was > > also obsoleted by parallel changes. > > Adding Heikki and Kevin. > > I haven't found your earlier discussion about that yet, but would be > keen to read it if you can find it. I wondered if your argument > might have had something to do with heap pruning, but I can't find a > problem there. It's not as though the TID of any visible tuples > change, it's just that some dead stuff goes away and the root line > pointer is changed to LP_REDIRECT if it wasn't already. > > As for the argument for locking the tuple we emit rather than the HOT > root, I think the questions are: (1) How exactly do we get away with > locking only one version in a regular non-HOT update chain? (2) Is it > OK to do that with a HOT root? > > The answer to the first question is given in README-SSI[1]. > Unfortunately it doesn't discuss HOT directly, but I suspect the > answer is no, HOT is not special here. By my reading, it relies on > the version you lock being the version visible to your snapshot, which > is important because later updates have to touch that tuple to write > the next version. That doesn't apply to some arbitrarily older tuple > that happens to be a HOT root. Concretely, heap_update() does > CheckForSerializableConflictIn(relation, &oldtup, buffer), which is > only going to produce a rw conflict if T1 took an SIREAD on precisely > the version T2 locks in that path, not some arbitrarily older version > that happens to be a HOT root. A HOT root might never be considered > again by concurrent writers, no? > If I understand the problem, this is the same serialization issue as with in-place updates for zheap. I had a discussion with Kevin regarding the same in this thread [1]. It seems if we're locking the hot root id, we may report some false positive serializable errors. > > > This must be in want of an isolation test, but I haven't yet tried to > > > get my head around how to write a test that would show the difference. > > > > Indeed. > > One practical problem is that the only way to reach > PredicateLockTuple() is from an index scan, and the index scan locks > the index page (or the whole index, depending on > rd_indam->ampredlocks). So I think if you want to see a serialization > anomaly you'll need multiple indexes (so that index page locks don't > hide the problem), a long enough HOT chain and then probably several > transactions to be able to miss a cycle that should be picked up by > the logic in [1]. I'm out of steam for this problem today though. > > The simple test from the report[3] that resulted in commit 81fbbfe3352 > doesn't work for me (ie with permutation "r1" "r2" "w1" "w2" "c1" "c2" > twice in a row). The best I've come up with so far is an assertion > that we predicate-lock the same row version that we emitted to the > user, when reached via an index lookup that visits a HOT row. The > test outputs 'f' for master, but 't' with the change to heapam.c. > Here is an example from the multiple-row-versions isolation test which fails if we perform in-place updates for zheap. I think the same will be relevant if we lock root tuple id instead of the tuple itself. Step 1: T1-> BEGIN; Read FROM t where id=1000000; Step 2: T2-> BEGIN; UPDATE t where id=1000000; COMMIT; (creates T1->T2) Step 3: T3-> BEGIN; UPDATE t where id=1000000; Read FROM t where id=500000; Step 4: T4-> BEGIN; UPDATE t where id= 500000; Read FROM t where id=1; COMMIT; (creates T3->T4) Step 5: T3-> COMMIT; Step 6: T1-> UPDATE t where id=1; COMMIT; (creates T4->T1,) At step 6, when the update statement is executed, T1 is rolled back because of T3->T4->T1. But for zheap, step 3 also creates a dependency T1->T3 because of in-place update. When T4 commits in step 4, it marks T3 as doomed because of T1 --> T3 --> T4. Hence, in step 5, T3 is rolled back. [1] Re: In-place updates and serializable transactions: https://www.postgresql.org/message-id/CAGz5QCJzreUqJqHeXrbEs6xb0zCNKBHhOj6D9Tjd3btJTzydxg%40mail.gmail.com -- Thanks & Regards, Kuntal Ghosh EnterpriseDB: http://www.enterprisedb.com
Re: Remove HeapTuple and Buffer dependency for predicate lockingfunctions
От
Heikki Linnakangas
Дата:
On 06/08/2019 07:20, Thomas Munro wrote: > On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <andres@anarazel.de> wrote: >> On 2019-08-05 20:58:05 +1200, Thomas Munro wrote: >>> 1. Commit dafaa3efb75 "Implement genuine serializable isolation >>> level." (2011) locked the root tuple, because it set t_self to *tid. >>> Possibly due to confusion about the effect of the preceding line >>> ItemPointerSetOffsetNumber(tid, offnum). >>> >>> 2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013) >>> fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self, >>> offnum). >> >> Hm. It's not at all sure that it's ok to report the non-root tuple tid >> here. I.e. I'm fairly sure there was a reason to not just set it to the >> actual tid. I think I might have written that up on the list at some >> point. Let me dig in memory and list. Obviously possible that that was >> also obsoleted by parallel changes. > > Adding Heikki and Kevin. > > I haven't found your earlier discussion about that yet, but would be > keen to read it if you can find it. I wondered if your argument > might have had something to do with heap pruning, but I can't find a > problem there. It's not as though the TID of any visible tuples > change, it's just that some dead stuff goes away and the root line > pointer is changed to LP_REDIRECT if it wasn't already. > > As for the argument for locking the tuple we emit rather than the HOT > root, I think the questions are: (1) How exactly do we get away with > locking only one version in a regular non-HOT update chain? (2) Is it > OK to do that with a HOT root? > > The answer to the first question is given in README-SSI[1]. > Unfortunately it doesn't discuss HOT directly, but I suspect the > answer is no, HOT is not special here. By my reading, it relies on > the version you lock being the version visible to your snapshot, which > is important because later updates have to touch that tuple to write > the next version. That doesn't apply to some arbitrarily older tuple > that happens to be a HOT root. Concretely, heap_update() does > CheckForSerializableConflictIn(relation, &oldtup, buffer), which is > only going to produce a rw conflict if T1 took an SIREAD on precisely > the version T2 locks in that path, not some arbitrarily older version > that happens to be a HOT root. A HOT root might never be considered > again by concurrent writers, no? Your analysis is spot on. Thanks for the clear write-up! >>> This must be in want of an isolation test, but I haven't yet tried to >>> get my head around how to write a test that would show the difference. >> >> Indeed. > > One practical problem is that the only way to reach > PredicateLockTuple() is from an index scan, and the index scan locks > the index page (or the whole index, depending on > rd_indam->ampredlocks). So I think if you want to see a serialization > anomaly you'll need multiple indexes (so that index page locks don't > hide the problem), a long enough HOT chain and then probably several > transactions to be able to miss a cycle that should be picked up by > the logic in [1]. I'm out of steam for this problem today though. I had some steam, and wrote a spec that reproduces this bug. It wasn't actually that hard to reproduce, fortunately. Or unfortunately; people might well be hitting it in production. I used the "freezetest.spec" from the 2013 thread as the starting point, and added one UPDATE to the initialization, so that the test starts with an already HOT-updated tuple. It should throw a serialization error, but on current master, it does not. After applying your fix.txt, it does. Your fix.txt seems correct. For clarity, I'd prefer moving things around a bit, though, so that the t_self is set earlier in the function, at the same place where the other HeapTuple fields are set. And set blkno and offnum together, in one ItemPointerSet call. With that, I'm not sure we need such a verbose comment explaining why t_self needs to be updated but I kept it for now. Attached is a patch that contains your fix.txt with the changes for clarity mentioned above, and an isolationtester test case. PS. Because heap_hot_search_buffer() now always sets heapTuple->t_self to the returned tuple version, updating *tid is redundant. And the call in heapam_index_fetch_tuple() wouldn't need to do "bslot->base.tupdata.t_self = *tid". - Heikki
Вложения
On Tue, Aug 6, 2019 at 9:26 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: > I had some steam, and wrote a spec that reproduces this bug. It wasn't > actually that hard to reproduce, fortunately. Or unfortunately; people > might well be hitting it in production. I used the "freezetest.spec" > from the 2013 thread as the starting point, and added one UPDATE to the > initialization, so that the test starts with an already HOT-updated > tuple. It should throw a serialization error, but on current master, it > does not. After applying your fix.txt, it does. Thanks! Ahh, right, I was expecting it to be harder to see an undetected anomaly, because of the index page lock, but of course we never actually write to that page so it's just the heap tuple lock holding everything together. > Your fix.txt seems correct. For clarity, I'd prefer moving things around > a bit, though, so that the t_self is set earlier in the function, at the > same place where the other HeapTuple fields are set. And set blkno and > offnum together, in one ItemPointerSet call. With that, I'm not sure we > need such a verbose comment explaining why t_self needs to be updated > but I kept it for now. +1 > Attached is a patch that contains your fix.txt with the changes for > clarity mentioned above, and an isolationtester test case. LGTM. > PS. Because heap_hot_search_buffer() now always sets heapTuple->t_self > to the returned tuple version, updating *tid is redundant. And the call > in heapam_index_fetch_tuple() wouldn't need to do > "bslot->base.tupdata.t_self = *tid". Right, that sounds like a separate improvement for master only. -- Thomas Munro https://enterprisedb.com
Re: Remove HeapTuple and Buffer dependency for predicate lockingfunctions
От
Heikki Linnakangas
Дата:
On 06/08/2019 13:35, Thomas Munro wrote: > On Tue, Aug 6, 2019 at 9:26 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote: >> Attached is a patch that contains your fix.txt with the changes for >> clarity mentioned above, and an isolationtester test case. > > LGTM. Pushed, thanks! - Heikki
On Fri, Aug 2, 2019 at 4:56 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote:Looking at the code as of master, we currently have:Super awesome feedback and insights, thank you!- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
out a whether the tuple has been locked by the current
transaction. That check afaict just should be
TransactionIdIsCurrentTransactionId(), without all the other
stuff that's done today.Agree. v1-0002 patch attached does that now. Please let me know if that's what you meant.TransactionIdIsCurrentTransactionId() imo ought to be optimized to
always check for the top level transactionid first - that's a good bet
today, but even moreso for the upcoming AMs that won't have separate
xids for subtransactions. Alternatively we shouldn't make that a
binary search for each subtrans level, but just have a small
simplehash hashtable for xids.v1-0001 patch checks for GetTopTransactionIdIfAny() first in TransactionIdIsCurrentTransactionId() which seems yes better in general and more for future. That mostly meets the needs for current discussion.The alternative of not using binary search seems bigger refactoring and should be handled as separate optimization exercise outside of this thread.- CheckForSerializableConflictOut() wants to get the toplevel xid for
the tuple, because that's the one the predicate hashtable stores.
In your patch you've already moved the HTSV() call etc out of
CheckForSerializableConflictOut(). I'm somewhat inclined to think that
the SubTransGetTopmostTransaction() call ought to go along with that.
I don't really think that belongs in predicate.c, especially if
most/all new AMs don't use subtransaction ids.
The only downside is that currently the
TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
avoids the SubTransGetTopmostTransaction() check.
But again, the better fix for that seems to be to improve the generic
code. As written the check won't prevent a subtrans lookup for heap
when subtransactions are in use, and it's IME pretty common for tuples
to get looked at again in the transaction that has created them. So
I'm somewhat inclined to think that SubTransGetTopmostTransaction()
should have a fast-path for the current transaction - probably just
employing TransactionIdIsCurrentTransactionId().That optimization, as Kuntal also mentioned, seems something which can be done on-top afterwards on current patch.I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument. The relevant mapping should be one
line in the caller.Okay, I moved the sub transaction handling out of CheckForSerializableConflictOut() and have it along side HTSV() now.The reason I felt leaving subtransaction handling in generic place, was it might be premature to thing no future AM will need it. Plus, all serializable function api's having same expectations is easier. Like PredicateLockTuple() can be passed top or subtransaction id and it can handle it but with the change CheckForSerializableConflictOut() only be feed top transaction ID. But its fine and can see the point of AM needing it can easily get top transaction ID and feed it as heap.I wonder if it'd be wroth to combine the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a pathq, but heap_hot_search_buffer() is.Maybe, will give thought to it separate from the current patch.Minor notes:
- I don't think 'insert_xid' is necessarily great - it could also be the
updating xid etc. And while you can argue that an update is an insert
in the current heap, that's not the case for future AMs.
- to me
@@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
if (valid)
{
ItemPointerSetOffsetNumber(tid, offnum);
- PredicateLockTuple(relation, heapTuple, snapshot);
+ PredicateLockTID(relation, &(heapTuple)->t_self, snapshot,
+ HeapTupleHeaderGetXmin(heapTuple->t_data));
if (all_dead)
*all_dead = false;
return true;
What are those parens - as placed they can't do anything. Did you
intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
but it at least clarifies the precedence.Fixed. No idea what I was thinking there, mostly feel I intended to have it as like &(heapTuple->t_self).I'm also a bit confused why we don't need to pass in the offset of the
current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though, can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self).- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
portion of it's code as a static inline. In particular, it's a shame
that we currently perform external function calls at quite the
frequency when serializable isn't even in use.I am not sure on portion of the code part? SerializationNeededForRead() is static inline function in C file. Can't inline CheckForSerializableConflictOutNeeded() without moving SerializationNeededForRead() and some other variables to header file. CheckForSerializableConflictOut() wasn't inline either, so a function call was performed earlier as well when serializable isn't even in use.I understand that with refactor, HeapCheckForSerializableConflictOut() is called which calls CheckForSerializableConflictOutNeeded(). If that's the problem, for addressing the same, I had proposed alternative way to refactor. CheckForSerializableConflictOut() can take callback function and void* callback argument for AM specific check instead. So, the flow would be AM calling CheckForSerializableConflictOut() as today and only if serializable in use will invoke the callback to check with AM if more work should be performed or not. Essentially HeapCheckForSerializableConflictOut() will become callback function instead. Due to void* callback argument aspect I didn't like that solution and felt AM performing checks and calling CheckForSerializableConflictOut() seems more straight forward.
Attaching re-based version of the patches on top of current master, which has the fix for HOT serializable predicate locking bug spotted by Andres committed now.
Вложения
On Thu, Aug 8, 2019 at 6:53 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote: >>> - I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a >>> portion of it's code as a static inline. In particular, it's a shame >>> that we currently perform external function calls at quite the >>> frequency when serializable isn't even in use. >> >> I am not sure on portion of the code part? SerializationNeededForRead() is static inline function in C file. Can't inlineCheckForSerializableConflictOutNeeded() without moving SerializationNeededForRead() and some other variables to headerfile. CheckForSerializableConflictOut() wasn't inline either, so a function call was performed earlier as well whenserializable isn't even in use. I agree that it's strange that we do these high frequency function calls just to figure out that we're not even using this stuff, which ultimately comes down to the static global variable MySerializableXact being not reachable from (say) an inline function defined in a header. That's something to look into in another thread. > Attaching re-based version of the patches on top of current master, which has the fix for HOT serializable predicate lockingbug spotted by Andres committed now. I'm planning to commit these three patches on Monday. I've attached versions with whitespace-only changes from pgindent, and commit messages lightly massaged and updated to point to this discussion and reviewers.
Вложения
On Thu, Nov 7, 2019 at 8:44 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Thu, Aug 8, 2019 at 6:53 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
>>> - I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
>>> portion of it's code as a static inline. In particular, it's a shame
>>> that we currently perform external function calls at quite the
>>> frequency when serializable isn't even in use.
>>
>> I am not sure on portion of the code part? SerializationNeededForRead() is static inline function in C file. Can't inline CheckForSerializableConflictOutNeeded() without moving SerializationNeededForRead() and some other variables to header file. CheckForSerializableConflictOut() wasn't inline either, so a function call was performed earlier as well when serializable isn't even in use.
I agree that it's strange that we do these high frequency function
calls just to figure out that we're not even using this stuff, which
ultimately comes down to the static global variable MySerializableXact
being not reachable from (say) an inline function defined in a header.
That's something to look into in another thread.
> Attaching re-based version of the patches on top of current master, which has the fix for HOT serializable predicate locking bug spotted by Andres committed now.
I'm planning to commit these three patches on Monday. I've attached
versions with whitespace-only changes from pgindent, and commit
messages lightly massaged and updated to point to this discussion and
reviewers.
Thanks a lot, sounds good.
On Sat, Nov 9, 2019 at 8:41 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote: > On Thu, Nov 7, 2019 at 8:44 PM Thomas Munro <thomas.munro@gmail.com> wrote: >> I'm planning to commit these three patches on Monday. I've attached >> versions with whitespace-only changes from pgindent, and commit >> messages lightly massaged and updated to point to this discussion and >> reviewers. > > Thanks a lot, sounds good. Hi Ashwin, I pushed the first two, but on another read-through of the main patch I didn't like the comments for CheckForSerializableConflictOut() or the fact that it checks SerializationNeededForRead() again, after I thought a bit about what the contract for this API is now. Here's a version with small fixup that I'd like to squash into the patch. Please let me know what you think, or if you see how to improve it further.
Вложения
On Sun, Nov 10, 2019 at 8:21 PM Thomas Munro <thomas.munro@gmail.com> wrote:
I pushed the first two,
Thank You!
but on another read-through of the main patch
I didn't like the comments for CheckForSerializableConflictOut() or
the fact that it checks SerializationNeededForRead() again, after I
thought a bit about what the contract for this API is now. Here's a
version with small fixup that I'd like to squash into the patch.
Please let me know what you think,
inside CheckForSerializableConflictOut() is to keep that API clean and
complete by itself. Only if AM like heap needs to perform some AM
specific checking only for serialization needed case can do so but is
not forced. So, if AM for example Zedstore doesn't need to do any
specific checking, then it can directly call
CheckForSerializableConflictOut(). With the modified fixup patch, the
responsibility is unnecessarily forced to caller even if
CheckForSerializableConflictOut() can do it. I understand the intent
is to avoid duplicate check for heap.
or if you see how to improve it
further.
I had proposed as alternative way in initial email and also later,
didn't receive comment on that, so re-posting.
Alternative way to refactor. CheckForSerializableConflictOut() can
take callback function and (void *) callback argument for AM specific
check. So, the flow would be AM calling
CheckForSerializableConflictOut() as today and only if
SerializationNeededForRead() will invoke the callback to check with AM
if more work should be performed or not. Essentially
HeapCheckForSerializableConflictOut() will become callback function
instead. So, roughly would look like....
typedef bool (*AMCheckForSerializableConflictOutCallback) (void *arg);
void CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot, AMCheckForSerializableConflictOutCallback callback, void *callback_arg)
{
if (!SerializationNeededForRead(relation, snapshot))
return;
if (callback != NULL && !callback(callback_args))
return;
........
.....
}
With this AMs which don't have any specific checks to perform can pass
callback as NULL. So, function call is involved only if
SerializationNeededForRead() and only for AMs which need it.
Just due to void* callback argument aspect I didn't prefer that
solution and felt AM performing checks and calling
CheckForSerializableConflictOut() seems better. Please let me know
how you feel about this.
didn't receive comment on that, so re-posting.
Alternative way to refactor. CheckForSerializableConflictOut() can
take callback function and (void *) callback argument for AM specific
check. So, the flow would be AM calling
CheckForSerializableConflictOut() as today and only if
SerializationNeededForRead() will invoke the callback to check with AM
if more work should be performed or not. Essentially
HeapCheckForSerializableConflictOut() will become callback function
instead. So, roughly would look like....
typedef bool (*AMCheckForSerializableConflictOutCallback) (void *arg);
void CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot, AMCheckForSerializableConflictOutCallback callback, void *callback_arg)
{
if (!SerializationNeededForRead(relation, snapshot))
return;
if (callback != NULL && !callback(callback_args))
return;
........
.....
}
With this AMs which don't have any specific checks to perform can pass
callback as NULL. So, function call is involved only if
SerializationNeededForRead() and only for AMs which need it.
Just due to void* callback argument aspect I didn't prefer that
solution and felt AM performing checks and calling
CheckForSerializableConflictOut() seems better. Please let me know
how you feel about this.
On Wed, Nov 13, 2019 at 7:27 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote: > On Sun, Nov 10, 2019 at 8:21 PM Thomas Munro <thomas.munro@gmail.com> wrote: >> but on another read-through of the main patch >> I didn't like the comments for CheckForSerializableConflictOut() or >> the fact that it checks SerializationNeededForRead() again, after I >> thought a bit about what the contract for this API is now. Here's a >> version with small fixup that I'd like to squash into the patch. >> Please let me know what you think, > > The thought or reasoning behind having SerializationNeededForRead() > inside CheckForSerializableConflictOut() is to keep that API clean and > complete by itself. Only if AM like heap needs to perform some AM > specific checking only for serialization needed case can do so but is > not forced. So, if AM for example Zedstore doesn't need to do any > specific checking, then it can directly call > CheckForSerializableConflictOut(). With the modified fixup patch, the > responsibility is unnecessarily forced to caller even if > CheckForSerializableConflictOut() can do it. I understand the intent > is to avoid duplicate check for heap. OK, I kept only the small comment change from that little fixup patch, and pushed this. > I had proposed as alternative way in initial email and also later, > didn't receive comment on that, so re-posting. > typedef bool (*AMCheckForSerializableConflictOutCallback) (void *arg); ... > Just due to void* callback argument aspect I didn't prefer that > solution and felt AM performing checks and calling > CheckForSerializableConflictOut() seems better. Please let me know > how you feel about this. Yeah. We could always come back to this idea if it looks better once we have more experience with new table AMs.
On Mon, Jan 27, 2020 at 4:47 PM Thomas Munro <thomas.munro@gmail.com> wrote:
OK, I kept only the small comment change from that little fixup patch,
and pushed this.
> I had proposed as alternative way in initial email and also later,
> didn't receive comment on that, so re-posting.
> typedef bool (*AMCheckForSerializableConflictOutCallback) (void *arg);
...
> Just due to void* callback argument aspect I didn't prefer that
> solution and felt AM performing checks and calling
> CheckForSerializableConflictOut() seems better. Please let me know
> how you feel about this.
Yeah. We could always come back to this idea if it looks better once
we have more experience with new table AMs.
Sounds good. Thank You!