Discussion: "Triggered data change violation", once again
I have been looking at the way that deferred triggers slow down when the same row is updated multiple times within a transaction. The problem appears to be entirely due to calling deferredTriggerGetPreviousEvent() to find the trigger list entry for the previous update of the row: we do a linear search, so the behavior is roughly O(N^2) when there are N updated rows.

The only reason we do this is to enforce the "triggered data change violation" restriction of the spec. However, I think we've misinterpreted the spec. The code prevents an RI referenced value from being changed more than once in a transaction, but what the spec actually says is thou shalt not change it more than once per *statement*. We have discussed this several times in the past and I think people have agreed that the current behavior is wrong, but nothing's been done about it.

I think all we need to do to implement things correctly is to consider a previous event only if both xmin and cmin of the old tuple match the current xact & command IDs, rather than considering it on the basis of xmin alone.

Aside from being correct, this will make a significant difference in performance. If we were doing it per spec then deferredTriggerGetPreviousEvent would never be called in typical operations, and so its speed wouldn't be an issue. Moreover, if we do it per spec then completed trigger event records could be removed from the trigger list at end of statement, rather than keeping them till end of transaction, which'd save memory space.

Comments?

regards, tom lane
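To make the proposed change concrete, here is a minimal Python sketch of the two tests being compared. It is only an illustration, not the backend code: `OldTuple`, `matches_by_xmin_only` and `matches_per_statement` are hypothetical names, and the ID values in the usage lines are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OldTuple:
    """Hypothetical stand-in for the header fields of the old row version."""
    xmin: int  # ID of the transaction that created this row version
    cmin: int  # ID of the command, within that transaction, that created it


def matches_by_xmin_only(old: OldTuple, current_xact: int) -> bool:
    # Behavior described above: any earlier change made by the
    # same transaction counts as a "previous event".
    return old.xmin == current_xact


def matches_per_statement(old: OldTuple, current_xact: int, current_cmd: int) -> bool:
    # Proposed behavior: only a change made by the same command
    # of the same transaction counts as a "previous event".
    return old.xmin == current_xact and old.cmin == current_cmd


# A row version created by command 3 of transaction 100,
# now being updated again by command 5 of the same transaction.
old = OldTuple(xmin=100, cmin=3)
print(matches_by_xmin_only(old, current_xact=100))                    # True
print(matches_per_statement(old, current_xact=100, current_cmd=5))   # False
```

Under the xmin-only test the second update triggers a lookup of the earlier event; under the per-statement test it does not, which is why the linear search would rarely be reached.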
On Wed, 24 Oct 2001, Tom Lane wrote:

> The only reason we do this is to enforce the "triggered data change
> violation" restriction of the spec. However, I think we've
> misinterpreted the spec. The code prevents an RI referenced value from
> being changed more than once in a transaction, but what the spec
> actually says is thou shalt not change it more than once per
> *statement*. We have discussed this several times in the past and
> I think people have agreed that the current behavior is wrong,
> but nothing's been done about it.
>
> I think all we need to do to implement things correctly is to consider a
> previous event only if both xmin and cmin of the old tuple match the
> current xact & command IDs, rather than considering it on the basis of
> xmin alone.

Are there any things that might update the command ID during the execution of the statement from inside functions that are being run? I really don't understand the details of how that works (which is the biggest reason I haven't yet tackled some of the big remaining broken stuff in the referential actions, because AFAICT we need to be able to update a row that matched at the beginning of the statement, not the ones that match at the time the triggers run).
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
>> I think all we need to do to implement things correctly is to consider a
>> previous event only if both xmin and cmin of the old tuple match the
>> current xact & command IDs, rather than considering it on the basis of
>> xmin alone.

> Are there any things that might update the command ID during the execution
> of the statement from inside functions that are being run?

Functions can run new commands that get new command ID numbers within the current transaction --- but on return from the function, the current command number is restored. I believe rows inserted by such a function would look "in the future" to us at the outer command, and would be ignored.

Actually, now that I think about it, the MVCC rules are that tuples with xmin = currentxact are not visible unless they have cmin < currentcmd. Not equal to. This seems to render the entire "triggered data change" test moot --- I rather suspect that we cannot have such a condition as old tuple cmin = currentcmd at all, and so we could just yank all that code entirely.

regards, tom lane
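The visibility rule quoted above can be sketched in a few lines of Python. This is a deliberately simplified model and not the backend's visibility code: it covers only tuples created by the current transaction, ignores deletion (xmax/cmax) and other transactions, and `own_tuple_is_visible` is a hypothetical name.

```python
def own_tuple_is_visible(cmin: int, current_cmd: int) -> bool:
    """Simplified rule for a tuple whose xmin is the current transaction.

    Assumption: deletion and tuples from other transactions are ignored.
    """
    # Strictly less than: a tuple created by the current command is not visible.
    return cmin < current_cmd


current_cmd = 5
# cmin values of tuples that our own transaction created in commands 0..7
visible_cmins = [c for c in range(8) if own_tuple_is_visible(c, current_cmd)]
print(visible_cmins)  # [0, 1, 2, 3, 4]

# Within this model, no visible tuple has cmin equal to the current command.
same_command_visible = any(c == current_cmd for c in visible_cmins)
print(same_command_visible)  # False
```

Within this model, a command never sees (and so never updates) a row version that it created itself, which is the basis for the suspicion that the condition "old tuple cmin = currentcmd" cannot arise.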
On Wed, 24 Oct 2001, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> >> I think all we need to do to implement things correctly is to consider a
> >> previous event only if both xmin and cmin of the old tuple match the
> >> current xact & command IDs, rather than considering it on the basis of
> >> xmin alone.
>
> > Are there any things that might update the command ID during the execution
> > of the statement from inside functions that are being run?
>
> Functions can run new commands that get new command ID numbers within
> the current transaction --- but on return from the function, the current
> command number is restored. I believe rows inserted by such a function
> would look "in the future" to us at the outer command, and would be
> ignored.
>
> Actually, now that I think about it, the MVCC rules are that tuples with
> xmin = currentxact are not visible unless they have cmin < currentcmd.
> Not equal to. This seems to render the entire "triggered data change"
> test moot --- I rather suspect that we cannot have such a condition
> as old tuple cmin = currentcmd at all, and so we could just yank all
> that code entirely.

I'm not sure if this sequence would be an example of something that would be disallowed, but if I do something like:

  Make a plpgsql function that does update table1 set key=1 where key=2;
  Make that an after update trigger on table1
  Put a key=1 row into table1
  Update table1 to set key to 2

I end up with a 1 in the table. I'm not sure, but I think that such a case would be possible through the fk stuff with triggers that modify the primary key table (right now it might "work" due to the problems of checking intermediate states). Wouldn't this be the kind of thing the "triggered data change" is supposed to prevent? I may be just misunderstanding the intent of the spec.
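The four steps above can be mimicked with a toy Python model that only tracks how often the row is modified while the outer statement and its trigger run. It is a sketch under loud assumptions: the table is a plain list of key values, `run_update` and `reset_key_trigger` are hypothetical names, and nothing here models MVCC, command IDs, or what the spec permits.

```python
def run_update(table, assign, where, trigger, log):
    """Toy UPDATE: change every matching row, then fire the trigger per updated row."""
    updated = [i for i, key in enumerate(table) if where(key)]
    for i in updated:
        table[i] = assign(table[i])
        log.append((i, table[i]))  # record (row index, new key)
    for _ in updated:
        trigger(table, log)        # after-update trigger, once per updated row


def reset_key_trigger(table, log):
    # Models the trigger function: update table1 set key=1 where key=2;
    # The nested update fires the same trigger, which then matches no rows.
    run_update(table, lambda k: 1, lambda k: k == 2, reset_key_trigger, log)


table1 = [1]   # put a key=1 row into table1
changes = []   # every modification made to the table
# Update table1 to set key to 2
run_update(table1, lambda k: 2, lambda k: True, reset_key_trigger, changes)

print(table1)   # [1]
print(changes)  # [(0, 2), (0, 1)]
```

The row ends up with key 1 after being modified twice in the course of one outer statement; whether that counts as a "triggered data change" is exactly the question raised above, and the sketch does not answer it.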
Tom Lane wrote:
>
> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> >> I think all we need to do to implement things correctly is to consider a
> >> previous event only if both xmin and cmin of the old tuple match the
> >> current xact & command IDs, rather than considering it on the basis of
> >> xmin alone.
>
> > Are there any things that might update the command ID during the execution
> > of the statement from inside functions that are being run?
>
> Functions can run new commands that get new command ID numbers within
> the current transaction --- but on return from the function, the current
> command number is restored. I believe rows inserted by such a function
> would look "in the future" to us at the outer command, and would be
> ignored.

I doubt that this is reasonable. If those changes are ignored, when are they taken into account? ISTM deferred constraints have to see the latest tuples and take the changes into account.

regards,
Hiroshi Inoue