Обсуждение: Re: Referential Integrity In PostgreSQL
>
> Hi , Jan
>
> my name is Max .
Hi Max,
>
> I have contributed to SPI interface ,
> that with external Trigger try to make
> a referential integrity.
>
> If I can Help , in something ,
> I'm here .
>
You're welcome.
I've CC'd the hackers list because we might get some ideas
from there too (and to surface once in a while - Bruce
already missed me).
Currently I'm very busy for serious work so I don't find
enough spare time to start on such a big change to
PostgreSQL. But I'd like to give you an overview of what I
have in mind so far so you can decide if you're able to help.
Referential integrity (RI) is based on constraints defined in
the schema of a database. There are some different types of
constraints:
1. Uniqueness constraints.
2. Foreign key constraints that ensure that a key value used
in an attribute exists in another relation. One
constraint must ensure you're unable to INSERT/UPDATE to
a value that doesn't exist, another one must prevent
DELETE on a referenced key item or that it is changed
during UPDATE.
3. Cascading deletes that let rows referring to a key follow
on DELETE silently.
Even if not defined in the standard (AFAIK) there could be
others like letting references automatically follow on UPDATE
to a key value.
All constraints can be enabled and/or default to be deferred.
That means, that the RI checks aren't performed when they are
triggerd. Instead, they're checked at transaction end or if
explicitly invoked by some special statement. This is really
important because someone must be able to setup cyclic RI
checks that could never be satisfied if the checks would be
performed immediately. The major problem on this is the
amount of data affected until the checks must be performed.
The number of statements executed, that trigger such deferred
constraints, shouldn't be limited. And one single
INSERT/UPDATE/DELETE could affect thousands of rows.
Due to these problems I thought, it might not be such a good
idea to remember CTID's or the like to get back OLD/NEW rows
at the time the constraints are checked. Instead I planned to
misuse the rule system for it. Unfortunately, the rule system
has damned tricky problems itself when it comes to having-,
distinct and other clauses and extremely on aggregates and
subselects. These problems would have to get fixed first. So
it's a solution that cannot be implemented right now.
Fallback to CTID remembering though. There are problems too
:-(. Let's enhance the trigger mechanism with a deferred
feature. First this requires two additional bool attributes
in the pg_trigger relation that tell if this trigger is
deferrable and if it is deferred by default. While at it we
should add another bool that tells if the trigger is enabled
(ALTER TRIGGER {ENABLE|DISABLE} trigger).
Second we need an internal list of triggers, that are
currently DEFINED AS DEFERRED. Either because they default to
it, or the user explicitly asked to deferr it.
Third we need an internal list of triggers that must be
invoked later because at the time an event occured where they
should have been triggered, they appeared in the other list
and their execution is delayed until transaction end or
explicit execution. This list must remember the OID of the
trigger to invoke (to identify the procedure and the
arguments), the relation that caused the trigger and the
CTID's of the OLD and NEW row.
That last list could grow extremely! Think of a trigger
that's executing commands over SPI which in turn activate
deferred triggers. Since the order of trigger execution is
very important for RI, I can't see any chance to
simplify/condense this information. Thus it is 16 bytes at
least per deferred trigger call (2 OID's plus 2 CTID's). I
think one or more temp files would fit best for this.
A last tricky point is if one of a bunch of deferred triggers
is explicitly called for execution. At this time, the entries
for it in the temp file(s) must get processed and marked
executed (maybe by overwriting the triggers OID with the
invalid OID) while other trigger events still have to get
recorded.
Needless to say that reading thousands of those entries just
to find a few isn't good on performance. But better have this
special case slow that dealing with hundreds of temp files or
other overhead slowing down the usual case where ALL deferred
triggers get called at transaction end.
Trigger invocation is simple now - fetch the OLD and NEW rows
by CTID and execute the trigger as done by the trigger
manager. Oh - well - vacuum shouldn't touch relations where
deferred triggers are outstanding. Might require some
special lock entry - Vadim?
Did I miss something?
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck@debis.com (Jan Wieck) #
Jan Wieck wrote: > > Third we need an internal list of triggers that must be > invoked later because at the time an event occured where they > should have been triggered, they appeared in the other list > and their execution is delayed until transaction end or > explicit execution. This list must remember the OID of the > trigger to invoke (to identify the procedure and the > arguments), the relation that caused the trigger and the > CTID's of the OLD and NEW row. > > That last list could grow extremely! Think of a trigger > that's executing commands over SPI which in turn activate > deferred triggers. Since the order of trigger execution is > very important for RI, I can't see any chance to > simplify/condense this information. Thus it is 16 bytes at > least per deferred trigger call (2 OID's plus 2 CTID's). I > think one or more temp files would fit best for this. > > A last tricky point is if one of a bunch of deferred triggers > is explicitly called for execution. At this time, the entries > for it in the temp file(s) must get processed and marked > executed (maybe by overwriting the triggers OID with the > invalid OID) while other trigger events still have to get > recorded. I believe that things are much simpler. For each deferable constraint (trigger) we have to remember the LastCommandIdProccessedByConstraint. When the mode of a constraint changes from defered to immediate (SET CONSTRAINT MODE), modified tuple will be fetched from WAL from down to up until tuple modified by LastCommandIdProccessedByConstraint is fetched and this is show stopper. Now we remember CommandId of SET CONSTRAINT MODE as new LastCommandIdProccessedByConstraint. When LastCommandIdProccessedByConstraint is changed by SET CONSTRAINT MODE DEFERRED we remeber this in flag to update LastCommandIdProccessedByConstraint later with higher CommandId of first modification of triggered table (to reduce amount of data to read from WAL). ? Vadim
Vadim wrote:
> I believe that things are much simpler.
> For each deferable constraint (trigger) we have to remember
> the LastCommandIdProccessedByConstraint. When the mode of
> a constraint changes from defered to immediate (SET CONSTRAINT MODE),
> modified tuple will be fetched from WAL from down to up until
> tuple modified by LastCommandIdProccessedByConstraint is fetched
> and this is show stopper. Now we remember CommandId of
> SET CONSTRAINT MODE as new LastCommandIdProccessedByConstraint.
> When LastCommandIdProccessedByConstraint is changed by
> SET CONSTRAINT MODE DEFERRED we remeber this in flag to
> update LastCommandIdProccessedByConstraint later with higher
> CommandId of first modification of triggered table (to reduce
> amount of data to read from WAL).
Hmmm,
I'm not sure what side effects it could have if the triggers
at the time of
SET CONSTRAINTS c1, c2 IMMEDIATE
arent fired in the same order they have been recorded - must
think about that for a while. In that case I must be able to
scan WAL from one command ID until another regardless of the
resultrelation. Is that possible?
Another issue is this: isn't it possible to run a database
(or maybe an entire installation) without WAL? Does it make
the code better maintainable to have WAL and RI coupled that
strong?
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck@debis.com (Jan Wieck) #
Jan Wieck wrote:
>
> I'm not sure what side effects it could have if the triggers
> at the time of
>
> SET CONSTRAINTS c1, c2 IMMEDIATE
>
> arent fired in the same order they have been recorded - must
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Did you mean - in the same order as tables were modified?
> think about that for a while. In that case I must be able to
> scan WAL from one command ID until another regardless of the
> resultrelation. Is that possible?
WAL records are in the same order as tuples (regardless of
result relation) was modified. Example: UPDATE of T1
fires (immediate) after row trigger inserting tuple
into T2. WAL records:
--> up
{old_T1_tuple version ID, new_T1_tuple version ID and values}
{new_T2_tuple ID and values}
...
T1 update record
T2 insert record
...
--> down
But records will be fetched from WAL in reverse order, from
down to up.
Does it matter?
Order of modifications made by UPDATE/DELETE is undefined.
Though, order has some sence for INSERT ... SELECT ORDER BY -:)
Nevertheless, I don't see in standard anything about order
of constraint checks.
BTW, I found what standard means by "immediate":
--- The checking of a constraint depends on its constraint mode within the current SQL-transaction. If the constraint
modeis immediate,
| then the constraint is effectively checked at the end of each
^^^^^^^^^^^^^^^^^^
| ___________________________________________________________________
| ANSI Only-SQL3
| ___________________________________________________________________
| SQL-statement S, unless S is executed because it is a <triggered ^^^^^^^^^^^^^^^
| SQL statement>, in which case, the constraint is effectively
| checked at the end of the SQL-statement that is the root cause
| of S.
---
And now about triggers (regardless of ROW or STATEMENT level!):
--- 4.22.2 Execution of triggered actions
The execution of triggered actions depends on the cursor mode of the current SQL-transaction. If the cursor mode
isset to cascade off, then the execution of the <triggered SQL statement>s is effec- tively deferred until enacted
implicitlybe execution of a <commit statement> or a <close statement>. Otherwise, the <triggered SQL statement>s are
effectivelyexecuted either before or after the ^^^^^^^^^^^^^^^^^^^
executionof each SQL-statement, as determined by the specified ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ <trigger action
time>.
---
Hm.
> Another issue is this: isn't it possible to run a database
> (or maybe an entire installation) without WAL? Does it make
Do you worry about disk space? -:)
With archive mode off only log segments (currently, 64M each)
required by active transactions (which made some changes)
will present on disk.
> the code better maintainable to have WAL and RI coupled that
> strong?
This doesn't add any complexity to WAL manager.
Vadim
Vadim wrote:
> But records will be fetched from WAL in reverse order, from
> down to up.
>
> Does it matter?
Might require to teach the WAL-manager to do it top-down too.
And even then it might be better on performance to scan my
constraint-log for events to the same tuple. It has records
of a fixed, very small size and fetching tuples by CTID from
the heap (direct block access) is required anyway because for
delayed trigger invocation I neen OLD values too - and that's
not in WAL if I read it right.
But as I said I'd like to leave that coupling for later.
> BTW, I found what standard means by "immediate":
> ---
> The checking of a constraint depends on its constraint mode within
> the current SQL-transaction. If the constraint mode is immediate,
> | then the constraint is effectively checked at the end of each
> ^^^^^^^^^^^^^^^^^^
> | ___________________________________________________________________
> | ANSI Only-SQL3
> | ___________________________________________________________________
> | SQL-statement S, unless S is executed because it is a <triggered
> ^^^^^^^^^^^^^^^
> | SQL statement>, in which case, the constraint is effectively
> | checked at the end of the SQL-statement that is the root cause
> | of S.
> ---
Ah - so ALL constraint-triggers must be AFTER <event> and
deferred at least until the end of the USER-query.
>
> And now about triggers (regardless of ROW or STATEMENT level!):
> ---
> 4.22.2 Execution of triggered actions
>
> The execution of triggered actions depends on the cursor mode of
> the current SQL-transaction. If the cursor mode is set to cascade
> off, then the execution of the <triggered SQL statement>s is effec-
> tively deferred until enacted implicitly be execution of a <commit
> statement> or a <close statement>. Otherwise, the <triggered SQL
> statement>s are effectively executed either before or after the
> ^^^^^^^^^^^^^^^^^^^
> execution of each SQL-statement, as determined by the specified
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> <trigger action time>.
> ---
We do not have FOR UPDATE cursors. So (even if to be kept in
mind) there is no CURSOR mode to care for right now.
Changing BEFORE triggers to behave exactly like that would
require to do the execution of the plan twice, one time to
fire triggers, another time to perform the action itself. I
don't think that the perfomance cost is worth this little
amount of accuracy. Such a little difference should be
mentioned in the product notes and period.
AFTER triggers could simply be treated half like IMMEDIATE
constraints - deferred until the end of a single statement
(not user-query). So there are four times where the deferred
trigger queue is run (maybe partially). At the end of a
statement, end of a user-query, at a syncpoint (not sure if
we have them up to now) and end of transaction.
Things are getting much clearer - Tnx.
>
> Do you worry about disk space? -:)
> With archive mode off only log segments (currently, 64M each)
> required by active transactions (which made some changes)
> will present on disk.
Never - my motto is "don't force it - use a bigger hammer".
But the above seems to be exactly like the Oracle behaviour
where online-redolog's aren't affected by archive mode.
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck@debis.com (Jan Wieck) #