Обсуждение: Re: Issue with markers in isolation tester? Or not?

Поиск
Список
Период
Сортировка

Re: Issue with markers in isolation tester? Or not?

От
Noah Misch
Дата:
On Mon, Jan 13, 2025 at 02:02:00AM +0100, Michail Nikolaev wrote:
> While stabilizing tests for [0] I have noticed unclear (and confusing in my
> opinion) behavior of markers in the isolation tester.
> 
> I have attached a test with reproducer.
> 
> There are two permutations in the test:
> 
> permutation
>       after(before)
>       before
>       detach1
>       wakeup1
>       detach2
>       wakeup2
> 
> In that case, I see expected results:
>       step before: <... completed>
>       step after: <... completed>
> 
> But changing the order of steps slightly:
> 
> permutation
>        after(before)
>        wakeup1  <------- wakeup moved here
>        before
>        detach1
>        detach2
>        wakeup2
> 
> makes "after" to be completed before "before". Does that make sense?

Yes.  I don't see a good reason for isolationtester to disregard that
"(before)" marker, so I think you've found a bug in isolationtester.  How
might we fix it?



Re: Issue with markers in isolation tester? Or not?

От
Noah Misch
Дата:
On Tue, Jan 14, 2025 at 12:16:22PM +0100, Michail Nikolaev wrote:
> --- a/src/test/isolation/README
> +++ b/src/test/isolation/README
> @@ -183,9 +183,9 @@ A marker consisting solely of a step name indicates that this step may
>  not be reported as completing until that other step has completed.
>  This allows stabilizing cases where two queries might be seen to complete
>  in either order.  Note that this step can be *launched* before the other
> -step has completed.  (If the other step is used more than once in the
> -current permutation, this step cannot complete while any of those
> -instances is active.)
> +step has completed or launched.  (If the other step is used more than once
> +in the current permutation, this step cannot complete while any of those
> +instances are not yet complete.)

I misunderstood, and I was mistaken to see this as a bug fix.  The
isolationtester is acting per its definition, and this would be a definition
change.  Do others have opinions on the merits of today's definition vs. the
proposed definition?

I'd need to review the motivating test to form my own opinion on whether the
new definition makes it easier to write tests.  I'm optimistic that it does
make things easier, because I think one can get the old behavior by defining
two steps with the same commands.  Thanks for the patch.  You could bundle
this in the thread that wants this to stabilize that thread's CI results.

As a side note on the broader topic of the difficulty of stabilizing isolation
test race conditions, I've thought it would be helpful to have a "strict
blocking condition mode" that complains at the end of the test run if the spec
had completion order ambiguities.  I think these are the rules sufficient to
have a fully-specified completion order:

1. Every step that waits needs a blocking condition.
2. No two steps A & B name the same step C as a blocking condition.  Instead,
   make B block on A.
3. If a step appears in another step's blocking condition and is not the last
   step listed in the permutation, the step listed after the unblocker needs a
   blocking condition on a formerly-blocked step.

> +session s1
> +step after    { SELECT injection_points_run('injection-points-wait-1'); }

FYI, I prefer the convention of ending each step name with the session number,
i.e. s/after/after1/ here.  The first isolation specs used that style.  I
mildly prefer it over the newer alternative styles, and I strongly prefer
styles that convey the session identity over styles that don't.  No need to
send a new patch version, though.



Re: Issue with markers in isolation tester? Or not?

От
Michail Nikolaev
Дата:
Hello, Noah!

> I misunderstood, and I was mistaken to see this as a bug fix.  The
>  isolationtester is acting per its definition, and this would be a definition
>  change.

Yes, you are right, but I think it is better to clarify it somehow because from my point of view that definition feels like logic in patched version.
But maybe my non-native English prevents me from understanding it correctly.

In my opinion, it is better to add some clarification like "this marker affects only other steps which were launched before this step or during its execution".

> I'd need to review the motivating test to form my own opinion on whether the
> new definition makes it easier to write tests.
> You could bundle this in the thread that wants this to stabilize that thread's CI results.

I'll think about it, but currently it is stabilized using "notices" and, you know, more changes - harder to merge :)

Also, I'll try one more time to stabilize it without "notices" - maybe I'll get some insight.
Anyway - I'll ping you in that thread [0] (you're already in recipients) or bring it here.

Best regards,
Mikhail.

Re: Issue with markers in isolation tester? Or not?

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> On Tue, Jan 14, 2025 at 11:48:28AM -0800, Noah Misch wrote:
>> I misunderstood, and I was mistaken to see this as a bug fix.  The
>> isolationtester is acting per its definition, and this would be a definition
>> change.  Do others have opinions on the merits of today's definition vs. the
>> proposed definition?

> I agree that Michail's case with the handling of the markers is kind
> of strange in the case he has reported.  Anyway, do you think that it
> would be a good idea to change that knowing for how long we've relied
> on isolationtester to do things the way they are?

We've changed it before, so I'm not totally averse to considering
more change.  But I'm not happy about the lack of any analysis in
this thread of the impact on existing test cases.  In particular,
it seems to me that this proposal amounts to reducing the number
of states that can be reached during an isolation test.  So it
seems to me that means strictly less coverage than we had before.
I'd prefer to start by asking why the proposed new test can't be
made stable without such a restriction.

(I don't mean to minimize the question of whether we need better
documentation about this behavior.)

            regards, tom lane



Re: Issue with markers in isolation tester? Or not?

От
Noah Misch
Дата:
On Wed, Jan 15, 2025 at 02:41:31PM +0900, Michael Paquier wrote:
> On Tue, Jan 14, 2025 at 11:48:28AM -0800, Noah Misch wrote:
> > I misunderstood, and I was mistaken to see this as a bug fix.  The
> > isolationtester is acting per its definition, and this would be a definition
> > change.  Do others have opinions on the merits of today's definition vs. the
> > proposed definition?
> 
> I agree that Michail's case with the handling of the markers is kind
> of strange in the case he has reported.  Anyway, do you think that it
> would be a good idea to change that knowing for how long we've relied
> on isolationtester to do things the way they are?

I think that would depend on what proposed tests become easier.  Since the
original poster used the "notices" marker type to solve the motivating use
case, let's shelve this until there's new demand.

If we ever unshelve this and find much in today's specs that would malfunction
under the new definition, we might offer both definitions.  A new marker
syntax would reach the new definition, and old specs would work as before.

> If we do it only on
> HEAD, it would make the back-patching of newly-added tests for bug
> fixes potentially more complex to deal with.  If applying the new
> definition across all the stable branches, this could impact something
> outside code :/

When I'm exposed to non-core tests, those tests usually target many major
versions simultaneously.  Having the spec meaning vary across major versions
would hurt them more than it would help.  Hence, back-patch if we do adopt the
change.  As above, the impact on existing specs could inform whether we add
syntax vs. redefine existing syntax.



Re: Issue with markers in isolation tester? Or not?

От
Michail Nikolaev
Дата:
Hello, everyone!

I was able to stabilize (I hope so) tests on [0] without any changes to isolationtester and without "notices N".

So, I agree, it is better to put it on a shelf now. But a few words in documentation may be a good idea.

Best regards,
Mikhail.

Re: Issue with markers in isolation tester? Or not?

От
Michail Nikolaev
Дата:
Hello, Michael!

> Could you summarize here what you have done to achieve test
>  stabilization in your new patch set posted at [1] without using the
>  proposal of this thread?

Mostly idea is next:

Let's imagine we have two steps - step_before and step_after which may end in either order.
Then instead of such step/markers structure:

        step_before(step_after)
        stepN
        stepN+1
        step_after

use the next:

        step_before
        stepN
        stepN+1
        step_after(step_before)


In the first case, there are two possible results:
1) step_before is finished before step_after - reported as step_before, step_after
2) step_after is launched before step_before ends - reported as step_after, step_before

But in the case second variant:
1) step_before is finished before step_after - reported as step_before, step_after
2) step_after is launched before step_before ends - reported as step_before, step_after

So, the second option provides the same result regardless of order of finishing of step_before and step_after, which is the thing I want to achieve here.

Best regards,
Mikhail.

Re: Issue with markers in isolation tester? Or not?

От
Noah Misch
Дата:
On Sun, Jan 19, 2025 at 07:26:58PM +0100, Michail Nikolaev wrote:
> Mostly idea is next:
> 
> Let's imagine we have two steps - step_before and step_after which may end
> in either order.
> Then instead of such step/markers structure:
> 
>         step_before(step_after)
>         stepN
>         stepN+1
>         step_after
> 
> use the next:
> 
>         step_before
>         stepN
>         stepN+1
>         step_after(step_before)
> 
> In the first case, there are two possible results:
> 1) step_before is finished before step_after - reported as step_before,
> step_after
> 2) step_after is launched before step_before ends - reported as step_after,
> step_before
> 
> But in the case second variant:
> 1) step_before is finished before step_after - reported as step_before,
> step_after
> 2) step_after is launched before step_before ends - reported as
> step_before, step_after
> 
> So, the second option provides the same result regardless of order of
> finishing of step_before and step_after, which is the thing I want to
> achieve here.

Stepping back, any variation in the total ordering of step-start and
step-complete events necessarily changes the expected output.  Hence, we want
to freeze that ordering.  We could do that directly, by having isolationtester
read a format that lists start and complete events.  We could also have the
isolationtester write the same format, so you could draft a permutation
without blocking conditions and just save the order that isolationtester
discovers by experiment.  I'd find that less mentally taxing than following
rules to reach a frozen ordering via blocking conditions.

As an example of one concrete way to achieve that, have isolationtester write
a results/SPECNAME.wait file that contains syntax like "permutation-event
waiting:S1 S2 complete:S1 S3".  You'd copy that file to
expected/SPECNAME.wait, which isolationtester would read at startup.  This
wouldn't replace "notices" blocking conditions for certain complex tests, and
tests with multiple expected outputs might not want it.  It would cover the
rest.  Any unprefixed entry (e.g. the plain "S2") would get the simplification
of skipping the check for whether the step has entered a wait.

How well would that meet your test's needs?



Re: Issue with markers in isolation tester? Or not?

От
Michail Nikolaev
Дата:
Hello, Noah!

> Stepping back, any variation in the total ordering of step-start and
>  step-complete events necessarily changes the expected output.  Hence, we want
>  to freeze that ordering.  We could do that directly, by having isolationtester
>  read a format that lists start and complete events.  We could also have the
>  isolationtester write the same format, so you could draft a permutation
>  without blocking conditions and just save the order that isolationtester
>  discovers by experiment.  I'd find that less mentally taxing than following
>  rules to reach a frozen ordering via blocking conditions.

> How well would that meet your test's needs?

It feels a bit strange to me to force isolationtest to output something in a specific order, especially since we don't actually care about the order in the test.

But why do we need that output at all? I think a simpler solution would be to mark a step in a way that prevents isolationtester from outputting detailed information about the completion of that step—just an overview of the steps launched and completed at the end of the permutation.

In my tests, I evaluate how [CREATE|RE] INDEX CONCURRENTLY interferes with UPSERTs. I’m not testing the exact ordering; I simply set the backends into a specific state (with different index information) and let them proceed one by one. I don’t care which one finishes first or last—I’m only interested in identifying errors in the output.

Best regards,
Mikhail.

Re: Issue with markers in isolation tester? Or not?

От
Noah Misch
Дата:
On Sun, Jan 19, 2025 at 09:55:32PM +0100, Michail Nikolaev wrote:
> > Stepping back, any variation in the total ordering of step-start and
> >  step-complete events necessarily changes the expected output.  Hence, we
> want
> >  to freeze that ordering.  We could do that directly, by having
> isolationtester
> >  read a format that lists start and complete events.  We could also have
> the
> >  isolationtester write the same format, so you could draft a permutation
> >  without blocking conditions and just save the order that isolationtester
> >  discovers by experiment.  I'd find that less mentally taxing than
> following
> >  rules to reach a frozen ordering via blocking conditions.
> 
> > How well would that meet your test's needs?
> 
> It feels a bit strange to me to force isolationtest to output something in
> a specific order, especially since we don't actually care about the order
> in the test.
> 
> But why do we need that output at all?

It's good to think of ways we can output differently for more stability.
However, ordered output helps the human studying what happened.  Today's
output is well-optimized for that.  Today's output resembles what a person
would see if running a collection of psql sessions.  For example, if I'm
reviewing a patch, reviewing the expected spec output feels natural.

> I think a simpler solution would be
> to mark a step in a way that prevents isolationtester from outputting
> detailed information about the completion of that step—just an overview of
> the steps launched and completed at the end of the permutation.

My main concern about that is the difficulty of marking steps:

- Best: no markings
- Medium: easy rules for when you need a marking
- Worst: run on FreeBSD to determine marking requirements by experiment

I've not been able to identify an easy rule that avoids "run on FreeBSD" with
today's framework.  (I have a non-easy list of four rules, but I'm not
confident even that list is complete.)  I can't think of an easy rule for
"just an overview" markings, either.  Even if there is one, not needing
markings is better, all else being equal.  What other advantages may
compensate for this concern?

There's another, minor concern.  The more "just an overview" markings in a
spec, the more reading its expected output wouldn't feel like reading psql
transcripts.  I'd likely end up logging both today's ordered format and the
new format.  We'd diff the new format only, and we'd write the old format for
human debugging.

> In my tests, I evaluate how [CREATE|RE] INDEX CONCURRENTLY interferes with
> UPSERTs. I’m not testing the exact ordering; I simply set the backends into
> a specific state (with different index information) and let them proceed
> one by one. I don’t care which one finishes first or last—I’m only
> interested in identifying errors in the output.

That makes sense.

Thanks,
nm