Обсуждение: Missing import in 035_standby_logical_decoding.pl

Поиск
Список
Период
Сортировка

Missing import in 035_standby_logical_decoding.pl

От
Melanie Plageman
Дата:
Hi,

035_standby_logical_decoding.pl has this code

# wait for postgres to terminate
foreach my $i (0 .. 10 * $PostgreSQL::Test::Utils::timeout_default)
{
    last if !-f $node_standby->data_dir . '/postmaster.pid';
    usleep(100_000);
}

but it does not import usleep as other tests do (like this)
use Time::HiRes qw(usleep);

I think it simply hasn't been called because postgres exits and
terminates the loop before it has had a chance to be called. If you
reorder the sleep and the loop termination condition, it fails
immediately because it can't find usleep.

On an unrelated note, I also noticed that hot_standby_feedback is not
turned on during the test until we are explicitly checking for
recovery conflicts. When it is enabled, the comment says "Turn
hot_standby_feedback back on" -- but it was never on to begin with. I
suspect this doesn't produce test instability because autovacuum is
turned off for the whole test. But, I was wondering if the test
authors meant for hot_standby_feedback to be enabled during the
initial phases of the test (until it is explicitly disabled for
testing purposes).

- Melanie



Re: Missing import in 035_standby_logical_decoding.pl

От
Melanie Plageman
Дата:
On Fri, Aug 1, 2025 at 3:04 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> 035_standby_logical_decoding.pl has this code

Another issue I see is that there are two incorrect tests for a log message

# message should not be issued
ok( !$node_standby->log_contains(
        "invalidating obsolete slot \"no_conflict_inactiveslot\"", $logstart),
    'inactiveslot slot invalidation is not logged with vacuum on conflict_test'
);

ok( !$node_standby->log_contains(
        "invalidating obsolete slot \"no_conflict_activeslot\"", $logstart),
    'activeslot slot invalidation is not logged with vacuum on conflict_test'
);

This will never fail because the message string is actually
"invalidating obsolete replication slot"

It is correctly specified elsewhere in the same test.
so, that should be fixed too.

- Melanie



Re: Missing import in 035_standby_logical_decoding.pl

От
Masahiko Sawada
Дата:
On Fri, Aug 1, 2025 at 12:04 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Hi,
>
> 035_standby_logical_decoding.pl has this code
>
> # wait for postgres to terminate
> foreach my $i (0 .. 10 * $PostgreSQL::Test::Utils::timeout_default)
> {
>     last if !-f $node_standby->data_dir . '/postmaster.pid';
>     usleep(100_000);
> }
>
> but it does not import usleep as other tests do (like this)
> use Time::HiRes qw(usleep);
>
> I think it simply hasn't been called because postgres exits and
> terminates the loop before it has had a chance to be called. If you
> reorder the sleep and the loop termination condition, it fails
> immediately because it can't find usleep.

Agreed, usleep should be imported.

>
> On an unrelated note, I also noticed that hot_standby_feedback is not
> turned on during the test until we are explicitly checking for
> recovery conflicts. When it is enabled, the comment says "Turn
> hot_standby_feedback back on" -- but it was never on to begin with. I
> suspect this doesn't produce test instability because autovacuum is
> turned off for the whole test. But, I was wondering if the test
> authors meant for hot_standby_feedback to be enabled during the
> initial phases of the test (until it is explicitly disabled for
> testing purposes).

While I agree that the first "Turn hot_standby_feedback back on"
message is confusing for the reason you argued, I have no idea whether
we should enable hot_standby_feedback also during the initial phases
of the test. It could be a typo? I'll wait for comments from the
authors.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Missing import in 035_standby_logical_decoding.pl

От
Masahiko Sawada
Дата:
On Fri, Aug 1, 2025 at 1:58 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Fri, Aug 1, 2025 at 3:04 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > 035_standby_logical_decoding.pl has this code
>
> Another issue I see is that there are two incorrect tests for a log message
>
> # message should not be issued
> ok( !$node_standby->log_contains(
>         "invalidating obsolete slot \"no_conflict_inactiveslot\"", $logstart),
>     'inactiveslot slot invalidation is not logged with vacuum on conflict_test'
> );
>
> ok( !$node_standby->log_contains(
>         "invalidating obsolete slot \"no_conflict_activeslot\"", $logstart),
>     'activeslot slot invalidation is not logged with vacuum on conflict_test'
> );
>
> This will never fail because the message string is actually
> "invalidating obsolete replication slot"
>
> It is correctly specified elsewhere in the same test.
> so, that should be fixed too.

+1

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Missing import in 035_standby_logical_decoding.pl

От
Michael Paquier
Дата:
On Fri, Aug 01, 2025 at 03:22:06PM -0700, Masahiko Sawada wrote:
> On Fri, Aug 1, 2025 at 1:58 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>> Another issue I see is that there are two incorrect tests for a log message
>>
>> # message should not be issued
>> ok( !$node_standby->log_contains(
>>         "invalidating obsolete slot \"no_conflict_inactiveslot\"", $logstart),
>>     'inactiveslot slot invalidation is not logged with vacuum on conflict_test'
>> );
>>
>> ok( !$node_standby->log_contains(
>>         "invalidating obsolete slot \"no_conflict_activeslot\"", $logstart),
>>     'activeslot slot invalidation is not logged with vacuum on conflict_test'
>> );
>>
>> This will never fail because the message string is actually
>> "invalidating obsolete replication slot"
>>
>> It is correctly specified elsewhere in the same test.
>> so, that should be fixed too.
>
> +1

Not the author of these two ones, but am I the only one who does not
see the point of these two tests?  These check that something does not
happen, but log_contains() is useful to make sure that an event
*actually happens*.  Wouldn't it be simpler to remove these checks or
switch them to use/add a log message that's triggered with a state
that we'd expect to happen in reality?
--
Michael

Вложения

Re: Missing import in 035_standby_logical_decoding.pl

От
Bertrand Drouvot
Дата:
Hi,

On Fri, Aug 01, 2025 at 03:04:20PM -0400, Melanie Plageman wrote:
> Hi,
> 
> 035_standby_logical_decoding.pl has this code
> 
> # wait for postgres to terminate
> foreach my $i (0 .. 10 * $PostgreSQL::Test::Utils::timeout_default)
> {
>     last if !-f $node_standby->data_dir . '/postmaster.pid';
>     usleep(100_000);
> }
> 
> but it does not import usleep as other tests do (like this)
> use Time::HiRes qw(usleep);
> 
> I think it simply hasn't been called because postgres exits and
> terminates the loop before it has had a chance to be called. If you
> reorder the sleep and the loop termination condition, it fails
> immediately because it can't find usleep.

Thanks for the report! I think it's an oversight in commit 48796a98d5ae. Fixed
in the attached.

> On an unrelated note, I also noticed that hot_standby_feedback is not
> turned on during the test until we are explicitly checking for
> recovery conflicts.

Yeah, it's expected.

> When it is enabled, the comment says "Turn
> hot_standby_feedback back on" -- but it was never on to begin with. I
> suspect this doesn't produce test instability because autovacuum is
> turned off for the whole test. But, I was wondering if the test
> authors meant for hot_standby_feedback to be enabled during the
> initial phases of the test (until it is explicitly disabled for
> testing purposes).

It's typo/bad wording fixed in the attached. hot_standby_feedback does not need
to be enabled during the initial phases of the test because there is no catalog
changes that could produce a conflict.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Вложения

Re: Missing import in 035_standby_logical_decoding.pl

От
Bertrand Drouvot
Дата:
Hi,

On Fri, Aug 01, 2025 at 04:57:50PM -0400, Melanie Plageman wrote:
> On Fri, Aug 1, 2025 at 3:04 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > 035_standby_logical_decoding.pl has this code
> 
> Another issue I see is that there are two incorrect tests for a log message
> 
> # message should not be issued
> ok( !$node_standby->log_contains(
>         "invalidating obsolete slot \"no_conflict_inactiveslot\"", $logstart),
>     'inactiveslot slot invalidation is not logged with vacuum on conflict_test'
> );
> 
> ok( !$node_standby->log_contains(
>         "invalidating obsolete slot \"no_conflict_activeslot\"", $logstart),
>     'activeslot slot invalidation is not logged with vacuum on conflict_test'
> );
> 
> This will never fail because the message string is actually
> "invalidating obsolete replication slot"
> 
> It is correctly specified elsewhere in the same test.
> so, that should be fixed too.

Thanks for the report! Indeed, fixed in 0001 just shared up-thread.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Missing import in 035_standby_logical_decoding.pl

От
Bertrand Drouvot
Дата:
Hi,

On Sat, Aug 02, 2025 at 08:09:14AM +0900, Michael Paquier wrote:
> On Fri, Aug 01, 2025 at 03:22:06PM -0700, Masahiko Sawada wrote:
> > On Fri, Aug 1, 2025 at 1:58 PM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> >> Another issue I see is that there are two incorrect tests for a log message
> >>
> >> # message should not be issued
> >> ok( !$node_standby->log_contains(
> >>         "invalidating obsolete slot \"no_conflict_inactiveslot\"", $logstart),
> >>     'inactiveslot slot invalidation is not logged with vacuum on conflict_test'
> >> );
> >>
> >> ok( !$node_standby->log_contains(
> >>         "invalidating obsolete slot \"no_conflict_activeslot\"", $logstart),
> >>     'activeslot slot invalidation is not logged with vacuum on conflict_test'
> >> );
> >>
> >> This will never fail because the message string is actually
> >> "invalidating obsolete replication slot"
> >>
> >> It is correctly specified elsewhere in the same test.
> >> so, that should be fixed too.
> > 
> > +1
> 
> Not the author of these two ones, but am I the only one who does not
> see the point of these two tests?  These check that something does not
> happen,

Yeah, we want to check that no conflicts occur for non catalog table.

> but log_contains() is useful to make sure that an event
> *actually happens*.

Not sure why it could not be useful to check that an event does not happen (
019_replslot_limit.pl also uses it that way).

> Wouldn't it be simpler to remove these checks

I'm afraid that could lead to false positives or to the test taking longer as
waiting for a timeout to occur.

> or
> switch them to use/add a log message that's triggered with a state
> that we'd expect to happen in reality?

If no conflicts occur, we do not emit messages in the log, then having nothing
to check. I think that's fine that way.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Missing import in 035_standby_logical_decoding.pl

От
Melanie Plageman
Дата:
On Mon, Aug 4, 2025 at 3:52 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Sat, Aug 02, 2025 at 08:09:14AM +0900, Michael Paquier wrote:
>
> > Not the author of these two ones, but am I the only one who does not
> > see the point of these two tests?  These check that something does not
> > happen,
> >
> > Wouldn't it be simpler to remove these checks
>
> I'm afraid that could lead to false positives or to the test taking longer as
> waiting for a timeout to occur.

This rationale is interesting. You are basically saying that you are
counting on one of the two log_contains() tests failing and never
actually reaching the pg_stat_database_conflicts query test in a
failure scenario.

When I was writing the recovery conflict (031_recovery_conflict.pl)
tests, I added both log message and stats checks because we didn't
have coverage of pg_stat_database_conflicts in our other stats tests
(that would have been hard to do in regress). The stats test was
expected to fail when something was wrong with the recovery conflict
stats updates -- not so much when something was wrong with the
recovery conflict handling logic itself. But that argument makes less
sense here in 035_standby_logical_decoding for the happy path test
case we are talking about. Though, I suppose someone could write code
that _incorrectly_ updated confl_active_logicalslot stats and this
would catch that.

Anyway, there's probably no reason to remove it (the
pg_stat_database_conflicts check for the happy path case) -- we could
come up with other, more impactful ways to get the test runtime down,
I imagine.

This test is not unprecedented in testing the absence of logfile
messages -- but it has the same problem those tests have in that they
don't fail nicely when the message changes or someone typo'd the
matching message string in the test.

- Melanie



Re: Missing import in 035_standby_logical_decoding.pl

От
Melanie Plageman
Дата:
On Mon, Aug 4, 2025 at 3:49 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> On Fri, Aug 01, 2025 at 03:04:20PM -0400, Melanie Plageman wrote:
>
> > but it does not import usleep as other tests do (like this)
> > use Time::HiRes qw(usleep);
> >
> > I think it simply hasn't been called because postgres exits and
> > terminates the loop before it has had a chance to be called. If you
> > reorder the sleep and the loop termination condition, it fails
> > immediately because it can't find usleep.
>
> Thanks for the report! I think it's an oversight in commit 48796a98d5ae. Fixed
> in the attached.

I've committed and backported the fix.

> > On an unrelated note, I also noticed that hot_standby_feedback is not
> > turned on during the test until we are explicitly checking for
> > recovery conflicts.
>
> Yeah, it's expected.

I see. I had to change it to be on for more of the test scenarios in
my patchset which sets the VM on-access [1] (v6-0017), but I suppose
it doesn't make sense to make this change on master.

- Melanie

[1] https://www.postgresql.org/message-id/CAAKRu_bGCgUuhmy1Mzkw3yCmbGcjNQAjV%3DOmjuW6hz90PuXKWA%40mail.gmail.com