Обсуждение: Making background psql nicer to use in tap tests
Hi, Plenty tap tests require a background psql. But they're pretty annoying to use. I think the biggest improvement would be an easy way to run a single query and get the result of that query. Manually having to pump_until() is awkward and often leads to hangs/timeouts, instead of test failures, because one needs to specify a match pattern to pump_until(), which on mismatch leads to trying to keep pumping forever. It's annoyingly hard to wait for the result of a query in a generic way with background_psql(), and more generally for psql. background_psql() uses -XAtq, which means that we'll not get "status" output (like "BEGIN" or "(1 row)"), and that queries not returning anything are completely invisible. A second annoyance is that issuing a query requires a trailing newline, otherwise psql won't process it. The best way I can see is to have a helper that issues the query, followed by a trailing newline, an \echo with a recognizable separator, and then uses pump_until() to wait for that separator. Another area worthy of improvement is that background_psql() requires passing in many variables externally - without a recognizable benefit afaict. What's the point in 'stdin', 'stdout', 'timer' being passed in? stdin/stdout need to point to empty strings, so we know what's needed - in fact we'll even reset them if they're passed in. The timer is always going to be PostgreSQL::Test::Utils::timeout_default, so again, what's the point? I think it'd be far more usable if we made background_psql() return a hash with the relevant variables. The 031_recovery_conflict.pl test has: my $psql_timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default); my %psql_standby = ('stdin' => '', 'stdout' => ''); $psql_standby{run} = $node_standby->background_psql($test_db, \$psql_standby{stdin}, \$psql_standby{stdout}, $psql_timeout); $psql_standby{stdout} = ''; How about just returning a reference to a hash like that? Except that I'd also make stderr available, which one can't currently access. The $psql_standby{stdout} = ''; is needed because background_psql() leaves a banner in the output, which it shouldn't, but we probably should just fix that. Brought to you by: Trying to write a test for vacuum_defer_cleanup_age. - Andres
Andres Freund <andres@anarazel.de> writes: > It's annoyingly hard to wait for the result of a query in a generic way with > background_psql(), and more generally for psql. background_psql() uses -XAtq, > which means that we'll not get "status" output (like "BEGIN" or "(1 row)"), > and that queries not returning anything are completely invisible. Yeah, the empty-query-result problem was giving me fits recently. +1 for wrapping this into something more convenient to use. regards, tom lane
Hi, On 2023-01-30 15:06:46 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > It's annoyingly hard to wait for the result of a query in a generic way with > > background_psql(), and more generally for psql. background_psql() uses -XAtq, > > which means that we'll not get "status" output (like "BEGIN" or "(1 row)"), > > and that queries not returning anything are completely invisible. > > Yeah, the empty-query-result problem was giving me fits recently. > +1 for wrapping this into something more convenient to use. I've hacked some on this. I first tried to just introduce a few helper functions in Cluster.pm, but that ended up being awkward. So I bit the bullet and introduced a new class (in BackgroundPsql.pm), and made background_psql() and interactive_psql() return an instance of it. This is just a rough prototype. Several function names don't seem great, it need POD documentation, etc. The main convenience things it has over the old interface: - $node->background_psql('dbname') is enough - $psql->query(), which returns the query results as a string, is a lot easier to use than having to pump, identify query boundaries via regex etc. - $psql->query_safe(), which dies if any query fails (detected via stderr) - $psql->query_until() is a helper that makes it a bit easier to start queries that won't finish until a later point I don't quite like the new interface yet: - It's somewhat common to want to know if there was a failure, but also get the query result, not sure what the best function signature for that is in perl. - query_until() sounds a bit too much like $node->poll_query_until(). Maybe query_wait_until() is better? OTOH, the other function has poll in the name, so maybe it's ok. - right now there's a bit too much logic in background_psql() / interactive_psql() for my taste Those points aside, I think it already makes the tests a good bit more readable. My WIP vacuum_defer_cleanup_age patch shrunk by half with it. I think with a bit more polish it's easy enough to use that we could avoid a good number of those one-off psql's that we do all over. I didn't really know what this, insrc/test/subscription/t/015_stream.pl, is about: $h->finish; # errors make the next test fail, so ignore them here There's no further test? I'm somewhat surprised it doesn't cause problems in another ->finish later on, where we then afterwards just use $h again. Apparently IPC::Run just automagically restarts psql? Greetings, Andres Freund
Вложения
> On 31 Jan 2023, at 01:00, Andres Freund <andres@anarazel.de> wrote: > I've hacked some on this. I first tried to just introduce a few helper > functions in Cluster.pm, but that ended up being awkward. So I bit the bullet > and introduced a new class (in BackgroundPsql.pm), and made background_psql() > and interactive_psql() return an instance of it. Thanks for working on this! > This is just a rough prototype. Several function names don't seem great, it > need POD documentation, etc. It might be rough around the edges but I don't think it's too far off a state in which in can be committed, given that it's replacing something even rougher. With documentation and some polish I think we can iterate on it in the tree. I've played around a lot with it and it seems fairly robust. > I don't quite like the new interface yet: > - It's somewhat common to want to know if there was a failure, but also get > the query result, not sure what the best function signature for that is in > perl. What if query() returns a list with the return value last? The caller will get the return value when assigning a single var as the return, and can get both in those cases when it's interesting. That would make for reasonably readable code in most places? $ret_val = $h->query("SELECT 1;"); ($query_result, $ret_val) = $h->query("SELECT 1;"); Returning a hash seems like a worse option since it will complicate callsites which only want to know success/failure. > - query_until() sounds a bit too much like $node->poll_query_until(). Maybe > query_wait_until() is better? OTOH, the other function has poll in the name, > so maybe it's ok. query_until isn't great but query_wait_until is IMO worse since the function may well be used for tests which aren't using longrunning waits. It's also very useful for things which aren't queries at all, like psql backslash commands. I don't have any better ideas though, so +1 for sticking with query_until. > - right now there's a bit too much logic in background_psql() / > interactive_psql() for my taste Not sure what you mean, I don't think they're especially heavy on logic? > Those points aside, I think it already makes the tests a good bit more > readable. My WIP vacuum_defer_cleanup_age patch shrunk by half with it. The test for \password in the SCRAM iteration count patch shrunk to 1/3 of the previous coding. > I think with a bit more polish it's easy enough to use that we could avoid a > good number of those one-off psql's that we do all over. Agreed, and ideally implement tests which were left unwritten due to the old API being clunky. + # feed the query to psql's stdin, follwed by \n (so psql processes the s/follwed/followed/ +A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up, +which can modified later. This require a bit of knowledge about the internals which I think we should hide in this new API. How about providing a function for defining the timeout? Re timeouts: one thing I've done repeatedly is to use short timeouts and reset them per query, and that turns pretty ugly fast. I hacked up your patch to provide $h->reset_timer_before_query() which then injects a {timeout}->start before running each query without the caller having to do it. Not sure if I'm alone in doing that but if not I think it makes sense to add. -- Daniel Gustafsson
Hi, On 2023-03-14 21:24:32 +0100, Daniel Gustafsson wrote: > > On 31 Jan 2023, at 01:00, Andres Freund <andres@anarazel.de> wrote: > > > I've hacked some on this. I first tried to just introduce a few helper > > functions in Cluster.pm, but that ended up being awkward. So I bit the bullet > > and introduced a new class (in BackgroundPsql.pm), and made background_psql() > > and interactive_psql() return an instance of it. > > Thanks for working on this! Thanks for helping it move along :) > > This is just a rough prototype. Several function names don't seem great, it > > need POD documentation, etc. > > It might be rough around the edges but I don't think it's too far off a state > in which in can be committed, given that it's replacing something even rougher. > With documentation and some polish I think we can iterate on it in the tree. Cool. > > I don't quite like the new interface yet: > > - It's somewhat common to want to know if there was a failure, but also get > > the query result, not sure what the best function signature for that is in > > perl. > > What if query() returns a list with the return value last? The caller will get > the return value when assigning a single var as the return, and can get both in > those cases when it's interesting. That would make for reasonably readable > code in most places? > $ret_val = $h->query("SELECT 1;"); > ($query_result, $ret_val) = $h->query("SELECT 1;"); I hate perl. > Returning a hash seems like a worse option since it will complicate callsites > which only want to know success/failure. Yea. Perhaps it's worth having a separate function for this? ->query_rc() or such? > > - right now there's a bit too much logic in background_psql() / > > interactive_psql() for my taste > > Not sure what you mean, I don't think they're especially heavy on logic? -EMISSINGWORD on my part. A bit too much duplicated logic. > +A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up, > +which can modified later. > > This require a bit of knowledge about the internals which I think we should > hide in this new API. How about providing a function for defining the timeout? "definining" in the sense of accessing it? Or passing one in? > Re timeouts: one thing I've done repeatedly is to use short timeouts and reset > them per query, and that turns pretty ugly fast. I hacked up your patch to > provide $h->reset_timer_before_query() which then injects a {timeout}->start > before running each query without the caller having to do it. Not sure if I'm > alone in doing that but if not I think it makes sense to add. I don't quite understand the use case, but I don't mind it as a functionality. Greetings, Andres Freund
Hi, On 2023-01-30 15:06:46 -0500, Tom Lane wrote:Andres Freund <andres@anarazel.de> writes:It's annoyingly hard to wait for the result of a query in a generic way with background_psql(), and more generally for psql. background_psql() uses -XAtq, which means that we'll not get "status" output (like "BEGIN" or "(1 row)"), and that queries not returning anything are completely invisible.Yeah, the empty-query-result problem was giving me fits recently. +1 for wrapping this into something more convenient to use.I've hacked some on this. I first tried to just introduce a few helper functions in Cluster.pm, but that ended up being awkward. So I bit the bullet and introduced a new class (in BackgroundPsql.pm), and made background_psql() and interactive_psql() return an instance of it. This is just a rough prototype. Several function names don't seem great, it need POD documentation, etc.
Since this class is only intended to have instances created from Cluster, I would be inclined just to put it at the end of Cluster.pm instead of creating a new file. That makes it clearer that the new package is not standalone. We already have instances of that.
The first param of the constructor is a bit opaque. If it were going to be called from elsewhere I'd want something a bit more obvious, but I guess we can live with it here. An alternative might be multiple_constructors (e.g. new_background, new_interactive) which use a common private routine.
Don't have comments yet on the other things, will continue looking.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
> On 15 Mar 2023, at 02:03, Andres Freund <andres@anarazel.de> wrote: >> Returning a hash seems like a worse option since it will complicate callsites >> which only want to know success/failure. > > Yea. Perhaps it's worth having a separate function for this? ->query_rc() or such? If we are returning a hash then I agree it should be a separate function. Maybe Andrew has input on which is the most Perl way of doing this. >>> - right now there's a bit too much logic in background_psql() / >>> interactive_psql() for my taste >> >> Not sure what you mean, I don't think they're especially heavy on logic? > > -EMISSINGWORD on my part. A bit too much duplicated logic. That makes more sense, and I can kind of agree. I don't think it's too bad but I agree there is room for improvement. >> +A default timeout of $PostgreSQL::Test::Utils::timeout_default is set up, >> +which can modified later. >> >> This require a bit of knowledge about the internals which I think we should >> hide in this new API. How about providing a function for defining the timeout? > > "definining" in the sense of accessing it? Or passing one in? I meant passing one in. >> Re timeouts: one thing I've done repeatedly is to use short timeouts and reset >> them per query, and that turns pretty ugly fast. I hacked up your patch to >> provide $h->reset_timer_before_query() which then injects a {timeout}->start >> before running each query without the caller having to do it. Not sure if I'm >> alone in doing that but if not I think it makes sense to add. > > I don't quite understand the use case, but I don't mind it as a functionality. I've used it a lot when I want to run n command which each should finish quickly or not at all. So one time budget per command rather than having a longer timeout for a set of commands that comprise a test. It can be done already today by calling ->start but it doesn't exactly make the code cleaner. As mentioned off-list I did some small POD additions when reviewing, so I've attached them here in a v2 in the hopes that it might be helpful. I've also included the above POC for restarting the timeout per query to show what I meant. -- Daniel Gustafsson
Вложения
On 15 Mar 2023, at 02:03, Andres Freund <andres@anarazel.de> wrote:Returning a hash seems like a worse option since it will complicate callsites which only want to know success/failure.Yea. Perhaps it's worth having a separate function for this? ->query_rc() or such?If we are returning a hash then I agree it should be a separate function. Maybe Andrew has input on which is the most Perl way of doing this.
I think the perlish way is use the `wantarray` function. Perl knows if you're expecting a scalar return value or a list (which includes a hash).
return wantarray ? $retval : (list or hash);
A few more issues:
A common perl idiom is to start private routine names with an underscore. so I'd rename wait_connect to _wait_connect;
Why is $restart_before_query a package/class level value instead of an instance value? And why can we only ever set it to 1 but not back again? Maybe we don't want to, but it looks odd.
If we are going to keep this as a separate package, then we should put some code in the constructor to prevent it being called from elsewhere than the Cluster package. e.g.
# this constructor should only be called from PostgreSQL::Test::Cluster
my ($package, $file, $line) = caller;
die "Forbidden caller of constructor: package: $package, file: $file:$line"
unless $package eq 'PostgreSQL::Test::Cluster';
This should refer to the full class name:
+=item $node->background_psql($dbname, %params) => BackgroundPsql instance
Still reviewing ...
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
> On 17 Mar 2023, at 14:48, Andrew Dunstan <andrew@dunslane.net> wrote: > On 2023-03-17 Fr 05:48, Daniel Gustafsson wrote: >>> On 15 Mar 2023, at 02:03, Andres Freund <andres@anarazel.de> >>> wrote: >>> >>>> Returning a hash seems like a worse option since it will complicate callsites >>>> which only want to know success/failure. >>>> >>> Yea. Perhaps it's worth having a separate function for this? ->query_rc() or such? >>> >> If we are returning a hash then I agree it should be a separate function. >> Maybe Andrew has input on which is the most Perl way of doing this. > > I think the perlish way is use the `wantarray` function. Perl knows if you're expecting a scalar return value or a list(which includes a hash). > > return wantarray ? $retval : (list or hash); Aha, TIL. That seems like precisely what we want. > A common perl idiom is to start private routine names with an underscore. so I'd rename wait_connect to _wait_connect; There are quite a few routines documented as internal in Cluster.pm which don't start with an underscore. Should we change them as well? I'm happy to prepare a separate patch to address that if we want that. > Why is $restart_before_query a package/class level value instead of an instance value? And why can we only ever set itto 1 but not back again? Maybe we don't want to, but it looks odd. It was mostly a POC to show what I meant with the functionality. I think there should be a way to turn it off (set it to zero) even though I doubt it will be used much. > If we are going to keep this as a separate package, then we should put some code in the constructor to prevent it beingcalled from elsewhere than the Cluster package. e.g. > > # this constructor should only be called from PostgreSQL::Test::Cluster > my ($package, $file, $line) = caller; > > die "Forbidden caller of constructor: package: $package, file: $file:$line" > unless $package eq 'PostgreSQL::Test::Cluster'; I don't have strong feelings about where to place this, but Cluster.pm is already quite long so I see a small upside to keeping it separate to not make that worse. -- Daniel Gustafsson
A common perl idiom is to start private routine names with an underscore. so I'd rename wait_connect to _wait_connect;There are quite a few routines documented as internal in Cluster.pm which don't start with an underscore. Should we change them as well? I'm happy to prepare a separate patch to address that if we want that.
Possibly. There are two concerns. First, make sure that they really are private. Last time I looked I think I noticed at least one thing that was alleged to be private but was called from a TAP script. Second, unless we backpatch it there will be some drift between branches, which can make backpatching things a bit harder. But by all means prep a patch so we can see the scope of the issue.
Why is $restart_before_query a package/class level value instead of an instance value? And why can we only ever set it to 1 but not back again? Maybe we don't want to, but it looks odd.It was mostly a POC to show what I meant with the functionality. I think there should be a way to turn it off (set it to zero) even though I doubt it will be used much.
A common idiom is to have a composite getter/setter method for object properties something like this
sub settingname{my ($self, $arg) = @_;$self->{settingname} = $arg if defined $arg;return $self->{settingname};}
If we are going to keep this as a separate package, then we should put some code in the constructor to prevent it being called from elsewhere than the Cluster package. e.g. # this constructor should only be called from PostgreSQL::Test::Cluster my ($package, $file, $line) = caller; die "Forbidden caller of constructor: package: $package, file: $file:$line" unless $package eq 'PostgreSQL::Test::Cluster';I don't have strong feelings about where to place this, but Cluster.pm is already quite long so I see a small upside to keeping it separate to not make that worse.
Yeah, I can go along with that.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > On 2023-03-17 Fr 10:08, Daniel Gustafsson wrote: >>> Why is $restart_before_query a package/class level value instead of >>> an instance value? And why can we only ever set it to 1 but not back >>> again? Maybe we don't want to, but it looks odd. >> It was mostly a POC to show what I meant with the functionality. I think there >> should be a way to turn it off (set it to zero) even though I doubt it will be >> used much. > > > A common idiom is to have a composite getter/setter method for object > properties something like this > > > sub settingname > { > my ($self, $arg) = @_; > $self->{settingname} = $arg if defined $arg; > return $self->{settingname}; > } Or, if undef is a valid value: sub settingname { my $self = shift; $self->[settingname} = shift if @_; return $self->{settingname}; } - ilmari
Andrew Dunstan <andrew@dunslane.net> writes:On 2023-03-17 Fr 10:08, Daniel Gustafsson wrote:Why is $restart_before_query a package/class level value instead of an instance value? And why can we only ever set it to 1 but not back again? Maybe we don't want to, but it looks odd.It was mostly a POC to show what I meant with the functionality. I think there should be a way to turn it off (set it to zero) even though I doubt it will be used much.A common idiom is to have a composite getter/setter method for object properties something like this sub settingname { my ($self, $arg) = @_; $self->{settingname} = $arg if defined $arg; return $self->{settingname}; }Or, if undef is a valid value: sub settingname { my $self = shift; $self->[settingname} = shift if @_; return $self->{settingname}; }
Yes, I agree that's better (modulo the bracket typo)
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
Hi, On 2023-03-17 12:25:14 -0400, Andrew Dunstan wrote: > On 2023-03-17 Fr 10:08, Daniel Gustafsson wrote: > > > If we are going to keep this as a separate package, then we should put some code in the constructor to prevent it beingcalled from elsewhere than the Cluster package. e.g. > > > > > > # this constructor should only be called from PostgreSQL::Test::Cluster > > > my ($package, $file, $line) = caller; > > > die "Forbidden caller of constructor: package: $package, file: $file:$line" > > > unless $package eq 'PostgreSQL::Test::Cluster'; > > I don't have strong feelings about where to place this, but Cluster.pm is > > already quite long so I see a small upside to keeping it separate to not make > > that worse. > > > > Yeah, I can go along with that. Cool - I'd prefer a separate file. I do find Cluster.pm somewhat unwieldy at this point, and I susect that we'll end up with additional helpers around BackgroundPsql. Greetings, Andres Freund
Hi, On 2023-03-17 12:25:14 -0400, Andrew Dunstan wrote:On 2023-03-17 Fr 10:08, Daniel Gustafsson wrote:If we are going to keep this as a separate package, then we should put some code in the constructor to prevent it being called from elsewhere than the Cluster package. e.g. # this constructor should only be called from PostgreSQL::Test::Cluster my ($package, $file, $line) = caller; die "Forbidden caller of constructor: package: $package, file: $file:$line" unless $package eq 'PostgreSQL::Test::Cluster';I don't have strong feelings about where to place this, but Cluster.pm is already quite long so I see a small upside to keeping it separate to not make that worse.Yeah, I can go along with that.Cool - I'd prefer a separate file. I do find Cluster.pm somewhat unwieldy at this point, and I susect that we'll end up with additional helpers around BackgroundPsql.
Yeah. BTW, a better test than the one above would be
$package->isa("PostgreSQL::Test::Cluster")
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
> On 18 Mar 2023, at 23:07, Andrew Dunstan <andrew@dunslane.net> wrote: > BTW, a better test than the one above would be > > $package->isa("PostgreSQL::Test::Cluster") Attached is a quick updated v3 of the patch which, to the best of my Perl abilities, tries to address the comments raised here. -- Daniel Gustafsson
Вложения
The attached v4 fixes some incorrect documentation (added by me in v3), and fixes that background_psql() didn't honor on_error_stop and extraparams passed by the user. I've also added a commit which implements the \password test from the SCRAM iteration count patchset as well as cleaned up a few IPC::Run includes from test scripts. -- Daniel Gustafsson
Вложения
> On 31 Mar 2023, at 22:33, Daniel Gustafsson <daniel@yesql.se> wrote: > > The attached v4 fixes some incorrect documentation (added by me in v3), and > fixes that background_psql() didn't honor on_error_stop and extraparams passed > by the user. I've also added a commit which implements the \password test from > the SCRAM iteration count patchset as well as cleaned up a few IPC::Run > includes from test scripts. And a v5 to fix a test failure in recovery tests. -- Daniel Gustafsson
Вложения
Hi, On 2023-04-02 22:24:16 +0200, Daniel Gustafsson wrote: > And a v5 to fix a test failure in recovery tests. Thanks for workin gon this! There's this XXX that I added: > @@ -57,11 +51,10 @@ sub test_streaming > COMMIT; > }); > > - $in .= q{ > - COMMIT; > - \q > - }; > - $h->finish; # errors make the next test fail, so ignore them here > + $h->query_safe('COMMIT'); > + $h->quit; > + # XXX: Not sure what this means > + # errors make the next test fail, so ignore them here > > $node_publisher->wait_for_catchup($appname); I still don't know what that comment is supposed to mean, unfortunately. Greetings, Andres Freund
> On 2 Apr 2023, at 23:37, Andres Freund <andres@anarazel.de> wrote: > There's this XXX that I added: > >> @@ -57,11 +51,10 @@ sub test_streaming >> COMMIT; >> }); >> >> - $in .= q{ >> - COMMIT; >> - \q >> - }; >> - $h->finish; # errors make the next test fail, so ignore them here >> + $h->query_safe('COMMIT'); >> + $h->quit; >> + # XXX: Not sure what this means >> + # errors make the next test fail, so ignore them here >> >> $node_publisher->wait_for_catchup($appname); > > I still don't know what that comment is supposed to mean, unfortunately. My reading of it is that it's ignoring any croak errors which IPC::Run might throw if ->finish() isn't able to reap the psql process which had the \q. I've added Amit who committed it in 216a784829c on cc: to see if he remembers the comment in question and can shed some light. Skimming the linked thread yields no immediate clues. -- Daniel Gustafsson
Unless there are objections I plan to get this in before the freeze, in order to have better interactive tests starting with 16. With a little bit of documentation polish I think it's ready. -- Daniel Gustafsson
> On 5 Apr 2023, at 23:44, Daniel Gustafsson <daniel@yesql.se> wrote: > > Unless there are objections I plan to get this in before the freeze, in order > to have better interactive tests starting with 16. With a little bit of > documentation polish I think it's ready. When looking at the CFBot failure on Linux and Windows (not on macOS) I noticed that it was down to the instance lacking IO::Pty. [19:59:12.609](1.606s) ok 1 - scram_iterations in server side ROLE Can't locate IO/Pty.pm in @INC (you may need to install the IO::Pty module) (@INC contains: /tmp/cirrus-ci-build/src/test/perl/tmp/cirrus-ci-build/src/test/authentication /etc/perl /usr/local/lib/i386-linux-gnu/perl/5.32.1/usr/local/share/perl/5.32.1 /usr/lib/i386-linux-gnu/perl5/5.32 /usr/share/perl5/usr/lib/i386-linux-gnu/perl/5.32 /usr/share/perl/5.32 /usr/local/lib/site_perl) at /usr/share/perl5/IPC/Run.pmline 1828. Skimming the VM creation [0] it seems like it should be though? On macOS the module is installed inside Cirrus and the test runs fine. I don't think we should go ahead with a patch that refactors interactive_psql only to SKIP over it in CI (which is what the tab_completion test does now), so let's wait until we have that sorted before going ahead. -- Daniel Gustafsson [0] https://github.com/anarazel/pg-vm-images/blob/main/scripts/linux_debian_install_deps.sh
On 5 Apr 2023, at 23:44, Daniel Gustafsson <daniel@yesql.se> wrote: Unless there are objections I plan to get this in before the freeze, in order to have better interactive tests starting with 16. With a little bit of documentation polish I think it's ready.When looking at the CFBot failure on Linux and Windows (not on macOS) I noticed that it was down to the instance lacking IO::Pty. [19:59:12.609](1.606s) ok 1 - scram_iterations in server side ROLE Can't locate IO/Pty.pm in @INC (you may need to install the IO::Pty module) (@INC contains: /tmp/cirrus-ci-build/src/test/perl /tmp/cirrus-ci-build/src/test/authentication /etc/perl /usr/local/lib/i386-linux-gnu/perl/5.32.1 /usr/local/share/perl/5.32.1 /usr/lib/i386-linux-gnu/perl5/5.32 /usr/share/perl5 /usr/lib/i386-linux-gnu/perl/5.32 /usr/share/perl/5.32 /usr/local/lib/site_perl) at /usr/share/perl5/IPC/Run.pm line 1828. Skimming the VM creation [0] it seems like it should be though? On macOS the module is installed inside Cirrus and the test runs fine. I don't think we should go ahead with a patch that refactors interactive_psql only to SKIP over it in CI (which is what the tab_completion test does now), so let's wait until we have that sorted before going ahead.
It should probably be added to config/check_modules.pl if we're going to use it, but it seems to be missing for Strawberry Perl and msys/ucrt64 perl and I'm not sure how easy it will be to add there. It would certainly add an installation burden for test instances at the very least.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
Hi, On 2023-04-07 15:32:12 +0200, Daniel Gustafsson wrote: > > On 5 Apr 2023, at 23:44, Daniel Gustafsson <daniel@yesql.se> wrote: > > > > Unless there are objections I plan to get this in before the freeze, in order > > to have better interactive tests starting with 16. With a little bit of > > documentation polish I think it's ready. > > When looking at the CFBot failure on Linux and Windows (not on macOS) I noticed > that it was down to the instance lacking IO::Pty. > > [19:59:12.609](1.606s) ok 1 - scram_iterations in server side ROLE > Can't locate IO/Pty.pm in @INC (you may need to install the IO::Pty module) (@INC contains: /tmp/cirrus-ci-build/src/test/perl/tmp/cirrus-ci-build/src/test/authentication /etc/perl /usr/local/lib/i386-linux-gnu/perl/5.32.1/usr/local/share/perl/5.32.1 /usr/lib/i386-linux-gnu/perl5/5.32 /usr/share/perl5/usr/lib/i386-linux-gnu/perl/5.32 /usr/share/perl/5.32 /usr/local/lib/site_perl) at /usr/share/perl5/IPC/Run.pmline 1828. > > Skimming the VM creation [0] it seems like it should be though? Note it just fails on the 32build, not the 64bit build. Unfortunately I don't think debian's multiarch in bullseye support installing enough of perl in 32bit and 64bit. We can't have a hard dependency on non-default modules like IO::Pty anyway, so the test needs to skip if it's not available. On windows IO::Pty can't be installed, IIRC. > I don't think we should go ahead with a patch that refactors interactive_psql > only to SKIP over it in CI (which is what the tab_completion test does now), so > let's wait until we have that sorted before going ahead. Maybe I am a bit confused, but isn't that just an existing requirement? Why would we expect this patchset to change what dependencies use of interactive_psql() has? Greetings, Andres Freund
Hi, On 2023-04-07 10:55:19 -0400, Andrew Dunstan wrote: > It should probably be added to config/check_modules.pl if we're going to use > it, but it seems to be missing for Strawberry Perl and msys/ucrt64 perl and > I'm not sure how easy it will be to add there. It would certainly add an > installation burden for test instances at the very least. The last time I tried, it can't be installed on windows with cpan either, the module simply doesn't have the necessary windows bits - likely because traditionally windows didn't really have ptys. I think some stuff has been added, but it probably would still require a bunch of portability work. Note that we normally don't even build with readline support on windows - so there's not really much point in using IO::Pty there. While I've gotten that to work manually not too long ago, it's still manual and not documented etc. Afaict the failures are purely about patch 2, not 1, right? Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2023-04-07 15:32:12 +0200, Daniel Gustafsson wrote: >> I don't think we should go ahead with a patch that refactors interactive_psql >> only to SKIP over it in CI (which is what the tab_completion test does now), so >> let's wait until we have that sorted before going ahead. > Maybe I am a bit confused, but isn't that just an existing requirement? Why > would we expect this patchset to change what dependencies use of > interactive_psql() has? It is an existing requirement, but only for a test that's not too critical. If interactive_psql starts getting used for more interesting things, we might be sad that the coverage is weak. Having said that, weak coverage is better than no coverage. I don't think this point should be a show-stopper for committing. regards, tom lane
> On 7 Apr 2023, at 16:58, Andres Freund <andres@anarazel.de> wrote: > Note it just fails on the 32build, not the 64bit build. Unfortunately I don't > think debian's multiarch in bullseye support installing enough of perl in > 32bit and 64bit. I should probably avoid parsing logfiles with fever-induced brainfog, I confused myself to think it was both =( > Maybe I am a bit confused, but isn't that just an existing requirement? Why > would we expect this patchset to change what dependencies use of > interactive_psql() has? Correct, there is no change from the current implementation. -- Daniel Gustafsson
Hi, On 2023-04-07 11:52:37 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2023-04-07 15:32:12 +0200, Daniel Gustafsson wrote: > >> I don't think we should go ahead with a patch that refactors interactive_psql > >> only to SKIP over it in CI (which is what the tab_completion test does now), so > >> let's wait until we have that sorted before going ahead. > > > Maybe I am a bit confused, but isn't that just an existing requirement? Why > > would we expect this patchset to change what dependencies use of > > interactive_psql() has? > > It is an existing requirement, but only for a test that's not too > critical. If interactive_psql starts getting used for more interesting > things, we might be sad that the coverage is weak. I don't really expect it to be used for non-critical things - after all, interactive_psql() also depends on psql being built with readline support, which we traditionally don't have on windows... For most tasks background_psql should suffice... > Having said that, weak coverage is better than no coverage. I don't > think this point should be a show-stopper for committing. Yea. One thing I wonder is whether we should have a central function for checking if interactive_psql() is available, instead of copying 010_tab_completion.pl's logic for it into multiple tests. But that could come later too. Greetings, Andres Freund
> On 7 Apr 2023, at 17:04, Andres Freund <andres@anarazel.de> wrote: > Afaict the failures are purely about patch 2, not 1, right? Correct. The attached v6 wraps the interactive_psql test in a SKIP block with a conditional on IO::Pty being available. -- Daniel Gustafsson
Вложения
> On 7 Apr 2023, at 18:14, Daniel Gustafsson <daniel@yesql.se> wrote: >> On 7 Apr 2023, at 17:04, Andres Freund <andres@anarazel.de> wrote: >> Afaict the failures are purely about patch 2, not 1, right? > > Correct. The attached v6 wraps the interactive_psql test in a SKIP block with > a conditional on IO::Pty being available. This version was green in the CFBot, so I ended up pushing it after some documentation fixups and polish. -- Daniel Gustafsson
> On 7 Apr 2023, at 22:24, Daniel Gustafsson <daniel@yesql.se> wrote: > >> On 7 Apr 2023, at 18:14, Daniel Gustafsson <daniel@yesql.se> wrote: >>> On 7 Apr 2023, at 17:04, Andres Freund <andres@anarazel.de> wrote: > >>> Afaict the failures are purely about patch 2, not 1, right? >> >> Correct. The attached v6 wraps the interactive_psql test in a SKIP block with >> a conditional on IO::Pty being available. > > This version was green in the CFBot, so I ended up pushing it after some > documentation fixups and polish. Looks like morepork wasn't happy with the interactive \password test. https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=morepork&dt=2023-04-07%2020%3A30%3A29 Looking into why the timer timed out in that test. -- Daniel Gustafsson
> On 7 Apr 2023, at 23:01, Daniel Gustafsson <daniel@yesql.se> wrote: > Looks like morepork wasn't happy with the interactive \password test. > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=morepork&dt=2023-04-07%2020%3A30%3A29 > > Looking into why the timer timed out in that test. Staring at this I've been unable to figure out if there an underlying problem here or a flaky testrun, since I can't reproduce it. Maybe the animal owner (on cc) have any insights? The test has passed on several different platforms in the buildfarm, including Linux, Solaris, macOS, NetBSD, FreeBSD and other OpenBSD animals. It also passed in an OpenBSD VM running with our Cirrus framework. Unless there are objections raised I propose leaving it in for now, and I will return to it tomorrow after some sleep, and install OpenBSD 6.9 to see if it's reproducible. -- Daniel Gustafsson
Daniel Gustafsson <daniel@yesql.se> writes: >> On 7 Apr 2023, at 23:01, Daniel Gustafsson <daniel@yesql.se> wrote: > Staring at this I've been unable to figure out if there an underlying problem > here or a flaky testrun, since I can't reproduce it. Maybe the animal owner > (on cc) have any insights? > The test has passed on several different platforms in the buildfarm, including > Linux, Solaris, macOS, NetBSD, FreeBSD and other OpenBSD animals. It also > passed in an OpenBSD VM running with our Cirrus framework. prion and mantid have now failed with the same symptom. I don't see a pattern, but it's not OpenBSD-only. It will be interesting to see if the failure is intermittent or not on those animals. > Unless there are objections raised I propose leaving it in for now, and I will > return to it tomorrow after some sleep, and install OpenBSD 6.9 to see if it's > reproducible. Agreed, we don't need a hasty revert here. Better to gather data. regards, tom lane
> On 8 Apr 2023, at 00:35, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Daniel Gustafsson <daniel@yesql.se> writes: >>> On 7 Apr 2023, at 23:01, Daniel Gustafsson <daniel@yesql.se> wrote: >> Staring at this I've been unable to figure out if there an underlying problem >> here or a flaky testrun, since I can't reproduce it. Maybe the animal owner >> (on cc) have any insights? > >> The test has passed on several different platforms in the buildfarm, including >> Linux, Solaris, macOS, NetBSD, FreeBSD and other OpenBSD animals. It also >> passed in an OpenBSD VM running with our Cirrus framework. > > prion and mantid have now failed with the same symptom. I don't > see a pattern, but it's not OpenBSD-only. It will be interesting > to see if the failure is intermittent or not on those animals. It would be interesting to know how far in the pumped input they get, if they time out on the first one with nothing going through? Will investigate further tomorrow to see. -- Daniel Gustafsson
> On 8 Apr 2023, at 00:59, Daniel Gustafsson <daniel@yesql.se> wrote: > >> On 8 Apr 2023, at 00:35, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> Daniel Gustafsson <daniel@yesql.se> writes: >>>> On 7 Apr 2023, at 23:01, Daniel Gustafsson <daniel@yesql.se> wrote: >>> Staring at this I've been unable to figure out if there an underlying problem >>> here or a flaky testrun, since I can't reproduce it. Maybe the animal owner >>> (on cc) have any insights? >> >>> The test has passed on several different platforms in the buildfarm, including >>> Linux, Solaris, macOS, NetBSD, FreeBSD and other OpenBSD animals. It also >>> passed in an OpenBSD VM running with our Cirrus framework. >> >> prion and mantid have now failed with the same symptom. I don't >> see a pattern, but it's not OpenBSD-only. It will be interesting >> to see if the failure is intermittent or not on those animals. morepork has failed again, which is good, since intermittent failures are harder to track down. > It would be interesting to know how far in the pumped input they get, if they > time out on the first one with nothing going through? Will investigate further > tomorrow to see. Actually, one quick datapoint. prion and mantid report running IPC::Run version 0.92, and morepork 0.96. Animals that pass are running 20180523.0, 20200505.0, 20220807.0 or similar versions. We don't print the IO::Pty version during configure, but maybe this is related to older versions of the modules and this test (not all of them apparently) need to SKIP if IO::Pty is missing or too old? Somewhere to start looking at the very least. -- Daniel Gustafsson
On 8 Apr 2023, at 00:59, Daniel Gustafsson <daniel@yesql.se> wrote:On 8 Apr 2023, at 00:35, Tom Lane <tgl@sss.pgh.pa.us> wrote: Daniel Gustafsson <daniel@yesql.se> writes:On 7 Apr 2023, at 23:01, Daniel Gustafsson <daniel@yesql.se> wrote:Staring at this I've been unable to figure out if there an underlying problem here or a flaky testrun, since I can't reproduce it. Maybe the animal owner (on cc) have any insights?The test has passed on several different platforms in the buildfarm, including Linux, Solaris, macOS, NetBSD, FreeBSD and other OpenBSD animals. It also passed in an OpenBSD VM running with our Cirrus framework.prion and mantid have now failed with the same symptom. I don't see a pattern, but it's not OpenBSD-only. It will be interesting to see if the failure is intermittent or not on those animals.morepork has failed again, which is good, since intermittent failures are harder to track down.It would be interesting to know how far in the pumped input they get, if they time out on the first one with nothing going through? Will investigate further tomorrow to see.Actually, one quick datapoint. prion and mantid report running IPC::Run version 0.92, and morepork 0.96. Animals that pass are running 20180523.0, 20200505.0, 20220807.0 or similar versions. We don't print the IO::Pty version during configure, but maybe this is related to older versions of the modules and this test (not all of them apparently) need to SKIP if IO::Pty is missing or too old? Somewhere to start looking at the very least.
Those aren't CPAN version numbers. See <https://metacpan.org/pod/IO::Pty>
prion was running 1.10 (dated to 2010). I have just updated it to 1.17 (the CPAN latest). We'll see if that makes a difference.
cheers
andrew
-- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: >> Actually, one quick datapoint. prion and mantid report running IPC::Run >> version 0.92, and morepork 0.96. Animals that pass are running 20180523.0, >> 20200505.0, 20220807.0 or similar versions. We don't print the IO::Pty version >> during configure, but maybe this is related to older versions of the modules >> and this test (not all of them apparently) need to SKIP if IO::Pty is missing >> or too old? Somewhere to start looking at the very least. > prion was running 1.10 (dated to 2010). I have just updated it to 1.17 > (the CPAN latest). We'll see if that makes a difference. I've been doing some checking with perlbrew locally. It appears to not be about IO::Pty so much as IPC::Run: it works with IPC::Run 0.99 but not 0.79. Still bisecting to identify exactly what's the minimum okay version. regards, tom lane
I wrote: > I've been doing some checking with perlbrew locally. It appears to not > be about IO::Pty so much as IPC::Run: it works with IPC::Run 0.99 but > not 0.79. Still bisecting to identify exactly what's the minimum > okay version. The answer is: it works with IPC::Run >= 0.98. The version of IO::Pty doesn't appear significant; it works at least back to 1.00 from early 2002. IPC::Run 0.98 is relatively new (2018), so I don't think it'd fly to make that our new minimum version across-the-board. I recommend just setting up this one test to SKIP if IPC::Run is too old. regards, tom lane
Hi, On 2023-04-07 20:49:39 -0400, Tom Lane wrote: > I wrote: > > I've been doing some checking with perlbrew locally. It appears to not > > be about IO::Pty so much as IPC::Run: it works with IPC::Run 0.99 but > > not 0.79. Still bisecting to identify exactly what's the minimum > > okay version. > > The answer is: it works with IPC::Run >= 0.98. The version of IO::Pty > doesn't appear significant; it works at least back to 1.00 from early > 2002. > > IPC::Run 0.98 is relatively new (2018), so I don't think it'd fly > to make that our new minimum version across-the-board. I recommend > just setting up this one test to SKIP if IPC::Run is too old. Does the test actually take a while before it fails, or is it quick? It's possible the failure is caused by 001_password.pl's use of set_query_timer_restart(). I don't think other tests do something quite comparable. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2023-04-07 20:49:39 -0400, Tom Lane wrote: >> IPC::Run 0.98 is relatively new (2018), so I don't think it'd fly >> to make that our new minimum version across-the-board. I recommend >> just setting up this one test to SKIP if IPC::Run is too old. > Does the test actually take a while before it fails, or is it quick? It times out at whatever your PG_TEST_TIMEOUT_DEFAULT is. I waited 3 minutes the first time, and then reduced that to 20sec for the rest of the tries ... regards, tom lane
> On 8 Apr 2023, at 02:49, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I wrote: >> I've been doing some checking with perlbrew locally. It appears to not >> be about IO::Pty so much as IPC::Run: it works with IPC::Run 0.99 but >> not 0.79. Still bisecting to identify exactly what's the minimum >> okay version. > > The answer is: it works with IPC::Run >= 0.98. The version of IO::Pty > doesn't appear significant; it works at least back to 1.00 from early > 2002. Thanks for investigating this! > IPC::Run 0.98 is relatively new (2018), so I don't think it'd fly > to make that our new minimum version across-the-board. Absolutely, that's not an option. > I recommend > just setting up this one test to SKIP if IPC::Run is too old. Yes, will do that when I have a little more time at hand for monitoring the BF later today. -- Daniel Gustafsson
On 2023-04-08 09:53, Daniel Gustafsson wrote: >> I recommend >> just setting up this one test to SKIP if IPC::Run is too old. > > Yes, will do that when I have a little more time at hand for monitoring the BF > later today. So what do you want me to do about grison and morepork? I guess I could try to install a newer version of IPC::Run from CPAN or should I just leave it be? /Mikael
=?UTF-8?Q?Mikael_Kjellstr=c3=b6m?= <mikael.kjellstrom@gmail.com> writes: > So what do you want me to do about grison and morepork? I guess I could > try to install a newer version of IPC::Run from CPAN or should I just > leave it be? I think "leave it be" is fine. regards, tom lane