Thread: Re: issue with meson builds on msys2

Re: issue with meson builds on msys2

From: Andres Freund
Date:
Hi,

On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:
> Still running into this, and I am rather stumped. This is a blocker for
> buildfarm support for meson:
> 
> Here's a simple illustration of the problem. If I do the identical test with
> a non-meson build there is no problem:

This happens 100% reproducible?


> pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
> $ export PGCTLTIMEOUT=300
> 
> pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
> $ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41";
> system("bin/pg_ctl -D data-C -l logfile start") ; print "fail\n" if $?; '
> waiting for server to start.... done
> server started

Does it happen as well if you use ucrt perl? Not because I think we should
require it, just to narrow the space.

Any chance that doing export MSYS=winjitdebug changes something? There's quite
a bit of similarity with the python issue you've also encountered - python
would just exit with a failure-indicating exit code.


> pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf
> $ /usr/bin/perl -e 'chdir "root/HEAD/instkeep.2023-04-25_11-09-41";
> system("bin/pg_ctl -D data-C -l logfile stop") ; print "fail\n" if $?; '
> waiting for server to shut down....fail

Hm. I don't remember the details, but in the python case I was able to get
some additional error code somehow, which then indicated that the
child-process failed with the NT status code indicating the equivalent of a
segfault.

I guess system() in msys perl will invoke bash as a shell to execute the
program. Perhaps the failing program isn't actually pg_ctl, but the shell? If
it is indeed bash, what does the shell report as the exit code of pg_ctl?
E.g. doing something like
  system('bin/pg_ctl -D data-C -l logfile stop; echo $?');


Could you do ldd (with mingw's ldd, which understands PE binaries) of meson
and autoconf built pg_ctl on your machine? I wonder if we end up with a
different windows runtime or such.  In the python case I had some
circumstantial evidence that the problem was dependent on the windows runtime
version.
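One way to make that comparison mechanical is to reduce each ldd listing to a sorted set of DLL names and diff the two sets. The following is only a minimal sketch, assuming ldd prints one "name => path (address)" entry per line; the helper name normalize_ldd and the binary paths in the usage comment are hypothetical.

```shell
# Reduce ldd-style output on stdin to a sorted, de-duplicated list of
# library names (the first whitespace-separated field of each line).
# normalize_ldd is a hypothetical helper name, not an existing tool.
normalize_ldd() {
    awk 'NF { print $1 }' | LC_ALL=C sort -u
}

# Hypothetical usage, comparing two builds of pg_ctl:
#   ldd meson-inst/bin/pg_ctl.exe    | normalize_ldd > meson.dlls
#   ldd autoconf-inst/bin/pg_ctl.exe | normalize_ldd > autoconf.dlls
#   diff meson.dlls autoconf.dlls
```

Printing $3 instead of $1 would compare the resolved paths rather than the names, which may matter if the same DLL name is picked up from a different directory.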

Downthread you mention that the issue doesn't happen with IPC::Run - the
biggest difference I can see is that IPC::Run would IIRC not use a shell? Does
the problem "re-appear" if you make IPC::Run use a shell?

Greetings,

Andres Freund



Re: issue with meson builds on msys2

From: Andrew Dunstan
Date:


On 2023-04-27 Th 18:18, Andres Freund wrote:
> Hi,
>
> On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:
> > Still running into this, and I am rather stumped. This is a blocker for
> > buildfarm support for meson:
> >
> > Here's a simple illustration of the problem. If I do the identical test with
> > a non-meson build there is no problem:
>
> This happens 100% reproducible?


For a sufficiently modern installation of msys2 (20230318 version) this is reproducible on autoconf builds as well.

For now it's off my list of meson blockers. I will pursue the issue when I have time, but for now the IPC::Run workaround is sufficient.

The main thing that's now an issue on Windows is support for various options like libxml2. I installed the libxml2 distro from the package manager scoop, generated .lib files for the libxml2 and libxslt DLLs, and was able to build with autoconf on msys2, and with our MSVC support, but not with meson in either case. It looks like we need to expand the logic in meson.build for a number of these, just as we have done for perl, python, openssl, ldap etc.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: issue with meson builds on msys2

From: Andres Freund
Date:
Hi,

On 2023-05-03 09:20:28 -0400, Andrew Dunstan wrote:
> On 2023-04-27 Th 18:18, Andres Freund wrote:
> > Hi,
> > 
> > On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:
> > > Still running into this, and I am rather stumped. This is a blocker for
> > > buildfarm support for meson:
> > > 
> > > Here's a simple illustration of the problem. If I do the identical test with
> > > a non-meson build there is no problem:
> > This happens 100% reproducible?

> For a sufficiently modern installation of msys2 (20230318 version) this is
> reproducible on autoconf builds as well.

Oh. Seems like something we need to dig into independent of meson then :(


> The main thing that's now an issue on Windows is support for various options
> like libxml2. I installed the libxml2 distro from the package manager scoop,
> generated .lib files for the libxml2 and libxslt DLLs, and was able to build
> with autoconf on msys2, and with our MSVC support, but not with meson in
> either case. It looks like we need to expand the logic in meson.build for a
> number of these, just as we have done for perl, python, openssl, ldap etc.

I seriously doubt that trying to support every possible packaging thing on
windows is a good idea. What's the point of building against libraries from a
packaging solution that doesn't even come with .lib files? Windows already is
a massive pain to support for postgres, making it even more complicated / less
predictable is a really bad idea.

IMO, for windows, the path we should go down is to provide one documented way
to build the dependencies (e.g. using vcpkg or conan, perhaps also supporting
msys distributed libs), and define using something else to be unsupported (in
the "we don't help you", not in the "we explicitly try to break things"
sense).  And it should be something that understands needing to build debug
and non-debug libraries.

Greetings,

Andres Freund



Re: issue with meson builds on msys2

From: Andrew Dunstan
Date:


On 2023-05-03 We 09:20, Andrew Dunstan wrote:
> On 2023-04-27 Th 18:18, Andres Freund wrote:
> > Hi,
> >
> > On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:
> > > Still running into this, and I am rather stumped. This is a blocker for
> > > buildfarm support for meson:
> > >
> > > Here's a simple illustration of the problem. If I do the identical test with
> > > a non-meson build there is no problem:
> >
> > This happens 100% reproducible?
>
> For a sufficiently modern installation of msys2 (20230318 version) this is reproducible on autoconf builds as well.
>
> For now it's off my list of meson blockers. I will pursue the issue when I have time, but for now the IPC::Run workaround is sufficient.
>
> The main thing that's now an issue on Windows is support for various options like libxml2. I installed the libxml2 distro from the package manager scoop, generated .lib files for the libxml2 and libxslt DLLs, and was able to build with autoconf on msys2, and with our MSVC support, but not with meson in either case. It looks like we need to expand the logic in meson.build for a number of these, just as we have done for perl, python, openssl, ldap etc.

I've actually made some progress on this front. I grabbed and built https://github.com/pkgconf/pkgconf.git (with meson :-) )

After that I set PKG_CONFIG_PATH to point to where the libxml .pc files are installed, and lo and behold the meson/msvc build worked with libxml / libxslt. I did have to move libxml's openssl.pc file aside, as the distro's version of openssl is extremely old, and we don't want to use it (I'm using 3.1.0).

Of course, this imposes an extra build dependency for Windows, but it's not too onerous.

It also means that if anyone wants to use some dependency without a .pc file they would need to create one. I'll keep trying to expand the list of things I configure with.
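For a dependency that ships without a .pc file, a hand-written one can be quite small. The following is a minimal sketch, assuming a library installed under a single prefix; the library name foo, the helper write_pc_file and all paths are hypothetical, and the real Libs / Cflags lines depend on how the library was built.

```shell
# Write a minimal pkg-config file for a hypothetical library "foo".
# $1 = install prefix of the library, $2 = directory to put foo.pc in.
write_pc_file() {
    prefix=$1
    pcdir=$2
    mkdir -p "$pcdir"
    # \${...} keeps the variable references literal in the .pc file,
    # where pkg-config (not the shell) expands them.
    cat > "$pcdir/foo.pc" <<EOF
prefix=$prefix
libdir=\${prefix}/lib
includedir=\${prefix}/include

Name: foo
Description: Hypothetical example library
Version: 1.0.0
Libs: -L\${libdir} -lfoo
Cflags: -I\${includedir}
EOF
}

# Hypothetical usage:
#   write_pc_file /c/deps/foo /c/deps/pkgconfig
#   export PKG_CONFIG_PATH=/c/deps/pkgconfig
#   pkgconf --cflags --libs foo
```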

Next targets will include ldap, lz4 and zstd.

I also need to test this with msys2, so far I have only tested with MSVC.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: issue with meson builds on msys2

From: Andrew Dunstan
Date:


On 2023-05-03 We 14:26, Andres Freund wrote:
> Hi,
>
> On 2023-05-03 09:20:28 -0400, Andrew Dunstan wrote:
> > On 2023-04-27 Th 18:18, Andres Freund wrote:
> > > Hi,
> > >
> > > On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:
> > > > Still running into this, and I am rather stumped. This is a blocker for
> > > > buildfarm support for meson:
> > > >
> > > > Here's a simple illustration of the problem. If I do the identical test with
> > > > a non-meson build there is no problem:
> > > This happens 100% reproducible?
> >
> > For a sufficiently modern installation of msys2 (20230318 version) this is
> > reproducible on autoconf builds as well.
>
> Oh. Seems like something we need to dig into independent of meson then :(
>
> > The main thing that's now an issue on Windows is support for various options
> > like libxml2. I installed the libxml2 distro from the package manager scoop,
> > generated .lib files for the libxml2 and libxslt DLLs, and was able to build
> > with autoconf on msys2, and with our MSVC support, but not with meson in
> > either case. It looks like we need to expand the logic in meson.build for a
> > number of these, just as we have done for perl, python, openssl, ldap etc.
>
> I seriously doubt that trying to support every possible packaging thing on
> windows is a good idea. What's the point of building against libraries from a
> packaging solution that doesn't even come with .lib files? Windows already is
> a massive pain to support for postgres, making it even more complicated / less
> predictable is a really bad idea.
>
> IMO, for windows, the path we should go down is to provide one documented way
> to build the dependencies (e.g. using vcpkg or conan, perhaps also supporting
> msys distributed libs), and define using something else to be unsupported (in
> the "we don't help you", not in the "we explicitly try to break things"
> sense).  And it should be something that understands needing to build debug
> and non-debug libraries.


I'm not familiar with conan. I have struggled considerably with vcpkg in the past.

I don't think there is any one perfect answer.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: issue with meson builds on msys2

From: Andres Freund
Date:
Hi,

On 2023-05-03 09:20:28 -0400, Andrew Dunstan wrote:
> On 2023-04-27 Th 18:18, Andres Freund wrote:
> > On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:
> > > Still running into this, and I am rather stumped. This is a blocker for
> > > buildfarm support for meson:
> > >
> > > Here's a simple illustration of the problem. If I do the identical test with
> > > a non-meson build there is no problem:
> > This happens 100% reproducible?
>
> For a sufficiently modern installation of msys2 (20230318 version) this is
> reproducible on autoconf builds as well.
>
> For now it's off my list of meson blockers. I will pursue the issue when I
> have time, but for now the IPC::Run workaround is sufficient.

Hm. I can't reproduce this in my test win10 VM, unfortunately. What OS / OS
version is the host? Any chance to get systeminfo.exe output or something like
that?

I think we ought to do something here. If newer environments cause failures
like this, it seems likely that this will spread to more and more applications
over time...

Greetings,

Andres Freund



Re: issue with meson builds on msys2

From: Andrew Dunstan
Date:


On 2023-05-04 Th 19:54, Andres Freund wrote:
> Hi,
>
> On 2023-05-03 09:20:28 -0400, Andrew Dunstan wrote:
> > On 2023-04-27 Th 18:18, Andres Freund wrote:
> > > On 2023-04-26 09:59:05 -0400, Andrew Dunstan wrote:
> > > > Still running into this, and I am rather stumped. This is a blocker for
> > > > buildfarm support for meson:
> > > >
> > > > Here's a simple illustration of the problem. If I do the identical test with
> > > > a non-meson build there is no problem:
> > > This happens 100% reproducible?
> >
> > For a sufficiently modern installation of msys2 (20230318 version) this is
> > reproducible on autoconf builds as well.
> >
> > For now it's off my list of meson blockers. I will pursue the issue when I
> > have time, but for now the IPC::Run workaround is sufficient.
>
> Hm. I can't reproduce this in my test win10 VM, unfortunately. What OS / OS
> version is the host? Any chance to get systeminfo.exe output or something like
> that?


It's a Windows Server 2019 (v 1809) instance running on AWS.


Here's an extract from systeminfo:


OS Name:                   Microsoft Windows Server 2019 Datacenter
OS Version:                10.0.17763 N/A Build 17763
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Standalone Server
OS Build Type:             Multiprocessor Free
Registered Owner:          EC2
Registered Organization:   Amazon.com
Product ID:                00430-00000-00000-AA796
Original Install Date:     4/24/2023, 10:28:31 AM
System Boot Time:          4/24/2023, 1:49:59 PM
System Manufacturer:       Amazon EC2
System Model:              t3.large
System Type:               x64-based PC
Processor(s):              1 Processor(s) Installed.
                           [01]: Intel64 Family 6 Model 85 Stepping 7 GenuineIntel ~2500 Mhz
BIOS Version:              Amazon EC2 1.0, 10/16/2017
Windows Directory:         C:\Windows
System Directory:          C:\Windows\system32
Boot Device:               \Device\HarddiskVolume1
System Locale:             en-us;English (United States)
Input Locale:              en-us;English (United States)
Time Zone:                 (UTC) Coordinated Universal Time
Total Physical Memory:     8,090 MB
Available Physical Memory: 4,843 MB
Virtual Memory: Max Size:  10,010 MB
Virtual Memory: Available: 7,405 MB
Virtual Memory: In Use:    2,605 MB



> I think we ought to do something here. If newer environments cause failures
> like this, it seems likely that this will spread to more and more applications
> over time...


Just to reassure myself I have not been hallucinating, I repeated the test.


pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf/root/HEAD/inst
$ /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ; print $? ? "BANG: $?\n" : "OK\n";'
OK

pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf/root/HEAD/inst
$ /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1}) ; print $? ? "BANG: $?\n" : "OK\n";'
BANG: 33280


If you want to play I can arrange access.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: issue with meson builds on msys2

From: Andres Freund
Date:
Hi,

On 2023-05-05 07:08:39 -0400, Andrew Dunstan wrote:
> On 2023-05-04 Th 19:54, Andres Freund wrote:
> > Hm. I can't reproduce this in my test win10 VM, unfortunately. What OS / OS
> > version is the host? Any chance to get systeminfo.exe output or something like
> > that?
> 
> 
> It's a Windows Server 2019 (v 1809) instance running on AWS.

Hm. When I hit the python issue I also couldn't repro it on windows 10. Cirrus
was also using Windows Server 2019...


> > I think we ought to do something here. If newer environments cause failures
> > like this, it seems likely that this will spread to more and more applications
> > over time...
> > 
> 
> Just to reassure myself I have not been hallucinating, I repeated the test.
> 
> 
> pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf/root/HEAD/inst
> $ /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start >
> startlog 2>&1}) ; print $? ? "BANG: $?\n" : "OK\n";'
> OK
> 
> pgrunner@EC2AMAZ-GCB871B UCRT64 ~/bf/root/HEAD/inst
> $ /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop >
> stoplog 2>&1}) ; print $? ? "BANG: $?\n" : "OK\n";'
> BANG: 33280

Oh, so it only happens when stopping, never when starting? That's
interesting...


> If you want to play I can arrange access.

That'd be very helpful.

Greetings,

Andres Freund



Re: issue with meson builds on msys2

From: Andres Freund
Date:
Hi,

On 2023-05-05 07:08:39 -0400, Andrew Dunstan wrote:
> If you want to play I can arrange access.

Andrew did - thanks!


A first observation is that making the shell command slightly more
complicated, by echoing $? after pg_ctl, prevents the error:

/usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1;}) ; print $? ? "BANG: $?\n" : "OK\n";'
BANG: 33280

/usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1; echo $?}) ; print $? ? "BANG: $?\n" : "OK\n";'
0
OK

So does adding another layer of shell, either manually or via a subshell.


As Andrew observed earlier, the issue does not occur when not performing
redirection of the output. One interesting bit there is that the perl docs for
system include:
https://perldoc.perl.org/functions/system

> If there are no shell metacharacters in the argument, it is split into words
> and passed directly to execvp, which is more efficient. On Windows, only the
> system PROGRAM LIST syntax will reliably avoid using the shell; system LIST,
> even with more than one element, will fall back to the shell if the first
> spawn fails.

My guess is that the issue somehow is triggered around the shell handling.


One relevant bit: If I use strace (from msys) within system, the subprograms
(shell and pg_ctl) actually exit with 0, from what I can tell - but 33280
still is returned. Unfortunately, if I use strace for all of perl, the error
vanishes.


Perhaps there are some odd interactions with the stuff that InheritStdHandles()
does?

Andrew, is it ok if I modify pg_ctl.c and rebuild? I don't know how "detached"
from the actual buildfarm animal the system you gave me access to is...

Greetings,

Andres Freund



Re: issue with meson builds on msys2

From: Andrew Dunstan
Date:


On 2023-05-15 Mo 15:38, Andres Freund wrote:
> Hi,
>
> On 2023-05-05 07:08:39 -0400, Andrew Dunstan wrote:
> > If you want to play I can arrange access.
>
> Andrew did - thanks!
>
>
> A first observation is that making the shell command slightly more
> complicated, by echoing $? after pg_ctl, prevents the error:
>
> /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1;}) ; print $? ? "BANG: $?\n" : "OK\n";'
> BANG: 33280
>
> /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1; echo $?}) ; print $? ? "BANG: $?\n" : "OK\n";'
> 0
> OK


You're now testing something else, namely the return of the echo rather than the call to pg_ctl, so I don't think this is any kind of answer. It would just be ignoring the result of pg_ctl.



> So does adding another layer of shell, either manually or via a subshell.
>
>
> As Andrew observed earlier, the issue does not occur when not performing
> redirection of the output. One interesting bit there is that the perl docs for
> system include:
> https://perldoc.perl.org/functions/system
>
> > If there are no shell metacharacters in the argument, it is split into words
> > and passed directly to execvp, which is more efficient. On Windows, only the
> > system PROGRAM LIST syntax will reliably avoid using the shell; system LIST,
> > even with more than one element, will fall back to the shell if the first
> > spawn fails.
>
> My guess is that the issue somehow is triggered around the shell handling.
>
>
> One relevant bit: If I use strace (from msys) within system, the subprograms
> (shell and pg_ctl) actually exit with 0, from what I can tell - but 33280
> still is returned. Unfortunately, if I use strace for all of perl, the error
> vanishes.
>
>
> Perhaps there are some odd interactions with the stuff that InheritStdHandles()
> does?


I observed the same thing with strace. Kind of a Heisenbug.



> Andrew, is it ok if I modify pg_ctl.c and rebuild? I don't know how "detached"
> from the actual buildfarm animal the system you gave me access to is...


Feel free to do anything you want. This is a completely separate instance from the buildfarm animals. When we're done with this issue the EC2 instance will go away.

If you use the script, just run it in test mode or from-source mode, so it doesn't try to report results (that would fail anyway, as it doesn't have a registered secret). You might have to force have_ipc_run to 0. Or you can just build / install manually.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: issue with meson builds on msys2

From: Andres Freund
Date:
Hi,

On 2023-05-15 16:01:39 -0400, Andrew Dunstan wrote:
> On 2023-05-15 Mo 15:38, Andres Freund wrote:
> > Hi,
> > 
> > On 2023-05-05 07:08:39 -0400, Andrew Dunstan wrote:
> > > If you want to play I can arrange access.
> > Andrew did - thanks!
> > 
> > 
> > A first observation is that making the shell command slightly more
> > complicated, by echoing $? after pg_ctl, prevents the error:
> >
> > /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1;}) ; print $? ? "BANG: $?\n" : "OK\n";'
> > BANG: 33280
> >
> > /usr/bin/perl -e 'system(qq{"bin/pg_ctl" -D data-C -w -l logfile start > startlog 2>&1}) ;system(qq{"bin/pg_ctl" -D data-C -w -l logfile stop > stoplog 2>&1; echo $?}) ; print $? ? "BANG: $?\n" : "OK\n";'
> > 0
> > OK
> 
> 
> You're now testing something else, namely the return of the echo rather than
> the call to pg_ctl, so I don't think this is any kind of answer. It would
> just be ignoring the result of pg_ctl.

It wouldn't really - the echo $? inside the system() would report the
error. Which it doesn't - note the "0" in the second output.
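The two readings can be separated with a small stand-in. This is only a sketch, with `false` standing in for a failing command: the number printed by `echo $?` is the status of the preceding command, while the status the caller sees is that of the last command in the list, i.e. the echo itself.

```shell
# "false" is a stand-in for a command that fails with status 1.
# The inner shell prints that status, then exits with echo's own status.
printed=$(sh -c 'false; echo $?')
outer=$?

echo "printed=$printed outer=$outer"
# prints: printed=1 outer=0
```

It may also be worth noting that in the perl one-liners quoted above the `$?` sits inside qq{...}, which Perl interpolates before the shell runs; if so, the printed 0 could be Perl's own status from the preceding start rather than the shell's.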


> > Andrew, is it ok if I modify pg_ctl.c and rebuild? I don't know how "detached"
> > from the actual buildfarm animal the system you gave me access to is...
> > 
> 
> Feel free to do anything you want. This is a completely separate instance
> from the buildfarm animals. When we're done with this issue the EC2 instance
> will go away.

Thanks!

Greetings,

Andres Freund



Re: issue with meson builds on msys2

From: Andres Freund
Date:
Hi,

On 2023-05-15 13:13:26 -0700, Andres Freund wrote:
> It wouldn't really - the echo $? inside the system() would report the
> error. Which it doesn't - note the "0" in the second output.

Ah. Interesting. Part of the issue is perl (or msys?) swallowing some error
details.

I could see more details in strace once I added another layer of shell
evaluation inside the system() call.

  190  478261 [main] bash 44432 frok::parent: CreateProcessW (C:\tools\nmsys64\usr\bin\bash.exe, C:\tools\nmsys64\usr\bin\bash.exe, 0, 0, 1, 0x420, 0, 0, 0x7FFFFBE10, 0x7FFFFBDB0)
--- Process 7152 created
[...]
 1556  196093 [main] bash 44433 child_info_spawn::worker: pid 44433, prog_arg ./tmp_install/tools/nmsys64/home/pgrunner/bf/root/HEAD/inst/bin/pg_ctl, cmd line C:\tools\nmsys64\home\pgrunner\bf\root\HEAD\pgsql.build\tmp_install\tools\nmsys64\home\pgrunner\bf\root\HEAD\inst\bin\pg_ctl.exe -D t -w -l logfile stop)
  128  196221 [main] bash 44433! child_info_spawn::worker: new process name \\?\C:\tools\nmsys64\home\pgrunner\bf\root\HEAD\pgsql.build\tmp_install\tools\nmsys64\home\pgrunner\bf\root\HEAD\inst\bin\pg_ctl.exe
[...]
--- Process 6136 (pid: 44433) exited with status 0x0
[...]
--- Process 7152 exited with status 0xc000013a
5292450 5816310 [waitproc] bash 44432 pinfo::maybe_set_exit_code_from_windows: pid 44433, exit value - old 0x0, windows 0xC000013A, MSYS 0x8000002

So indeed, pg_ctl exits with 0, but bash ends up with a different exit code.

What's very interesting here is that the error is 0xC000013A, which is quite
different from the 33280 that perl then reports.  From what I can see bash
actually returns 0xC000013A - I don't know how perl ends up with 33280 /
0x8200 from that.

Either way, 0xC000013A is interesting - that's STATUS_CONTROL_C_EXIT.
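As for the 33280: a minimal sketch of the arithmetic, assuming Perl's $? follows the usual 16-bit wait-status layout (exit code in the high byte, signal number in the low seven bits). The helper name decode_wait_status is hypothetical.

```shell
# Split a Perl-style $? into exit code and signal number.
# decode_wait_status is a hypothetical helper name.
decode_wait_status() {
    status=$1
    exit_code=$(( (status >> 8) & 0xff ))
    signal=$(( status & 0x7f ))
    printf 'exit=%d signal=%d\n' "$exit_code" "$signal"
}

decode_wait_status 33280
# prints: exit=130 signal=0
```

Under that assumption 33280 (0x8200) is simply exit code 130, the same value as the "inner: 130" in the bash reproducer further down. By shell convention 130 is 128 + 2, i.e. termination by signal 2 (SIGINT), which would be consistent with STATUS_CONTROL_C_EXIT.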


Very interestingly the problem vanishes as soon as I add a redirection for
standard input into the mix.  Notably it suffices to redirect stdin in the
pg_ctl *start*, even if not done for pg_ctl stop.  There also is no issue if
perl's stdin is redirected from /dev/null.

My guess is that msys has an issue with refcounting consoles across multiple
processes.


After that I was able to reproduce the issue without really involving perl:

bash -c './tmp_install/tools/nmsys64/home/pgrunner/bf/root/HEAD/inst/bin/pg_ctl -D t -w -l logfile start > startlog 2>&1;./tmp_install/tools/nmsys64/home/pgrunner/bf/root/HEAD/inst/bin/pg_ctl -D t -w -l logfile stop > stoplog 2>&1; echo inner: $?'; echo outer: $?

+ bash -c './tmp_install/tools/nmsys64/home/pgrunner/bf/root/HEAD/inst/bin/pg_ctl -D t -w -l logfile start > startlog 2>&1;./tmp_install/tools/nmsys64/home/pgrunner/bf/root/HEAD/inst/bin/pg_ctl -D t -w -l logfile stop > stoplog 2>&1; echo inner: $?'
inner: 130
+ echo outer: 0
outer: 0

If you add -e, the inner: is obviously "transferred" to the outer: output.

As soon as either the pg_ctl for the start, or the whole bash invocation, has
stdin redirected, the problem vanishes.

Greetings,

Andres Freund



Re: issue with meson builds on msys2

From: Andres Freund
Date:
Hi,

On 2023-05-15 15:30:28 -0700, Andres Freund wrote:
> As soon as either the pg_ctl for the start, or the whole bash invocation, has
> stdin redirected, the problem vanishes.

For a moment I thought this could be related to InheritStdHandles() - but no,
it doesn't make a difference.

There's loads of handles referencing cygwin alive in pg_ctl.

Based on the difference in strace output for bash -c "pg_ctl stop" for the case
where start redirected stdin (#1) and where not (#2), it looks like some part
of msys / cygwin sees that stdin is alive when preparing to execute "pg_ctl
stop", and then runs into trouble.

The way we start the child process on windows makes the use of cmd.exe for
redirection pretty odd.


I couldn't trivially reproduce this with a much simpler case (just nohup
sleep). Perhaps it's dependent on a wrapper cmd or such.


Greetings,

Andres Freund



Re: issue with meson builds on msys2

From: Andrew Dunstan
Date:


On 2023-05-15 Mo 19:43, Andres Freund wrote:
> Hi,
>
> On 2023-05-15 15:30:28 -0700, Andres Freund wrote:
> > As soon as either the pg_ctl for the start, or the whole bash invocation, has
> > stdin redirected, the problem vanishes.
>
> For a moment I thought this could be related to InheritStdHandles() - but no,
> it doesn't make a difference.
>
> There's loads of handles referencing cygwin alive in pg_ctl.
>
> Based on the difference in strace output for bash -c "pg_ctl stop" for the case
> where start redirected stdin (#1) and where not (#2), it looks like some part
> of msys / cygwin sees that stdin is alive when preparing to execute "pg_ctl
> stop", and then runs into trouble.
>
> The way we start the child process on windows makes the use of cmd.exe for
> redirection pretty odd.
>
>
> I couldn't trivially reproduce this with a much simpler case (just nohup
> sleep). Perhaps it's dependent on a wrapper cmd or such.



I don't know where this all leaves us. It's still more than odd that the start works fine and the stop doesn't.

This piece of code has worked happily for years. It's only a recent installation or update of msys2 that's made the problem appear.

I have implemented a workaround where IPC::Run is available - that means a little extra one-off work for people using msys2, but it's not a huge burden. Beyond that I don't really want to spend a lot more energy on it.

I suppose the alternative would be to change the way the buildfarm calls pg_ctl stop. Do you have a concrete suggestion for that?


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: issue with meson builds on msys2

From: Andres Freund
Date:
Hi,

On 2023-05-16 08:55:20 -0400, Andrew Dunstan wrote:
> I don't know where this all leaves us. It's still more than odd that the
> start works fine and the stop doesn't.

From what I understand it's just a question of starting another shell, with
some redirection, after having previously started a shell, which left a
program running (thus still referencing the same console device).


> This piece of code has worked happily for years. It's only a recent
> installation or update of msys2 that's made the problem appear.

Yea, it does look like a bug somewhere. I just don't know how to make it a
small enough reproducer right now.


> I have implemented a workaround where IPC::Run is available - that means a
> little extra one-off work for people using msys2, but it's not a huge
> burden. Beyond that I don't really want to spend a lot more energy on it.

> I suppose the alternative would be to change the way the buildfarm calls
> pg_ctl stop. Do you have a concrete suggestion for that?

The easiest fix is to redirect stdin to /dev/null (or some file, if that's
easier to do portably) - that should fix the problem entirely, without needing
IPC::Run.
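In shell terms the idea amounts to something like the following minimal sketch; the wrapper name run_with_null_stdin is hypothetical, and the pg_ctl invocation in the usage comment simply mirrors the one used earlier in the thread.

```shell
# Run a command with stdin detached from the inherited terminal/console.
# run_with_null_stdin is a hypothetical wrapper name; the command's own
# exit status is passed through unchanged.
run_with_null_stdin() {
    "$@" < /dev/null
}

# Hypothetical usage:
#   run_with_null_stdin bin/pg_ctl -D data-C -w -l logfile start > startlog 2>&1
```

The wrapped command sees end-of-file on stdin immediately, so it no longer holds a reference to whatever stdin the calling shell inherited.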

Greetings,

Andres Freund



Re: issue with meson builds on msys2

From: Andrew Dunstan
Date:


On 2023-05-16 Tu 17:52, Andres Freund wrote:
> > I suppose the alternative would be to change the way the buildfarm calls
> > pg_ctl stop. Do you have a concrete suggestion for that?
>
> The easiest fix is to redirect stdin to /dev/null (or some file, if that's
> easier to do portably) - that should fix the problem entirely, without needing
> IPC::Run.


Should only be needed for the start command, right? I can probably just add "< $devnull" to the command. I'll test it out.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: issue with meson builds on msys2

From: Andres Freund
Date:
Hi,

On May 17, 2023 2:51:41 PM PDT, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>On 2023-05-16 Tu 17:52, Andres Freund wrote:
>>
>>> I suppose the alternative would be to change the way the buildfarm calls
>>> pg_ctl stop. Do you have a concrete suggestion for that?
>> The easiest fix is to redirect stdin to /dev/null (or some file, if that's
>> easier to do portably) - that should fix the problem entirely, without needing
>> IPC::Run.
>>
>
>Should only be needed for the start command, right?

I think so.

> I can probably just add "< $devnull" to the command. I'll test it out.

Cool.

Andres





Re: issue with meson builds on msys2

From: Andrew Dunstan
Date:


On 2023-05-17 We 17:55, Andres Freund wrote:
> Hi,
>
> On May 17, 2023 2:51:41 PM PDT, Andrew Dunstan <andrew@dunslane.net> wrote:
> > On 2023-05-16 Tu 17:52, Andres Freund wrote:
> > > > I suppose the alternative would be to change the way the buildfarm calls
> > > > pg_ctl stop. Do you have a concrete suggestion for that?
> > > The easiest fix is to redirect stdin to /dev/null (or some file, if that's
> > > easier to do portably) - that should fix the problem entirely, without needing
> > > IPC::Run.
> >
> > Should only be needed for the start command, right?
>
> I think so.
>
> > I can probably just add "< $devnull" to the command. I'll test it out.
>
> Cool.


OK, that seems to work. *whew*. Thanks for your help.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com