File descriptors in exec'd subprocesses
От | Thomas Munro |
---|---|
Тема | File descriptors in exec'd subprocesses |
Дата | |
Msg-id | CA+hUKGKb6FsAdQWcRL35KJsftv+9zXqQbzwkfRf1i0J2e57+hQ@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: File descriptors in exec'd subprocesses
(Thomas Munro <thomas.munro@gmail.com>)
Re: File descriptors in exec'd subprocesses (Andres Freund <andres@anarazel.de>) |
Список | pgsql-hackers |
While investigating code paths that use system() and popen() and trying to write latch-multiplexing replacements, which I'll write about separately, leaks of $SUBJECT became obvious. On a FreeBSD box, you can see the sockets, pipes and data and WAL files that the subprocess inherits: create table t (x text); copy t from program 'procstat -f $$'; select * from t; (On Linux, you might try something like 'ls -slap /proc/self/fd'. Not sure how to do it on a Mac but note that 'lsof' is no good for this purpose because the first thing it does is close all descriptors > 2; maybe 'lsof -p $PPID' inside a shell wrapper so you're analysing the shell process that sits between postgres and lsof, rather than lsof itself?) Since we've started assuming a few other bits of SUSv3 (POSIX 2008) functionality, the standard that specified O_CLOEXEC, and since in practice Linux, *BSD, macOS, AIX, Solaris, illumos all have it, I think we can unconditionally just use it on all files we open. That is, if we were to make fallback code, it would be untested, and if we were to do it with fcntl() always it would be a frequent extra system call that we don't need to support a computer that doesn't exist. For sockets and pipes, much more rarely created, some systems have non-standard extensions along the same lines, but I guess we should stick with standards and call fcntl(FD_CLOEXEC) for now. There is a place in fd.c that already referenced O_CLOEXEC (it wasn't really using it, just making an assertion that flags don't collide), with #ifdef around it, but that was only conditional because at the time of commit 04cad8f7 we had a macOS 10.4 system (released 2005) in the 'farm which obviously didn't know about POSIX 2008 interfaces. We can just remove that #ifdef. (It's probably OK to remove the test of O_DSYNC too but I'll think about that another time.) On Windows, handles, at least as we create them, are not inherited so the problem doesn't come up AFAICS. I *think* if we were to use Windows' own open(), that would be an issue, but we have our own CreateFile() thing and it doesn't turn on inheritance IIUC. So I just gave O_CLOEXEC a zero definition there. It would be interesting to know what handles a subprocess sees. If someone who knows how to drive Windows could run a subprogram that just does the equivalent of 'sleep 60' they might be able to see that in one of those handle spy tools, to visually check the above. (If I'm wrong about that, it might be possible for a subprocess to interfere with a ProcSignalBarrier command to close all files, so I'd love to know about it.) We were already doing FD_CLOEXEC on the latch self-pipe with comment "we surely do not want any child processes messing with them", so it's not like this wasn't a well-known issue before, but I guess it just never bothered anyone enough to do anything about the more general problem. With the attached, the test at the top of this email shows only in, out, error, and one thing that procstat opened itself. Are there any more descriptors we need to think about?
Вложения
В списке pgsql-hackers по дате отправления: