PATCH: Keep one postmaster monitoring pipe per process
From | Marco Pfatschbacher |
---|---|
Subject | PATCH: Keep one postmaster monitoring pipe per process |
Date | |
Msg-id | 20160915135755.GC19008@genua.de |
Responses | Re: PATCH: Keep one postmaster monitoring pipe per process (3 replies) |
List | pgsql-hackers |
Hi,

the current implementation of PostmasterIsAlive() uses a pipe to monitor the existence of the postmaster process. One end of the pipe is held open in the postmaster, while the other end is inherited by all the auxiliary and background processes when they fork. This leads to multiple processes calling select(2), poll(2) and read(2) on the same end of the pipe.

While this is technically perfectly OK, it has the unfortunate side effect of triggering inefficient behaviour[0] in the select/poll implementation on some operating systems[1]: the kernel can only keep track of one pid per select address and thus has no choice but to wakeup(9) every process that is waiting on select/poll.

In our case the system had to wake up ~3000 idle ssh processes every time PostgreSQL called PostmasterIsAlive(). The WalReceiver triggered this at a rate of ~400 calls per second. As a result the system performed very badly, being mostly busy scheduling idle processes.

The attached patch avoids the select contention by using a separate pipe for each auxiliary and background process. Since the postmaster has three different ways to create new processes, the patch got a bit more complicated than I anticipated :)

For auxiliary processes, pgstat, pgarch and the autovacuum launcher get a preallocated pipe each. The pipes are held in:

    postmaster_alive_fds_own[NUM_AUXPROCTYPES];
    postmaster_alive_fds_watch[NUM_AUXPROCTYPES];

Just before we fork a new process we set postmaster_alive_fd for each process type:

    postmaster_alive_fd = postmaster_alive_fds_watch[type];

Since there can be multiple backend processes, BackendStartup() allocates a pipe on demand and keeps the reference in the Backend structure. The pipe is closed when the backend terminates.

The patch was developed and tested under OpenBSD using the REL9_4_STABLE branch. I've merged it to current, compile-tested it and ran make check on Ubuntu 14.04.
Marco

[0] http://man.openbsd.org/OpenBSD-5.9/man2/select.2?manpath=OpenBSD-5.9
    BUGS
    [...]
    "Internally to the kernel, select() and pselect() work poorly if multiple processes wait on the same file descriptor. Given that, it is rather surprising to see that many daemons are written that way."

[1] At least OpenBSD and NetBSD are affected, FreeBSD rewrote their select implementation in 8.0.
Attachments