Re: POSIX shared memory redux
From | A.M. |
---|---|
Subject | Re: POSIX shared memory redux |
Date | |
Msg-id | D9EDACF7-53F1-4355-84F8-2E74CD19D22D@themactionfaction.com |
In response to | Re: POSIX shared memory redux (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: POSIX shared memory redux |
List | pgsql-hackers |
Hello,

Based on feedback from Tom Lane and Robert Haas, I have amended the POSIX shared memory patch to account for multiple-postmaster start race conditions (protection against which is currently based on SysV shared memory checks).

https://github.com/agentm/postgres/tree/posix_shmem

To ensure that no two postmasters can start up in the same data directory, I use fcntl range locking on the data directory lock file, which also works properly on (properly configured) NFS volumes. Whenever a postmaster or postmaster child starts, it acquires a read (non-exclusive) lock on the data directory's lock file. When a new postmaster starts, it queries whether anything would block a write (exclusive) lock on the lock file; the query returns a lock-holding PID when other PostgreSQL processes are running.

Because POSIX fcntl locking is per-process and released in the kernel when a process ends, as long as there is a single process running and holding a read lock, no new postmaster can be started for that data directory. Furthermore, the fcntl syscall allows us to get a live PID for a conflicting lock-holding process, so PostgreSQL startup can print a live PID when startup fails because of a conflict. The contents of the data directory lock file remain the same; however, the PID stored in the lock file becomes less vital.

The cost of this change is one additional file descriptor open in each PostgreSQL process (for the full life of the process).

As a gimmick, I also implemented a process-failover feature based on the F_SETLKW flag, which allows a new postmaster to start up immediately if the running postmaster and all its children exit for any reason. This may also be useful to queue postmaster startup, which could be controlled by a secondary non-PostgreSQL process, pending some action (such as in a failover scenario). This feature is controlled via "postgres -b" (for "blocking"), but it is not vital to the shared memory implementation.
Note that this implementation of the fcntl locking is effectively independent of the shared memory interface, i.e. this same locking could be used with the existing SysV shared memory scheme.

Is this approach good enough to push to the next CommitFest? I am happy to amend the patch as necessary.

Thanks!

Cheers,
M