Re: Race conditions in shm_mq.c
From | Robert Haas |
---|---|
Subject | Re: Race conditions in shm_mq.c |
Date | |
Msg-id | CA+TgmobfiUgEVLAeuXn5N11yqPiwDEtdMaNKtwXpEP+abxxuOA@mail.gmail.com |
In response to | Re: Race conditions in shm_mq.c (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Race conditions in shm_mq.c
|
List | pgsql-hackers |
On Thu, Aug 6, 2015 at 2:38 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Aug 6, 2015 at 10:10 AM, Antonin Houska <ah@cybertec.at> wrote:
>> During my experiments with parallel workers I sometimes saw the "master" and
>> worker process blocked. The master uses shm queue to send data to the worker,
>> both sides nowait==false. I concluded that the following happened:
>>
>> The worker process set itself as a receiver on the queue after
>> shm_mq_wait_internal() has completed its first check of "ptr", so this
>> function left sender's procLatch in reset state. But before the procLatch was
>> reset, the receiver still managed to read some data and set sender's procLatch
>> to signal the reading, and eventually called its (receiver's) WaitLatch().
>>
>> So sender has effectively missed the receiver's notification and called
>> WaitLatch() too (if the receiver already waits on its latch, it does not help
>> for sender to call shm_mq_notify_receiver(): receiver won't do anything
>> because there's no new data in the queue).
>>
>> Below is my patch proposal.
>
> Another good catch. However, I would prefer to fix this without
> introducing a "continue" as I think that will make the control flow
> clearer. Therefore, I propose the attached variant of your idea.

Err, that doesn't work at all. Have a look at this version instead.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments