Re: Race conditions in shm_mq.c
From | Robert Haas |
---|---|
Subject | Re: Race conditions in shm_mq.c |
Date | |
Msg-id | CA+TgmoZCsBop8vODxzrQDXrQ2hzTyKqQYKiNLCNtUTeHGAWt8w@mail.gmail.com |
In reply to | Race conditions in shm_mq.c (Antonin Houska <ah@cybertec.at>) |
Responses | Re: Race conditions in shm_mq.c |
List | pgsql-hackers |
On Thu, Aug 6, 2015 at 10:10 AM, Antonin Houska <ah@cybertec.at> wrote:
> During my experiments with parallel workers I sometimes saw the "master" and
> worker process blocked. The master uses shm queue to send data to the worker,
> both sides nowait==false. I concluded that the following happened:
>
> The worker process set itself as a receiver on the queue after
> shm_mq_wait_internal() has completed its first check of "ptr", so this
> function left sender's procLatch in reset state. But before the procLatch was
> reset, the receiver still managed to read some data and set sender's procLatch
> to signal the reading, and eventually called its (receiver's) WaitLatch().
>
> So sender has effectively missed the receiver's notification and called
> WaitLatch() too (if the receiver already waits on its latch, it does not help
> for sender to call shm_mq_notify_receiver(): receiver won't do anything
> because there's no new data in the queue).
>
> Below is my patch proposal.

Another good catch. However, I would prefer to fix this without
introducing a "continue" as I think that will make the control flow
clearer. Therefore, I propose the attached variant of your idea.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
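For readers following the thread, the interleaving described in the quoted report can be replayed deterministically in a toy, single-process model. This is only a sketch under stated assumptions: ToyLatch, ToyQueue and every toy_* function are hypothetical stand-ins, not PostgreSQL's Latch or shm_mq API, and the code does not reproduce the attached patch. It only illustrates why the sender should re-check the queue for free space after its latch has been reset, rather than going straight to sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a process latch: a single flag. */
typedef struct
{
    bool is_set;
} ToyLatch;

/* Hypothetical stand-in for a shared-memory queue with one sender. */
typedef struct
{
    size_t   capacity;          /* total bytes the queue can hold */
    size_t   used;              /* bytes currently queued */
    bool     receiver_attached; /* has a receiver registered itself? */
    ToyLatch sender_latch;      /* set by the receiver to wake the sender */
} ToyQueue;

/* Receiver: attach, drain the queue, and signal the sender once. */
void
toy_receiver_attach_and_read(ToyQueue *q)
{
    q->receiver_attached = true;
    q->used = 0;
    q->sender_latch.is_set = true;
}

/*
 * Replays the interleaving from the report and returns true if the sender
 * would end up sleeping with nobody left to wake it.
 */
bool
toy_sender_would_hang(bool recheck_after_reset)
{
    ToyQueue q = {
        .capacity = 8,
        .used = 8,
        .receiver_attached = false,
        .sender_latch = { false }
    };

    /* Step 1: the sender finds the queue full and no receiver attached. */
    bool queue_full = (q.used == q.capacity);

    /* Step 2: the receiver attaches, reads, and sets the sender's latch. */
    toy_receiver_attach_and_read(&q);

    /* Step 3: the sender's wait-for-attach loop resets the latch on exit. */
    q.sender_latch.is_set = false;

    /* Step 4: only the careful sender looks at the queue again. */
    if (recheck_after_reset)
        queue_full = (q.used == q.capacity);

    /* The sender sleeps if it still believes the queue is full; the
     * receiver is idle by now, so the latch will not be set again. */
    return queue_full && !q.sender_latch.is_set;
}

/* Checks both orderings of the toy model. */
void
toy_check_scenarios(void)
{
    assert(toy_sender_would_hang(false));  /* stale view: lost wakeup */
    assert(!toy_sender_would_hang(true));  /* re-check: sender proceeds */
}
```

Calling toy_check_scenarios() from a test program exercises both cases. The design point is the general latch rule: state observed before a latch reset is stale, so it has to be re-read before the next wait.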