Re: [HACKERS] [POC] Faster processing at Gather node
From | Amit Kapila |
---|---|
Subject | Re: [HACKERS] [POC] Faster processing at Gather node |
Date | |
Msg-id | CAA4eK1JExaqaRgT=UwiXqBCj8NhROJefDVO2BVwoNp6w4APYCA@mail.gmail.com |
In response to | Re: [HACKERS] [POC] Faster processing at Gather node (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |

On Fri, May 19, 2017 at 5:58 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, May 19, 2017 at 7:55 AM, Rafia Sabih
> <rafia.sabih@enterprisedb.com> wrote:
>> While analysing the performance of TPC-H queries for the newly developed
>> parallel-operators, viz, parallel index, bitmap heap scan, etc. we noticed
>> that the time taken by gather node is significant. On investigation, as per
>> the current method it copies each tuple to the shared queue and notifies the
>> receiver. Since, this copying is done in shared queue, a lot of locking and
>> latching overhead is there.
>>
>> So, in this POC patch I tried to copy all the tuples in a local queue thus
>> avoiding all the locks and latches. Once, the local queue is filled as per
>> it's capacity, tuples are transferred to the shared queue. Once, all the
>> tuples are transferred the receiver is sent the notification about the same.
>
> What if, instead of doing this, we switched the shm_mq stuff to use atomics?
>

That is one of the very first things we tried, but it didn't show any improvement, probably because sending tuple-by-tuple over shm_mq is not cheap. Independently, we also tried reducing the frequency of SetLatch (used to notify the receiver), but that didn't improve the results either. Now, I think one thing that can be tried is to use atomics in shm_mq and also reduce the frequency of notifying the receiver, but I am not sure whether that can give us better results than this idea.

A couple of other ideas have been tried to improve the speed of Gather, such as avoiding the extra copy of the tuple that we need to make before sending it (tqueueReceiveSlot->ExecMaterializeSlot) and increasing the size of the tuple queue, but none of them has shown any noticeable improvement. I am aware of all this because Dilip and I were involved off-list in brainstorming ideas with Rafia to improve the speed of Gather.
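
To make the comparison concrete, here is a minimal single-process sketch of the local-queue batching idea from the quoted POC description. It is not the patch and does not use the real shm_mq or latch APIs: `SharedQueue`, `LocalQueue`, `shared_send_one`, `local_send`, `local_flush`, and both capacity constants are hypothetical names chosen for illustration, tuples are modelled as plain `int`s, and the two counters merely stand in for the cost of taking the queue lock and of calling SetLatch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOCAL_CAPACITY  64      /* hypothetical batch size */
#define SHARED_CAPACITY 1024    /* hypothetical shared queue size */

/* Stand-in for the shared tuple queue; the counters model overhead only. */
typedef struct SharedQueue
{
    int     items[SHARED_CAPACITY];
    size_t  count;
    size_t  lock_acquisitions;  /* models locking on the shared queue */
    size_t  notifications;      /* models SetLatch calls to the receiver */
} SharedQueue;

/* Backend-private buffer: filling it needs no lock and no latch. */
typedef struct LocalQueue
{
    int     items[LOCAL_CAPACITY];
    size_t  count;
} LocalQueue;

/* Per-tuple path: one lock round trip and one notification per tuple. */
static void
shared_send_one(SharedQueue *sq, int tuple)
{
    assert(sq->count < SHARED_CAPACITY);
    sq->lock_acquisitions++;
    sq->items[sq->count++] = tuple;
    sq->notifications++;
}

/* Move everything buffered locally to the shared queue, then notify once. */
static void
local_flush(LocalQueue *lq, SharedQueue *sq)
{
    if (lq->count == 0)
        return;
    assert(sq->count + lq->count <= SHARED_CAPACITY);
    sq->lock_acquisitions++;
    memcpy(&sq->items[sq->count], lq->items, lq->count * sizeof(int));
    sq->count += lq->count;
    sq->notifications++;
    lq->count = 0;
}

/* Batched path: buffer locally and touch the shared queue only when full. */
static void
local_send(LocalQueue *lq, SharedQueue *sq, int tuple)
{
    lq->items[lq->count++] = tuple;
    if (lq->count == LOCAL_CAPACITY)
        local_flush(lq, sq);
}
```

In this model, sending 200 tuples through `shared_send_one` costs 200 lock acquisitions and 200 notifications, whereas `local_send` followed by a final `local_flush` costs 4 of each (three full batches of 64 plus a remainder of 8). The sketch says nothing about real contention or about how the receiver drains the queue; it only shows where the per-tuple overhead goes away.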

I think it might have been better to show the results of the ideas that didn't work out, but I guess Rafia hasn't shared them on the assumption that nobody would be interested in hearing about what didn't work.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com