Re: Sorting Improvements for 8.4
От | Mark Mielke |
---|---|
Тема | Re: Sorting Improvements for 8.4 |
Дата | |
Msg-id | 4768D4BE.6050700@mark.mielke.cc обсуждение исходный текст |
Ответ на | Re: Sorting Improvements for 8.4 (Jeff Davis <pgsql@j-davis.com>) |
Ответы |
Re: Sorting Improvements for 8.4
Re: Sorting Improvements for 8.4 |
Список | pgsql-hackers |
Jeff Davis wrote: > My first thought would be that we would need a new executor node (e.g. > "ParallelSort") that would only be chosen when the cost of the sort is > large enough to outweigh other factors (such as process creation time, > dividing available work_mem, and any necessary IPC). > > It seems to me the simplest way to do it would be to allow each sub > process to allocate work_mem/P where P is the degree of parallelization. > However, that somewhat works against our schemes for dynamic run > handling and forecasting, and may lead to more random reading from disk. > Any other scheme I can think of would involve more IPC, which might make > the idea just too complex. > I am curious - what algorithms exist to efficiently do a parallel sort? Do you mean if sorting 1 million items, it is possible to separate this into 2 sets of 500 thousand each, execute them in separate threads (with task administration and synchronization overhead) , combine the results, and complete the task in significantly less time than doing it in one thread? I am skeptical that this is possible, and suspect that the overall efficiency of the system would go down even if the throughput of a single execution increases. Or do you mean being able to perform parts of the query plan fully in parallel? If this, then one would need a lot more than ParallelSort... Cheers, mark -- Mark Mielke <mark@mielke.cc>
В списке pgsql-hackers по дате отправления: