Re: Sorting Improvements for 8.4

Поиск

Список

Период

Сортировка

От	Mark Mielke
Тема	Re: Sorting Improvements for 8.4
Дата	19 декабря 2007 г. 04:24:04
Msg-id	4768D4BE.6050700@mark.mielke.cc обсуждение исходный текст
Ответ на	Re: Sorting Improvements for 8.4 (Jeff Davis <pgsql@j-davis.com>)
Ответы	Re: Sorting Improvements for 8.4 Re: Sorting Improvements for 8.4
Список	pgsql-hackers

Дерево обсуждения

Jeff Davis wrote:
> My first thought would be that we would need a new executor node (e.g.
> "ParallelSort") that would only be chosen when the cost of the sort is
> large enough to outweigh other factors (such as process creation time,
> dividing available work_mem, and any necessary IPC).
>
> It seems to me the simplest way to do it would be to allow each sub
> process to allocate work_mem/P where P is the degree of parallelization.
> However, that somewhat works against our schemes for dynamic run
> handling and forecasting, and may lead to more random reading from disk.
> Any other scheme I can think of would involve more IPC, which might make
> the idea just too complex.
>   
I am curious - what algorithms exist to efficiently do a parallel sort? 
Do you mean if sorting 1 million items, it is possible to separate this 
into  2 sets of 500 thousand each, execute them in separate threads 
(with task administration and synchronization overhead) , combine the 
results, and complete the task in significantly less time than doing it 
in one thread? I am skeptical that this is possible, and suspect that 
the overall efficiency of the system would go down even if the 
throughput of a single execution increases.

Or do you mean being able to perform parts of the query plan fully in 
parallel? If this, then one would need a lot more than ParallelSort...

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Sorting Improvements for 8.4