Re: Parallel CREATE INDEX for GIN indexes

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: Parallel CREATE INDEX for GIN indexes
Дата
Msg-id 87ttj7mhib.fsf@163.com
обсуждение исходный текст
Ответ на Re: Parallel CREATE INDEX for GIN indexes  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Ответы Re: Parallel CREATE INDEX for GIN indexes  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Список pgsql-hackers
Hello Tomas,

>>> 2) v20240502-0002-Use-mergesort-in-the-leader-process.patch
>>>
>>> The approach implemented by 0001 works, but there's a little bit of
>>> issue - if there are many distinct keys (e.g. for trigrams that can
>>> happen very easily), the workers will hit the memory limit with only
>>> very short TID lists for most keys. For serial build that means merging
>>> the data into a lot of random places, and in parallel build it means the
>>> leader will have to merge a lot of tiny lists from many sorted rows.
>>>
>>> Which can be quite annoying and expensive, because the leader does so
>>> using qsort() in the serial part. It'd be better to ensure most of the
>>> sorting happens in the workers, and the leader can do a mergesort. But
>>> the mergesort must not happen too often - merging many small lists is
>>> not cheaper than a single qsort (especially when the lists overlap).
>>>
>>> So this patch changes the workers to process the data in two phases. The
>>> first works as before, but the data is flushed into a local tuplesort.
>>> And then each workers sorts the results it produced, and combines them
>>> into results with much larger TID lists, and those results are written
>>> to the shared tuplesort. So the leader only gets very few lists to
>>> combine for a given key - usually just one list per worker.
>> 
>> Hmm, I was hoping we could implement the merging inside the tuplesort
>> itself during its own flush phase, as it could save significantly on
>> IO, and could help other users of tuplesort with deduplication, too.
>> 
>
> Would that happen in the worker or leader process? Because my goal was
> to do the expensive part in the worker, because that's what helps with
> the parallelization.

I guess both of you are talking about worker process, if here are
something in my mind:

*btbuild* also let the WORKER dump the tuples into Sharedsort struct
and let the LEADER merge them directly. I think this aim of this design
is it is potential to save a mergeruns. In the current patch, worker dump
to local tuplesort and mergeruns it and then leader run the merges
again. I admit the goal of this patch is reasonable, but I'm feeling we
need to adapt this way conditionally somehow. and if we find the way, we
can apply it to btbuild as well. 

-- 
Best Regards
Andy Fan




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: consider -Wmissing-variable-declarations
Следующее
От: Rajan Pandey
Дата:
Сообщение: