Re: Re: fix cost subqueryscan wrong parallel cost

From: Richard Guo
Subject: Re: Re: fix cost subqueryscan wrong parallel cost
Msg-id: CAMbWs4_QVQXaTZsUYUdqm8dumCsrDdiSF5Oatg_m7wdrZ8tWZQ@mail.gmail.com
In response to: Re: Re: fix cost subqueryscan wrong parallel cost  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: fix cost subqueryscan wrong parallel cost  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

On Fri, Apr 29, 2022 at 12:53 AM Robert Haas <robertmhaas@gmail.com> wrote:
> Gather doesn't require a parallel aware subpath, just a parallel-safe
> subpath. In a case like this, the parallel seq scan will divide the
> rows from the underlying relation across the three processes executing
> it. Each process will pass the rows it receives through its own copy
> of the subquery scan. Then, the Gather node will collect all the rows
> from all the workers to produce the final result.
>
> It's an extremely important feature of parallel query that the
> parallel-aware node doesn't have to be immediately beneath the Gather.
> You need to have a parallel-aware node in there someplace, but it
> could be separated from the gather by any number of levels e.g.
>
> Gather
> -> Nested Loop
>    -> Nested Loop
>       -> Nested Loop
>          -> Parallel Seq Scan
>          -> Index Scan
>       -> Index Scan
>    -> Index Scan

Thanks for the explanation. It really helps in understanding the
parallel query mechanism.
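
To check my understanding of the mechanism described above, here is a toy
model in Python. It is only a sketch, not executor code: the function names
are made up for illustration, and the round-robin split merely stands in
for however the parallel seq scan really hands out blocks to the processes.

```python
# Toy model of: Gather -> Subquery Scan -> Parallel Seq Scan.
# All names are hypothetical; this is not how the executor is written.

def parallel_seq_scan(table, n_processes):
    """Parallel-aware node: each row goes to exactly one process.
    Round-robin is an assumption that stands in for block-wise handout."""
    return [table[i::n_processes] for i in range(n_processes)]

def subquery_scan(rows, qual):
    """Parallel-safe (not parallel-aware) node: each process runs its own
    copy over only the rows it received, applying the scan's qual."""
    return [r for r in rows if qual(r)]

def gather(per_process_rows):
    """Gather: collect the output of every process into one stream."""
    out = []
    for rows in per_process_rows:
        out.extend(rows)
    return out

table = list(range(12))
chunks = parallel_seq_scan(table, 3)          # 3 processes, 4 rows each
filtered = [subquery_scan(c, lambda r: r % 2 == 0) for c in chunks]
result = gather(filtered)
```

The point of the model is that each copy of the subquery scan sees only
its share of the input (4 of the 12 rows here), while the Gather output
still covers every qualifying row exactly once.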

So for the nodes between the Gather and the parallel-aware node, how
should we calculate their estimated row counts?

Currently the subquery scan uses rel->rows (if there is no
parameterization), which I believe is not correct. That is not the number
of rows the subquery scan node in each worker needs to handle, as the rows
have already been divided across the workers by the parallel-aware node.

Using subpath->rows is not correct either, as the subquery scan node may
have quals of its own that filter out some of the subpath's rows.

It seems to me the right way is to divide rel->rows among all the
workers.
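
As a rough illustration of that idea, here is the arithmetic in Python.
This is a sketch under assumptions, not a patch: parallel_divisor() is
written from my reading of get_parallel_divisor() in costsize.c (each
worker counts as 1, and the leader contributes 1.0 - 0.3 * workers when
that is positive), while subquery_scan_rows_per_worker() is a hypothetical
helper that does not exist in the tree.

```python
# Sketch only. parallel_divisor() follows my reading of
# get_parallel_divisor() in costsize.c; the other function is hypothetical.

def parallel_divisor(parallel_workers, leader_participation=True):
    """Effective number of processes sharing the rows."""
    divisor = float(parallel_workers)
    if leader_participation:
        # The leader is assumed to spend part of its time servicing workers.
        leader_contribution = 1.0 - 0.3 * parallel_workers
        if leader_contribution > 0:
            divisor += leader_contribution
    return divisor

def subquery_scan_rows_per_worker(rel_rows, parallel_workers):
    """Hypothetical: divide rel->rows among the participating processes.
    rel_rows is assumed to already include the subquery scan's quals."""
    if parallel_workers <= 0:
        return rel_rows               # non-parallel path: unchanged
    # Round and clamp to at least one row, in the spirit of clamp_row_est().
    return max(1.0, round(rel_rows / parallel_divisor(parallel_workers)))

est_two_workers = subquery_scan_rows_per_worker(1000, 2)   # 1000 / 2.4
est_four_workers = subquery_scan_rows_per_worker(1000, 4)  # 1000 / 4.0
```

With two workers the divisor is 2.4, so an estimate of 1000 rows for the
rel becomes about 417 rows per process; with four workers the leader's
contribution drops to zero and the estimate is 250.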

Thanks
Richard
