Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query
From | Amit Kapila |
---|---|
Subject | Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query |
Date | |
Msg-id | CAA4eK1+6U0fOLMdMMk-iMC-6RSM+70p-9YqCVnWTEBH=V73Agg@mail.gmail.com |
In response to | Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query |
List | pgsql-bugs |
On Tue, Aug 14, 2018 at 9:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Marko Tiikkaja <marko@joh.to> writes:
>> Marking the function parallel safe doesn't seem wrong to me.  The
>> non-parallel-safe part is that the input gets fed to it in different order
>> in different workers.  And I don't really think that to be the function's
>> fault.
>
> So that basically opens the question of whether *any* window function
> calculation can safely be pushed down to parallel workers.
>

I think we can consider it as a parallel-restricted operation.  For the
purpose of testing, I have marked row_number as parallel-restricted in
pg_proc and I get the below plan:

postgres=# Explain select count(*) from qwr where (a, b) in (select a, row_number() over() from qwr);
                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Aggregate  (cost=46522.12..46522.13 rows=1 width=8)
   ->  Hash Semi Join  (cost=24352.08..46362.12 rows=64001 width=0)
         Hash Cond: ((qwr.a = qwr_1.a) AND (qwr.b = (row_number() OVER (?))))
         ->  Gather  (cost=0.00..18926.01 rows=128002 width=8)
               Workers Planned: 2
               ->  Parallel Seq Scan on qwr  (cost=0.00..18926.01 rows=64001 width=8)
         ->  Hash  (cost=21806.06..21806.06 rows=128002 width=12)
               ->  WindowAgg  (cost=0.00..20526.04 rows=128002 width=12)
                     ->  Gather  (cost=0.00..18926.01 rows=128002 width=4)
                           Workers Planned: 2
                           ->  Parallel Seq Scan on qwr qwr_1  (cost=0.00..18926.01 rows=64001 width=4)
(11 rows)

This seems okay, though the results of the above parallel-execution are
not same as serial-execution.  I think the reason for it is that we don't
get rows in predictable order from workers.

> Somewhat like the LIMIT/OFFSET case, it seems to me that we could only
> expect to do this safely if the row ordering induced by the WINDOW clause
> can be proven to be fully deterministic.  The planner has no such smarts
> at the moment AFAIR.  In principle you could do it if there were
> partitioning/ordering by a primary key, but I'm not excited about the
> prospects of that being true often enough in practice to justify making
> the check.
>

Yeah, I am also not sure if it is worth adding the additional checks.  So,
for now, we can treat any window function calculation as
parallel-restricted and if later anybody has a reason strong enough to
relax the restriction for some particular case, we will consider it.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
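
Illustration (not part of the original message): a minimal Python sketch, not PostgreSQL code, of why the query's result can differ between serial and parallel execution even when the WindowAgg itself runs above the Gather. The contents of `qwr`, the two-worker split, and the strict row-by-row interleaving in `gather` are all assumptions made for the example; a real Gather node returns rows in whatever order the workers happen to produce them.

```python
from itertools import chain


def row_number(values):
    """Mimic `select a, row_number() over ()`: number rows in arrival order."""
    return [(a, n) for n, a in enumerate(values, start=1)]


def semi_join_count(table, subquery_rows):
    """Mimic `select count(*) ... where (a, b) in (subquery)`."""
    wanted = set(subquery_rows)
    return sum(1 for row in table if row in wanted)


def gather(first, second):
    """One possible arrival order: alternate rows from two workers."""
    return list(chain.from_iterable(zip(first, second)))


# Hypothetical stand-in for table qwr(a, b); here b happens to equal a.
qwr = [(n, n) for n in range(1, 9)]
a_values = [a for a, _ in qwr]

# Assumed split of the sequential scan between two workers.
worker_1, worker_2 = a_values[:4], a_values[4:]

# Serial scan: rows reach row_number() in table order.
serial = semi_join_count(qwr, row_number(a_values))

# Two equally valid arrival orders from the parallel scan.
parallel_a = semi_join_count(qwr, row_number(gather(worker_1, worker_2)))
parallel_b = semi_join_count(qwr, row_number(gather(worker_2, worker_1)))

print(serial, parallel_a, parallel_b)  # 8 2 0
```

The same table yields three different counts because `row_number()` with an empty window specification simply numbers rows as they arrive. An ordering that is fully deterministic, such as ordering by a primary key, is what would remove the dependence on arrival order.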