Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
From | Tom Lane |
---|---|
Subject | Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash |
Date | |
Msg-id | 8525.1576007382@sss.pgh.pa.us |
In response to | Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses | Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash |
List | pgsql-bugs |
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> As for the performance impact, I did this:
>
>    create table dim (id int, val text);
>    insert into dim select i, md5(i::text) from generate_series(1,1000000) s(i);
>    create table fact (id int, val text);
>    insert into fact select mod(i,1000000)+1, md5(i::text) from generate_series(1,25000000) s(i);
>    set max_parallel_workers_per_gather = 0;
>    select count(*) from fact join dim using (id);
>
> So a perfectly regular join between 1M and 25M table. On my machine,
> this takes ~8851ms on master and 8979ms with the patch (average of about
> 20 runs with minimal variability). That's ~1.4% regression, so a bit
> more than the 0.4% mentioned before. Not a huge difference though, and
> some of it might be due to different binary layout etc.

Hmm ... I replicated this experiment here, using my usual precautions
to get more-or-less-reproducible numbers [1]. I concur that the
patch seems to be slower, but only by around half a percent on the
median numbers, which is much less than the run-to-run variation.

So that would be fine --- except that in my first set of runs,
I forgot the "set max_parallel_workers_per_gather" step and hence
tested this same data set with a parallel hash join. And in that
scenario, I got a repeatable slowdown of around 7.5%, which is far
above the noise floor. So that's not good --- why does this change
make PHJ worse?

			regards, tom lane

[1] https://www.postgresql.org/message-id/31686.1574722301%40sss.pgh.pa.us
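The ~1.4% figure quoted above follows directly from the two averages. A minimal sketch of that arithmetic, assuming nothing beyond the 8851 ms and 8979 ms timings given in the thread; the helper name `percent_change` is hypothetical and introduced only for illustration:

```python
def percent_change(baseline_ms: float, patched_ms: float) -> float:
    """Relative change of patched vs. baseline, in percent (positive means slower)."""
    return 100.0 * (patched_ms - baseline_ms) / baseline_ms


# Serial hash join averages quoted in the thread (about 20 runs each).
master_ms = 8851.0
patched_ms = 8979.0

slowdown = percent_change(master_ms, patched_ms)
print(f"serial slowdown: {slowdown:.2f}%")
```

The same helper could be applied to median timings from a parallel hash join run to compare against the roughly 7.5% slowdown reported above; those raw timings are not given in the message, so they are not reproduced here.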