Re: Streaming read-ready sequential scan code
From: Alexander Lakhin
Subject: Re: Streaming read-ready sequential scan code
Msg-id: d1fb5c09-dd03-2540-9ec2-86dbfdfa2c65@gmail.com
In reply to: Re: Streaming read-ready sequential scan code (Thomas Munro <thomas.munro@gmail.com>)
Replies: Re: Streaming read-ready sequential scan code
List: pgsql-hackers
Hello Thomas,

18.05.2024 07:47, Thomas Munro wrote:
> After more debugging, we learned a lot more things...
>
> 1. That query produces spectacularly bad estimates, so we finish up
> having to increase the number of buckets in a parallel hash join many
> times. That is quite interesting, but unrelated to new code.
> 2. Parallel hash join is quite slow at negotiating an increase in the
> number of hash buckets, if all of the input tuples are being filtered
> out by quals, because of the choice of where workers check for
> PHJ_GROWTH_NEED_MORE_BUCKETS. That could be improved quite easily I
> think. I have put that on my todo list 'cause that's also my code,
> but it's not a new issue, it's just one that is now highlighted...
> 3. This bit of read_stream.c is exacerbating unfairness in the
> underlying scan, so that 1 and 2 come together and produce a nasty
> slowdown, which goes away if you change it like so:
>
> - BlockNumber blocknums[16];
> + BlockNumber blocknums[1];
>
> I will follow up after some more study.

Thank you for the information!

Unfortunately, I can't see significant differences in my environment with
parallel_leader_participation=off. With blocknums[1], the timing changes,
but the effect is not consistent.

10 query15 executions in a row, b7b0f3f27:
277.932 ms
281.805 ms
278.335 ms
281.565 ms
284.167 ms
283.171 ms
281.165 ms
281.615 ms
285.394 ms
277.301 ms

b7b0f3f27~1:
159.789 ms
165.407 ms
160.893 ms
159.343 ms
160.936 ms
161.577 ms
161.637 ms
163.421 ms
163.143 ms
167.109 ms

b7b0f3f27 + blocknums[1]:
164.133 ms
280.920 ms
160.748 ms
163.182 ms
161.709 ms
161.998 ms
161.239 ms
276.256 ms
161.601 ms
160.384 ms

I placed PGDATA on tmpfs to rule out any block-device specifics
(increasing the blockdev readahead from 256 to 4096 didn't help me with
PGDATA on NVMe either).

Best regards,
Alexander
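
P.S. To make point 2 easier to follow, here is a minimal C sketch of the
check-placement issue, assuming hypothetical helper names (next_tuple,
passes_quals, growth_flag, and so on); this is not the actual nodeHash.c
code. If the growth check sits only on the insert path, a worker whose
tuples are all filtered out by quals never reaches it, so it cannot
promptly join the negotiated bucket increase:

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical stand-ins for the shared hash join state. */
    typedef struct Tuple Tuple;

    enum { PHJ_GROWTH_OK, PHJ_GROWTH_NEED_MORE_BUCKETS };

    extern Tuple *next_tuple(void);
    extern bool passes_quals(Tuple *tup);
    extern int growth_flag(void);
    extern void help_increase_buckets(void);
    extern void hash_insert(Tuple *tup);

    void
    build_loop(void)
    {
        Tuple *tup;

        while ((tup = next_tuple()) != NULL)
        {
            /* Filtered tuples skip straight to the next one... */
            if (!passes_quals(tup))
                continue;

            /* ...so this check is never reached while filtering. */
            if (growth_flag() == PHJ_GROWTH_NEED_MORE_BUCKETS)
                help_increase_buckets();

            hash_insert(tup);
        }
    }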
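
The blocknums[16] -> blocknums[1] experiment in point 3 changes how many
block numbers one worker claims from the shared scan position in a single
step. A sketch of that batching, again with hypothetical names rather than
the real read_stream.c or parallel scan code:

    #include <stdatomic.h>
    #include <stdint.h>

    typedef uint32_t BlockNumber;

    typedef struct SharedScan
    {
        _Atomic BlockNumber next_block; /* next unassigned block */
        BlockNumber nblocks;            /* blocks in the relation */
    } SharedScan;

    /*
     * Claim up to 'batch' consecutive blocks for one worker.  With
     * batch = 16, each worker walks off with a 16-block run, so the
     * distribution of blocks (and of tuples passing the quals) across
     * workers can become uneven; with batch = 1, assignment interleaves
     * block by block, which is what blocknums[1] restores.
     */
    int
    claim_blocks(SharedScan *scan, BlockNumber *out, int batch)
    {
        BlockNumber start = atomic_fetch_add(&scan->next_block, batch);
        int n = 0;

        for (int i = 0; i < batch && start + i < scan->nblocks; i++)
            out[n++] = start + i;
        return n;
    }

Whether the unfairness in the real code follows exactly this shape is for
Thomas's follow-up to confirm; the sketch only shows why a smaller batch
interleaves the scan more evenly across workers.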