Re: Fix missing EvalPlanQual recheck for TID scans
От | David Rowley |
---|---|
Тема | Re: Fix missing EvalPlanQual recheck for TID scans |
Дата | |
Msg-id | CAApHDvoFutgRZcC-b5cw-DbPsYwT62k_Ut95MRqEGKAxqHm6vw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Fix missing EvalPlanQual recheck for TID scans ("Sophie Alpert" <pg@sophiebits.com>) |
Ответы |
Re: Fix missing EvalPlanQual recheck for TID scans
Re: Fix missing EvalPlanQual recheck for TID scans |
Список | pgsql-hackers |
On Tue, 9 Sept 2025 at 04:36, Sophie Alpert <pg@sophiebits.com> wrote: > I've added a new trss_boundsInitialized flag such that we calculate the range once per EPQ rescan. In order to preservethe semantics when the min or max is NULL, I'm setting trss_mintid/trss_maxtid to have invalid ItemPointers as asentinel in the cases where TidRangeEval returns false. I added a ItemPointerIsValid assertion given that it's now morerelevant to correctness but I can remove it if it feels superfluous. Let me know if there is a more idiomatic way totreat this. This part seems a bit pointless as EPQ queries will only return 1 row anyway (Check EvalPlanQual() where it calls EvalPlanQualNext()). There will be a rescan, which will wipe out the range before the recheck function is ever called again. I think if we want to do better at not continuously recalculating the TIDList or range, then something more generic might be better. If the planner were to note down which ParamIds exist in the TID exprs, we could just mark that the TID List / Range needs to be recalculated whenever one of those parameters changes. That should help in non-EPQ cases too. For the v2 patch, I've hacked on that a bit and stripped out the trss_boundsInitialized stuff and just make it so we recalculate the TID List/Range on every recheck. I also added another isolation test permutation to have s1 rollback and ensure that s2 updates the ctid = '(0,1)' tuple. The attached v3-0001 is the updated v2 patch, and v3-0002 is a POC of what I described above. Seems there is something to it as the performance is better. It is a very contrived test case, however. create table empty(a int); create table million (a int); insert into million select generate_series(1,1000000); set max_parallel_Workers_per_gather=0; set enable_seqscan=0; set enable_material=0; set jit=0; select sum(c) from million m left join lateral (select count(*) c from empty where ctid in ('(1,1)','(1,2)','(1,3)','(1,4)','(1,5)','(1,6)','(1,7)','(1,8)','(1,9)','(1,10)','(1,11)','(1,12)','(1,13)','(1,14)','(1,15)','(1,16)','(1,17)','(1,18)','(1,19)','(1,20)','(1,21)','(1,22)','(1,23)','(1,24)','(1,25)','(1,26)','(1,27)','(1,28)','(1,29)','(1,30)','(1,31)','(1,32)','(1,33)','(1,34)','(1,35)','(1,36)','(1,37)','(1,38)','(1,39)','(1,40)','(1,41)','(1,42)','(1,43)','(1,44)','(1,45)','(1,46)','(1,47)','(1,48)','(1,49)','(1,50)','(1,51)','(1,52)','(1,53)','(1,54)','(1,55)','(1,56)','(1,57)','(1,58)','(1,59)','(1,60)','(1,61)','(1,62)','(1,63)','(1,64)','(1,65)','(1,66)','(1,67)','(1,68)','(1,69)','(1,70)','(1,71)','(1,72)','(1,73)','(1,74)','(1,75)','(1,76)','(1,77)','(1,78)','(1,79)','(1,80)','(1,81)','(1,82)','(1,83)','(1,84)','(1,85)','(1,86)','(1,87)','(1,88)','(1,89)','(1,90)','(1,91)','(1,92)','(1,93)','(1,94)','(1,95)','(1,96)','(1,97)','(1,98)','(1,99)','(1,100)')) on 1=1; master: Time: 613.541 ms Time: 621.037 ms Time: 623.430 ms master + v3-0002: Time: 298.863 ms Time: 298.015 ms Time: 297.172 ms Overall, I'm keen to get moving fixing the initial reported problem. Maybe the 0002 part can be done in master or not at all. 0002 modifies the TidScan and TidRangeScan, which we can't really do in the back branches anyway. David
Вложения
В списке pgsql-hackers по дате отправления: