Re: Todo: Teach planner to evaluate multiple windows in the optimal order
От | Ankit Kumar Pandey |
---|---|
Тема | Re: Todo: Teach planner to evaluate multiple windows in the optimal order |
Дата | |
Msg-id | 2dfc2b6f-d05f-a11d-7732-563712c454b5@gmail.com обсуждение исходный текст |
Ответ на | Re: Todo: Teach planner to evaluate multiple windows in the optimal order (David Rowley <dgrowleyml@gmail.com>) |
Список | pgsql-hackers |
> On 16/01/23 09:48, David Rowley wrote: > I don't think we should touch this. It could significantly increase > the number of indexes that we consider when generating paths on base > relations and therefore *significantly* increase the number of paths > we consider during the join search. What I had in mind was just > making better use of existing paths to see if we can find a cheaper > way to perform the DISTINCT. That'll only possibly increase the > number of paths for the distinct upper relation which really only > increases the number of paths which are considered in > create_ordered_paths(). That's unlikely to cause much of a slowdown in > the planner. Okay, I see the issue. Makes sense > I'm seeing these two things as separate patches. I don't think there's > any need to add further complexity to the patch that tries to reduce > the number of sorts for WindowAggs. I think you'd better start a new > thread for this. Will be starting new thread for this and separate patch. > As far as I see it, you shouldn't be touching the distinct_pathkeys. > Those are set in such a way as to minimise the likelihood of an > additional sort for the ORDER BY. If you've fiddled with that, then I > imagine this is why the plan below has an additional Incremental Sort > that didn't exist before. > I've not looked at your patch, but all I imagine you need to do for it > is to invent a function in pathkeys.c which is along the lines of what > pathkeys_count_contained_in() does, but returns a List of pathkeys > which are in keys1 but not in keys2 and NIL if keys2 has a pathkey > that does not exist as a pathkey in keys1. In > create_final_distinct_paths(), you can then perform an incremental > sort on any input_path which has a non-empty return list and in > create_incremental_sort_path(), you'll pass presorted_keys as the > number of pathkeys in the path, and the required pathkeys the > input_path->pathkeys + the pathkeys returned from the new function. Okay, this should be straightforward. Let me try this. > As an optimization, you might want to consider that the > distinct_pathkeys list might be long and that the new function, if you > code the lookup as a nested loop, might be slow. You might want to > consider hashing the distinct_pathkeys once in > create_final_distinct_paths(), then for each input_path, perform a > series of hash lookups to see which of the input_path->pathkeys are in > the hash table. That might require adding two functions to pathkeys.c, > one to build the hash table and then another to probe it and return > the remaining pathkeys list. I'd go and make sure it all works as we > expect before going to the trouble of trying to optimize this. A > simple nested loop lookup will allow us to review that this works as > we expect. Okay make sense, will start with nested loop while it is in review and then optimal version once it is all good to go. Thanks, Ankit
В списке pgsql-hackers по дате отправления: