Re: KNN-GiST with recheck
От | Emre Hasegeli |
---|---|
Тема | Re: KNN-GiST with recheck |
Дата | |
Msg-id | 20141006093609.GA12899@hasegeli.com обсуждение исходный текст |
Ответ на | Re: KNN-GiST with recheck (Alexander Korotkov <aekorotkov@gmail.com>) |
Ответы |
Re: KNN-GiST with recheck
(Heikki Linnakangas <hlinnakangas@vmware.com>)
|
Список | pgsql-hackers |
> Thanks. The main question now is design of this patch. Currently, it does > all the work inside access method. We already have some discussion of pro > and cons of this method. I would like to clarify alternatives now. I can > see following way: > > 1. Implement new executor node which performs sorting by priority queue. > Let's call it "Priority queue". I think it should be separate node from > "Sort" node. Despite "Priority queue" and "Sort" are essentially similar > from user view, they would be completely different in implementation. > 2. Implement some interface to transfer distance values from access > method to "Priority queue" node. If we assume that all of them need recheck, maybe it can be done without passing distance values. > 3. Somehow tell the planner that it could use "Priority queue" in > corresponding cases. I see two ways of doing this: > - Add flag to operator in opclass indicating that index can only > order by lower bound of "col op value", not by "col op value" itself. > - Define new relation between operators. Value of one operator could > be lower bound for value of another operator. So, planner can > put "Priority > queue" node when lower bound ordering is possible from index. Also "ALTER > OPERATOR" command would be reasonable, so extensions could upgrade. I think, it would be better to make it a property of the operator class. We can add a column to pg_amop or define another value for amoppurpose on pg_amop. Syntax can be something like this: CREATE OPERATOR CLASS circle_ops DEFAULT FOR TYPE circle USING gist AS OPERATOR 15 <->(circle, point) FOR ORDER BY pg_catalog.float_opsLOWER BOUND; While looking at it, I realize that current version of the patch does not use the sort operator family defined with the operator class. It assumes that the distance function will return values compatible with the operator. Operator class definition makes me think that there is not such an assumption. > Besides overhead, this way makes significant infrastructural changes. So, > it may be over-engineering. However, it's probably more clean and beautiful > solution. > I would like to get some feedback from people familiar with KNN-GiST like > Heikki or Tom. What do you think about this? Any other ideas? I would be happy to test and review the changes. I think it is nice to solve the problem in a generalized way improving the access method infrastructure. Definitely, we should have a consensus on the design before working on the infrastructure changes.
В списке pgsql-hackers по дате отправления: