Обсуждение: [HACKERS] GiST API Adancement
Hi, Dong! 2017-06-15 21:19 GMT+05:00 Yuan Dong <Doffery@hotmail.com>: > I'm going to hack on my own. With the help of Andrew Borodin, I want to > start the project with adding a third state to collision check. The third > state is that: > subtree is totally within the query. In this case, GiST scan can scan down > subtree without key checks. That's fine! > After reading some code, I get this plan to modify the code: > > 1. Modify the consistent function of datatypes supported by GiST. > > 1.1 Start with cube, g_cube_consistent should return a third state when > calling g_cube_internal_consistent. Inspired by Andrew Borodin, I have two > solutions, 1)modify return value, 2) modify a reference type parameter. > > 2. Modify the gistindex_keytest in gistget.c to return multiple states > > 2.1 Need declare a new enum type(or define values) in gist_private.h or > somewhere > > 3. Modify the gitsScanPage in gistget.c to deal with the extra state > > 4. Add a state to mark the nodes under this GISTSearchItem are all with in > the query > > 4.1 Two ways to do this: 1) add a state to GISTSearchItem (prefered) 2) > Somewhere else to record all this kind of items > > 4.2 We may use the block number as the key > > 4.3 Next time when the gistScanPage met this item, just find all the leaves > directly(need a new function) > > After this, I shall start to benchmark the improvement and edit the code of > other datatypes. > > Hope you hackers can give me some suggestions~ I think there is one more aspect of development: backward compatibility: it's impossible to update all existing extensions. This is not that major feature to ignore them. Though for benchmarking purposes backward compatibility can be omitted. Best regards, Andrey Borodin
Hi Andrey,
Thank you for your suggestion.
I should still maintain original API of GiST after modification.
--
Dong
发送时间: 2017年6月16日 6:24:03
收件人: Yuan Dong
抄送: pgsql-hackers@postgresql.org
主题: Re: GiST API Adancement
2017-06-15 21:19 GMT+05:00 Yuan Dong <Doffery@hotmail.com>:
> I'm going to hack on my own. With the help of Andrew Borodin, I want to
> start the project with adding a third state to collision check. The third
> state is that:
> subtree is totally within the query. In this case, GiST scan can scan down
> subtree without key checks.
That's fine!
> After reading some code, I get this plan to modify the code:
>
> 1. Modify the consistent function of datatypes supported by GiST.
>
> 1.1 Start with cube, g_cube_consistent should return a third state when
> calling g_cube_internal_consistent. Inspired by Andrew Borodin, I have two
> solutions, 1)modify return value, 2) modify a reference type parameter.
>
> 2. Modify the gistindex_keytest in gistget.c to return multiple states
>
> 2.1 Need declare a new enum type(or define values) in gist_private.h or
> somewhere
>
> 3. Modify the gitsScanPage in gistget.c to deal with the extra state
>
> 4. Add a state to mark the nodes under this GISTSearchItem are all with in
> the query
>
> 4.1 Two ways to do this: 1) add a state to GISTSearchItem (prefered) 2)
> Somewhere else to record all this kind of items
>
> 4.2 We may use the block number as the key
>
> 4.3 Next time when the gistScanPage met this item, just find all the leaves
> directly(need a new function)
>
> After this, I shall start to benchmark the improvement and edit the code of
> other datatypes.
>
> Hope you hackers can give me some suggestions~
I think there is one more aspect of development: backward
compatibility: it's impossible to update all existing extensions. This
is not that major feature to ignore them.
Though for benchmarking purposes backward compatibility can be omitted.
Best regards, Andrey Borodin
Yuan Dong <Doffery@hotmail.com> writes: > ·¢¼þÈË: Andrew Borodin <borodin@octonica.com> >> I think there is one more aspect of development: backward >> compatibility: it's impossible to update all existing extensions. This >> is not that major feature to ignore them. > I should still maintain original API of GiST after modification. To put a little context on that: we have always felt that it's okay to require extension authors to make small adjustments when going to a new major release. For instance, adding a new parameter to a globally visible function is fine, especially if callers can just pass NULL or some such to get the old behavior. So in the context here, you shouldn't feel compelled to come up with a bizarre API design just to preserve exact compatibility of old code. You should indeed think about reducing the amount of work that extension authors have to do to update, but that doesn't have to mean "zero". Also, it's wise to make sure that any places where code changes have to be made will result in compile errors if the change isn't made. regards, tom lane
2017-06-16 17:06 GMT+03:00 Tom Lane <tgl@sss.pgh.pa.us>: > Yuan Dong <Doffery@hotmail.com> writes: >> ·¢¼þÈË: Andrew Borodin <borodin@octonica.com> >>> I think there is one more aspect of development: backward >>> compatibility: it's impossible to update all existing extensions. This >>> is not that major feature to ignore them. > >> I should still maintain original API of GiST after modification. > > To put a little context on that: we have always felt that it's okay > to require extension authors to make small adjustments when going to > a new major release. Then maybe it worth to consider more general API advancement at once? Here's my wish list: 1. Choose subtree function as an alternative for penalty 2. Bulk load support 3. Better split framework: now many extensions peek two seeds and spread split candidates among them, some with subtle but noisy mistakes 4. Better way of key compares (this thread, actually) 5. Split in many portions Best regards, Andrey Borodin.
2017-06-16 17:06 GMT+03:00 Tom Lane <tgl@sss.pgh.pa.us>:
> Yuan Dong <Doffery@hotmail.com> writes:
>> ·¢¼þÈË: Andrew Borodin <borodin@octonica.com>
>>> I think there is one more aspect of development: backward
>>> compatibility: it's impossible to update all existing extensions. This
>>> is not that major feature to ignore them.
>
>> I should still maintain original API of GiST after modification.
>
> To put a little context on that: we have always felt that it's okay
> to require extension authors to make small adjustments when going to
> a new major release.
Tom's comment is very helpful, I will keep it in mind.
I do not understand the problem of choosing subtree as an alternative for penalty. What do yo mean by choosing subtree function?
There will be more questions coming, hope you will be comfortable with that~
Best wishes.
--
Dong
发送时间: 2017年6月16日 16:48:04
收件人: Tom Lane
抄送: Yuan Dong; pgsql-hackers@postgresql.org
主题: Re: [HACKERS] 答复: GiST API Adancement
> Yuan Dong <Doffery@hotmail.com> writes:
>> ·¢¼þÈË: Andrew Borodin <borodin@octonica.com>
>>> I think there is one more aspect of development: backward
>>> compatibility: it's impossible to update all existing extensions. This
>>> is not that major feature to ignore them.
>
>> I should still maintain original API of GiST after modification.
>
> To put a little context on that: we have always felt that it's okay
> to require extension authors to make small adjustments when going to
> a new major release.
Then maybe it worth to consider more general API advancement at once?
Here's my wish list:
1. Choose subtree function as an alternative for penalty
2. Bulk load support
3. Better split framework: now many extensions peek two seeds and
spread split candidates among them, some with subtle but noisy
mistakes
4. Better way of key compares (this thread, actually)
5. Split in many portions
Best regards, Andrey Borodin.