Re: [v9.2] make_greater_string() does not return a string in some cases
From | Robert Haas |
---|---|
Subject | Re: [v9.2] make_greater_string() does not return a string in some cases |
Date | |
Msg-id | CA+Tgmoax-SHNgHe77cJZGsqgsB+Z=n_jzQZ5h0RG1+NcWGHkBg@mail.gmail.com |
In reply to | Re: [v9.2] make_greater_string() does not return a string in some cases (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [v9.2] make_greater_string() does not return a string in some cases
Re: [v9.2] make_greater_string() does not return a string in some cases |
List | pgsql-hackers |
On Thu, Sep 22, 2011 at 12:24 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I'm a bit perplexed as to why we can't find a non-stochastic way of doing this.
>
> [ collations suck ]

Ugh.

> Now, having said that, I'm starting to wonder again why it's worth our
> trouble to fool with encoding-specific incrementers. The exactness of
> the estimates seems unlikely to be improved very much by doing this.

Well, so the problem is that the frequency with which the algorithm fails altogether seems to be disturbingly high for certain kinds of characters. I agree it might not be that important to get the absolutely best next string, but it does seem important not to fail outright. Kyotaro Horiguchi gives the example of UTF-8 characters ending with 0xbf.

>>>> One random idea I have is - instead of generating > and < clauses,
>>>> could we define a "prefix match" operator - i.e. a ### b iff substr(a,
>>>> 1, length(b)) = b? We'd need to do something about the selectivity,
>>>> but I don't see why that would be a problem.
>
>>> The problem is that you'd need to make that a btree-indexable operator.
>
>> Well, right. Without that, there's not much point. But do you think
>> that's prohibitively difficult?
>
> The problem is that you'd just be shifting all these same issues into
> the btree index machinery, which is not any better equipped to cope with
> them, and would not be a good place to be adding overhead.

My thought was that it would avoid the need to do any character incrementing at all. You could just start scanning forward as if the operator were >= and then stop when you hit the first string that doesn't have the same initial substring.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
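
For readers following the thread, the scan-forward idea in the last paragraph of the message can be illustrated with a rough sketch. This is not PostgreSQL code and not anything proposed in the thread: `prefix_scan` and the sample keys are hypothetical, and a sorted Python list stands in for a btree index. It also assumes plain bytewise ordering, under which all strings sharing a prefix sit next to each other; with other collations that contiguity may not hold, which is the concern raised in the quoted reply.

```python
from bisect import bisect_left


def prefix_scan(sorted_keys, prefix):
    """Collect keys that start with `prefix` from a bytewise-sorted list.

    Hypothetical stand-in for an index scan: position at the first
    key >= prefix, walk forward, and stop at the first key that no
    longer begins with the prefix. No "next greater string" is computed.
    """
    matches = []
    # Where a ">= prefix" scan would start.
    i = bisect_left(sorted_keys, prefix)
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        matches.append(sorted_keys[i])
        i += 1
    return matches


# Sample data (made up), including a key whose last UTF-8 byte is 0xbf.
keys = sorted([b"abc", b"abd", b"ab\xc2\xbf", b"ab", b"ac", b"b"])
result = prefix_scan(keys, b"ab")
```

The point of the sketch is only that the upper bound falls out of the scan itself, so nothing has to be incremented, whatever bytes the prefix ends with.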