Re: [v9.2] make_greater_string() does not return a string in some cases

From	Robert Haas
Subject	Re: [v9.2] make_greater_string() does not return a string in some cases
Date	September 22, 2011, 09:50:05
Msg-id	CA+Tgmoax-SHNgHe77cJZGsqgsB+Z=n_jzQZ5h0RG1+NcWGHkBg@mail.gmail.com
In reply to	Re: [v9.2] make_greater_string() does not return a string in some cases (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [v9.2] make_greater_string() does not return a string in some cases Re: [v9.2] make_greater_string() does not return a string in some cases
List	pgsql-hackers

On Thu, Sep 22, 2011 at 12:24 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I'm a bit perplexed as to why we can't find a non-stochastic way of doing this.
>
> [ collations suck ]

Ugh.

> Now, having said that, I'm starting to wonder again why it's worth our
> trouble to fool with encoding-specific incrementers.  The exactness of
> the estimates seems unlikely to be improved very much by doing this.

Well, so the problem is that the frequency with which the algorithm
fails altogether seems to be disturbingly high for certain kinds of
characters.  I agree it might not be that important to get the
absolutely best next string, but it does seem important not to fail
outright.  Kyotaro Horiguchi gives the example of UTF-8 characters
ending with 0xbf.
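
To make the 0xbf case concrete, here is a rough sketch of what goes wrong when an
incrementer only bumps the final byte. This is not the PostgreSQL code; the
helper name naive_increment_last_byte is made up for illustration, and it
assumes the simplest possible strategy (add one to the last byte, give up if
the result is not valid UTF-8). UTF-8 continuation bytes run from 0x80 to 0xbf,
so adding one to 0xbf yields 0xc0, which cannot appear in that position.

```python
from typing import Optional


def naive_increment_last_byte(s: str) -> Optional[str]:
    """Hypothetical helper: bump only the final byte of the UTF-8 encoding.

    Returns the incremented string, or None when the result is not valid
    UTF-8 (the "fails outright" case discussed above).
    """
    raw = bytearray(s.encode("utf-8"))
    if not raw or raw[-1] == 0xFF:
        return None  # nothing to increment, or the byte would overflow
    raw[-1] += 1
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # e.g. a continuation byte 0xbf became 0xc0, which is not a
        # legal continuation byte
        return None


if __name__ == "__main__":
    for text in ["a", "\u00e9", "\u00bf", "\u00ff"]:
        encoded = text.encode("utf-8").hex()
        print(repr(text), encoded, "->", repr(naive_increment_last_byte(text)))
```

An encoding-aware incrementer would presumably carry into the preceding byte
rather than give up here, which is the sort of thing the encoding-specific
incrementers under discussion are meant to handle.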

>>>> One random idea I have is - instead of generating > and < clauses,
>>>> could we define a "prefix match" operator - i.e. a ### b iff substr(a,
>>>> 1, length(b)) = b?  We'd need to do something about the selectivity,
>>>> but I don't see why that would be a problem.
>
>>> The problem is that you'd need to make that a btree-indexable operator.
>
>> Well, right.  Without that, there's not much point.  But do you think
>> that's prohibitively difficult?
>
> The problem is that you'd just be shifting all these same issues into
> the btree index machinery, which is not any better equipped to cope with
> them, and would not be a good place to be adding overhead.

My thought was that it would avoid the need to do any character
incrementing at all.  You could just start scanning forward as if the
operator were >= and then stop when you hit the first string that
doesn't have the same initial substring.
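
Roughly what I have in mind, sketched over a sorted Python list standing in for
the index. The function name prefix_scan and the list-based "index" are
assumptions for illustration only, not anything in the btree code. It also
assumes plain code-point ordering, under which all strings sharing a prefix sit
next to each other; a collation-aware ordering may not have that property.

```python
from bisect import bisect_left
from typing import List, Sequence


def prefix_scan(sorted_keys: Sequence[str], prefix: str) -> List[str]:
    """Hypothetical sketch: return the keys that start with `prefix`.

    `sorted_keys` must be sorted by code point (Python's default str order).
    """
    # Position the scan as if the condition were key >= prefix.
    start = bisect_left(sorted_keys, prefix)
    matches = []
    for key in sorted_keys[start:]:
        # Stop at the first key that no longer shares the prefix;
        # no "next greater string" is ever computed.
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches


if __name__ == "__main__":
    keys = sorted(["a", "ab", "abc", "abd", "ab\u00bf", "ac", "b"])
    print(prefix_scan(keys, "ab"))
    print(prefix_scan(keys, "ab\u00bf"))
```

Note that the scan needs only a lower bound plus a per-key prefix check, so a
prefix ending in a character like U+00BF is no harder than any other.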

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
