Re: PATCH: CITEXT 2.0
From | David E. Wheeler |
---|---|
Subject | Re: PATCH: CITEXT 2.0 |
Date | |
Msg-id | 8E2D49F2-E366-4504-9428-0AB6F35468FA@kineticode.com |
In reply to | Re: PATCH: CITEXT 2.0 (Zdenek Kotala <Zdenek.Kotala@Sun.COM>) |
List | pgsql-hackers |
On Jul 7, 2008, at 12:46, Zdenek Kotala wrote:

>> So, the upshot is that the = and <> operators are not locale-aware,
>> yes? They just do byte comparisons. Is that really the way it
>> should be? I mean, could there not be strings that are equivalent
>> but have different bytes?
>
> Correct. The problem is complex. It works fine only for normalized
> strings. But postgres now assumes that all utf8 strings are normalized.

I see. So binary equivalence is okay, in that case.

> If you need to implement < <= >= > operators you need to use strcoll
> which takes care of locale collation.

Which varstr_cmp() does, I guess. It's what textlt uses, for example.

> See unicode collation algorithm http://www.unicode.org/reports/tr10/

Wow, that looks like a fun read.

Best,

David
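
To make the distinction in this exchange concrete, here is a minimal standalone sketch in C. It is not the citext patch and does not reproduce varstr_cmp(); the helper names bytes_equal and locale_less and the two sample strings are hypothetical, chosen only for illustration. It shows equality as a plain byte comparison, which treats the precomposed and decomposed spellings of "é" as different, and ordering delegated to the C library's strcoll(), which follows the current LC_COLLATE locale.

```c
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: equality as a raw byte comparison
 * (same length and identical bytes), with no locale involved. */
static bool bytes_equal(const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);

    return la == lb && memcmp(a, b, la) == 0;
}

/* Hypothetical helper: "less than" through strcoll(), which
 * compares according to the current LC_COLLATE locale. */
static bool locale_less(const char *a, const char *b)
{
    return strcoll(a, b) < 0;
}

/* The same user-visible character in two UTF-8 encodings:
 * precomposed U+00E9, and "e" followed by combining U+0301. */
static const char *E_ACUTE_NFC = "\xC3\xA9";
static const char *E_ACUTE_NFD = "e\xCC\x81";
```

A caller would first select a collation, for example with setlocale(LC_COLLATE, ""), and then call locale_less(); the resulting order depends on which locales are installed. bytes_equal(E_ACUTE_NFC, E_ACUTE_NFD) returns false even though the two strings are canonically equivalent, which is why byte equality is only safe when all stored strings are assumed to be normalized.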