Re: Unicode Normalization
From | David E. Wheeler |
---|---|
Subject | Re: Unicode Normalization |
Date | |
Msg-id | 9BD6C83B-018E-4263-9EC8-33344FEDF655@kineticode.com |
In reply to | Unicode Normalization ("David E. Wheeler" <david@kineticode.com>) |
Responses | Re: Unicode Normalization |
List | pgsql-hackers |
On Sep 24, 2009, at 6:24 AM, pg@thetdh.com wrote:

> In a context using normalization, wouldn't you typically want to
> store a normalized-text type that could perhaps (depending on
> locale) take advantage of simpler, more-efficient comparison
> functions?

That might be nice, but I'd be wary of a geometric multiplication of text types. We already have TEXT and CITEXT; what if we had your NTEXT (normalized text) but I wanted it to also be case-insensitive?

> Whether you're doing INSERT/UPDATE, or importing a flat text file,
> if you canonicalize characters and substrings of identical meaning
> when trivial distinctions of encoding are irrelevant, you're better
> off later. User-invocable normalization functions by themselves
> don't make much sense.

Well, they make sense because there's nothing else right now. It's an easy way to get some support in, and besides, it's mandated by the SQL standard.

> (If Postgres now supports binary- or mixed-binary-and-text flat
> files, perhaps for restore purposes, the same thing applies.)

Don't follow this bit.

Best,

David
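
For readers following the thread, here is a rough sketch of the trade-off discussed above: canonicalizing text once at write time makes later comparisons a plain equality check, and case-insensitivity is a second, independent transformation layered on top. It uses Python's standard `unicodedata` module purely as an illustration. The helper names `normalize` and `normalize_ci` are hypothetical, and nothing here reflects how a Postgres normalization function or an NTEXT type would actually be implemented.

```python
import unicodedata


def normalize(text, form="NFC"):
    """Hypothetical helper: canonicalize text once, e.g. before storing it."""
    return unicodedata.normalize(form, text)


def normalize_ci(text, form="NFC"):
    """Hypothetical helper: normalized *and* case-insensitive key.

    Case folding can produce non-normalized output, so normalize again
    after folding.
    """
    folded = unicodedata.normalize(form, text).casefold()
    return unicodedata.normalize(form, folded)


# Two encodings of the same visible string "café":
composed = "caf\u00e9"      # precomposed U+00E9
decomposed = "cafe\u0301"   # "e" followed by combining acute accent U+0301

# A byte-wise comparison treats them as different strings.
raw_equal = composed == decomposed

# After normalization a plain equality check is enough.
normalized_equal = normalize(composed) == normalize(decomposed)

# Normalization and case-insensitivity are independent concerns,
# which is why combining them multiplies the number of text types.
ci_equal = normalize_ci("CAF\u00c9") == normalize_ci(decomposed)
```

The two helpers compose rather than replace each other, which is the concern about multiplying types: each independent property (normalized, case-insensitive) would otherwise need its own type and every combination of them.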