Re: multibyte-character aware support for function "downcase_truncate_identifier()"
From | Tom Lane |
---|---|
Subject | Re: multibyte-character aware support for function "downcase_truncate_identifier()" |
Date | |
Msg-id | 26799.1290375695@sss.pgh.pa.us |
In response to | Re: multibyte-character aware support for function "downcase_truncate_identifier()" (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: multibyte-character aware support for function "downcase_truncate_identifier()"
|
List | pgsql-hackers |
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jul 7, 2010 at 10:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> IIRC this is intentional. Please consult the archives for previous
>> discussions.

> Why would this be intentional?

Well, it's intentional for lack of any infrastructure that would allow a more spec-compliant approach. As you say, calling str_tolower here is probably a non-starter for performance reasons.

Another big problem is that str_tolower produces a locale-specific downcasing conversion. This (a) is going to create portability headaches of the first magnitude, and (b) is not really an advance in terms of spec compliance. The SQL spec says that identifier case folding should be done according to the Unicode standard, but it's not safe to assume that any random platform-specific locale is going to act that way.

A specific example of a locale that is known to NOT behave acceptably is Turkish: they have weird ideas about i versus I, which in fact broke things back when we used to use tolower for this purpose. See the archives from early 2004, and in particular commit 59f9a0b9df0d224bb62ff8ec5b65e0b187655742, which removed the exact same logic (though not wide-character-aware) that this patch proposes to put back.

I think the given patch can be rejected out of hand. If the OP has any ideas about doing non-locale-dependent case folding at an acceptable speed, I'm happy to listen.

regards, tom lane
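
For readers following the thread, here is a minimal sketch of what folding that never consults the locale could look like if it is restricted to ASCII. The name ascii_downcase is hypothetical and this is not PostgreSQL's actual code. It only illustrates why such a loop cannot hit the Turkish i/I problem: it never calls tolower, so bytes outside 'A'..'Z', including every byte of a multibyte sequence, pass through unchanged. It does not attempt the Unicode case folding that the SQL spec asks for.

```c
#include <stddef.h>

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL code):
 * fold only ASCII 'A'..'Z' to lower case, in place.  The C locale is never
 * consulted, so the result is the same under any LC_CTYPE setting.
 */
static void
ascii_downcase(char *s, size_t len)
{
    for (size_t i = 0; i < len; i++)
    {
        unsigned char ch = (unsigned char) s[i];

        if (ch >= 'A' && ch <= 'Z')
            s[i] = (char) (ch + ('a' - 'A'));
        /* any other byte, including high-bit bytes, is left as it is */
    }
}
```

Called on the buffer "TITLE" with length 5, the loop yields "title" regardless of the process locale. The cost of that predictability is that non-ASCII upper-case letters are not folded at all.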