Re: Rough draft for Unicode-aware UPPER()/LOWER()/INITCAP()

From Marko Karppinen
Subject Re: Rough draft for Unicode-aware UPPER()/LOWER()/INITCAP()
Msg-id ADBE8746-A78D-11D8-9207-000A95C56374@karppinen.fi
In reply to Re: Rough draft for Unicode-aware UPPER()/LOWER()/INITCAP()  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
> Marko Karppinen wrote:
>> I think this interaction between the locale and server_encoding is
>> confusing. Is there any use case for running an incompatible mix?
>> If not, would it not make sense to fetch initdb's default database
>> encoding with nl_langinfo(CODESET) instead of using SQL_ASCII?

Peter Eisentraut wrote:
> This would be fine and dandy if we had any sort of idea about what sort
> of strings nl_langinfo(CODESET) returns and how to map them to our
> encoding names.

Karel Zak posted an answer to this last year, here on pgsql-hackers:
http://archives.postgresql.org/pgsql-hackers/2003-05/msg00744.php
It's not complete, but it's sort of an idea.

The code is under LGPL, but copyright doesn't extend to the
underlying facts, i.e. which encoding strings the various operating
systems actually report, so that information can be reused in a
table of our own. I'd imagine that his list covers many, if not
most, of the likely cases.
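
To make the idea concrete, here is a rough sketch of such a mapping,
written in Python for brevity rather than as initdb C code. The table
entries and the function names are my own illustrative assumptions,
not taken from Karel's code, and the table is nowhere near complete.
The only point is that normalizing the string before the lookup
absorbs most of the spelling differences between platforms.

```python
import locale
import re

# Illustrative table only (an assumption, not an authoritative list):
# normalized codeset string -> PostgreSQL encoding name.
CODESET_TO_PG = {
    "UTF8": "UTF8",
    "ISO88591": "LATIN1",
    "ISO88592": "LATIN2",
    "ISO885915": "LATIN9",
    "EUCJP": "EUC_JP",
    "KOI8R": "KOI8R",
    "ANSIX341968": "SQL_ASCII",  # glibc's name for plain ASCII in the C locale
    "USASCII": "SQL_ASCII",
}


def normalize_codeset(name):
    """Uppercase and drop punctuation, so 'utf8', 'UTF-8' and 'utf_8' all match."""
    return re.sub(r"[^A-Z0-9]", "", name.upper())


def pg_encoding_for_codeset(name):
    """Return the mapped encoding name, or None if the codeset is not recognized."""
    return CODESET_TO_PG.get(normalize_codeset(name))


def default_encoding_from_environment():
    """Ask the C library which codeset the environment's locale uses (Unix only)."""
    locale.setlocale(locale.LC_CTYPE, "")
    return pg_encoding_for_codeset(locale.nl_langinfo(locale.CODESET))
```

Returning None for an unrecognized string, instead of guessing, leaves
the caller free to fall back to the current behaviour.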

The current situation of upper/lower/collating/etc just being
broken by default on many non-C locales is bad enough to warrant
bailing out during initdb when this situation is detected
(with a reasonably cautious heuristic).
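
By "reasonably cautious" I mean something along these lines. This is
a sketch only, in Python rather than initdb's C. The function name and
its three arguments are hypothetical. It assumes the locale's codeset
has already been translated to one of our encoding names, or to None
when we could not recognize it.

```python
def encoding_mismatch(locale_name, locale_encoding, db_encoding):
    """Return an error message if the combination looks broken, else None.

    locale_encoding is the encoding implied by the locale, already mapped
    to a PostgreSQL encoding name, or None if it could not be determined.
    """
    # The C/POSIX locale does plain byte-wise handling, which this
    # sketch assumes is acceptable with any database encoding.
    if locale_name in ("C", "POSIX"):
        return None
    # Cautious part: if we cannot tell what the locale expects, stay quiet
    # rather than refuse to run on a platform we do not understand.
    if locale_encoding is None:
        return None
    if locale_encoding == db_encoding:
        return None
    return (
        f"locale {locale_name} expects encoding {locale_encoding}, "
        f"but the database encoding would be {db_encoding}"
    )
```

With this shape, a UTF-8 locale combined with the SQL_ASCII default is
reported, while an unknown codeset string never blocks initdb.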

It used to be that you got what you deserved if you were stupid
enough to define a non-C, non-ASCII-based locale. You had only
yourself to blame for everything breaking. These days, however,
millions of systems get shipped and installed with UTF-8 locales
on by default, so it's not possible to portray this as a user error.

Requiring every one of these people to configure initdb's encoding
manually would be harsh, however, so I think that a heuristic
that'd work with most modern systems would strike an appropriate
balance of correctness and path-of-least-surprise.

mk