Re: Database object names and libpq in UTF-8 locale on Windows

From: Sebastien FLAESCH
Subject: Re: Database object names and libpq in UTF-8 locale on Windows
Msg-id: 50ADFDEB.4050103@4js.com
In reply to: Re: Database object names and libpq in UTF-8 locale on Windows  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

Tom, Andrew,

We have the same issue in our product: supporting UTF-8 on Windows.

As you certainly know, the UTF-8 code page (65001) is not supported by MS Windows
when you set the locale with setlocale(). You cannot rely on standard libc
functions such as isalpha(), mbtowc(), mbstowcs(), wctomb(), wcstombs(), or
strcoll(), which depend on the current locale.

You should start by centralizing all basic character-set-related functions
(upper/lower, comparison, etc.) in one library, to ease the port to Windows.

Then convert the UTF-8 data to wide characters and call the wide-character
functions.

For example, to implement an uppercase() function:

1) Convert UTF-8 to Wide Char (algorithm can be easily found)
2) Use towupper()
3) Convert Wide Char result to UTF-8 (algorithm can be easily found)
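
A minimal sketch of these three steps in C, not taken from any existing code
base. The names utf8_decode(), utf8_encode() and utf8_uppercase() are
hypothetical. It assumes that wchar_t values are Unicode code points (true for
the BMP on Windows and for all code points on glibc), and it works one character
at a time instead of building a whole wide string. Code points that do not fit
in a single wchar_t are copied through unchanged. Note that what towupper() does
with non-ASCII characters still depends on LC_CTYPE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Decode one UTF-8 sequence from s (n bytes available) into *cp.
 * Returns the number of bytes consumed, or 0 on malformed input. */
static size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    uint32_t c, min;
    size_t len, i;

    if (n == 0) return 0;
    c = s[0];
    if (c < 0x80) { *cp = c; return 1; }
    else if ((c & 0xE0) == 0xC0) { len = 2; c &= 0x1F; min = 0x80; }
    else if ((c & 0xF0) == 0xE0) { len = 3; c &= 0x0F; min = 0x800; }
    else if ((c & 0xF8) == 0xF0) { len = 4; c &= 0x07; min = 0x10000; }
    else return 0;
    if (n < len) return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and out-of-range values. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;
    *cp = c;
    return len;
}

/* Encode cp as UTF-8 into out (at least 4 bytes); returns the length. */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) { out[0] = (unsigned char) cp; return 1; }
    if (cp < 0x800) {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char) (0xF0 | (cp >> 18));
    out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char) (0x80 | (cp & 0x3F));
    return 4;
}

/* Uppercase the UTF-8 string src into dst (dstsize bytes, NUL included).
 * Returns the number of bytes written without the NUL, or -1 if the input
 * is malformed or dst is too small. */
long utf8_uppercase(const char *src, char *dst, size_t dstsize)
{
    const unsigned char *s = (const unsigned char *) src;
    size_t n, in = 0, out = 0;

    assert(src != NULL && dst != NULL);
    if (dstsize == 0) return -1;
    n = strlen(src);
    while (in < n) {
        uint32_t cp;
        unsigned char buf[4];
        size_t len, i;
        size_t used = utf8_decode(s + in, n - in, &cp);   /* step 1 */

        if (used == 0) return -1;
        if (cp <= (uint32_t) WCHAR_MAX) {                  /* step 2 */
            uint32_t up = (uint32_t) towupper((wint_t) cp);
            if (up <= 0x10FFFF) cp = up;
        }
        len = utf8_encode(cp, buf);                        /* step 3 */
        if (out + len >= dstsize) return -1;               /* keep room for NUL */
        for (i = 0; i < len; i++) dst[out++] = (char) buf[i];
        in += used;
    }
    dst[out] = '\0';
    return (long) out;
}
```

The uppercase form of a character can need a different number of UTF-8 bytes
than the original, which is why the sketch takes an explicit output size rather
than converting in place.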

To compare strings:

1) Convert s1 in UTF-8 to Wide Char => wcs1
2) Convert s2 in UTF-8 to Wide Char => wcs2
3) Use wcscoll(wcs1, wcs2)
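
The comparison steps could look roughly like the C sketch below. The helper
names utf8_to_wcs() and utf8_collate() are hypothetical, and the hand-written
decoder is only one possible way to do the conversion. It assumes wchar_t holds
either UTF-16 units (as on Windows, where surrogate pairs are emitted) or full
code points. The ordering that wcscoll() produces still follows the current
LC_COLLATE setting; in the default "C" locale it behaves like a plain code
comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Convert a NUL-terminated UTF-8 string to a newly allocated wide string.
 * Returns NULL on malformed input or allocation failure; caller frees. */
static wchar_t *utf8_to_wcs(const char *src)
{
    const unsigned char *s = (const unsigned char *) src;
    size_t n = strlen(src), i = 0, o = 0;
    /* A sequence of k bytes yields at most k wide units, so n + 1 is enough. */
    wchar_t *w = malloc((n + 1) * sizeof *w);

    if (w == NULL) return NULL;
    while (i < n) {
        uint32_t c = s[i], min;
        size_t len, k;

        if (c < 0x80) { len = 1; min = 0; }
        else if ((c & 0xE0) == 0xC0) { len = 2; c &= 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { len = 3; c &= 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { len = 4; c &= 0x07; min = 0x10000; }
        else goto fail;
        if (n - i < len) goto fail;                 /* truncated sequence */
        for (k = 1; k < len; k++) {
            if ((s[i + k] & 0xC0) != 0x80) goto fail;
            c = (c << 6) | (s[i + k] & 0x3F);
        }
        /* Reject overlong forms, surrogates and out-of-range values. */
        if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) goto fail;
        if (c > 0xFFFF && WCHAR_MAX <= 0xFFFF) {
            /* 16-bit wchar_t: store a UTF-16 surrogate pair. */
            c -= 0x10000;
            w[o++] = (wchar_t) (0xD800 + (c >> 10));
            w[o++] = (wchar_t) (0xDC00 + (c & 0x3FF));
        } else {
            w[o++] = (wchar_t) c;
        }
        i += len;
    }
    w[o] = L'\0';
    return w;
fail:
    free(w);
    return NULL;
}

/* Compare two UTF-8 strings with wcscoll(). On success returns 0 and stores
 * the comparison result in *result; returns -1 if a conversion failed. */
int utf8_collate(const char *s1, const char *s2, int *result)
{
    wchar_t *wcs1, *wcs2;
    int rc = -1;

    assert(s1 != NULL && s2 != NULL && result != NULL);
    wcs1 = utf8_to_wcs(s1);                         /* step 1 */
    wcs2 = utf8_to_wcs(s2);                         /* step 2 */
    if (wcs1 != NULL && wcs2 != NULL) {
        *result = wcscoll(wcs1, wcs2);              /* step 3 */
        rc = 0;
    }
    free(wcs1);                                     /* free(NULL) is a no-op */
    free(wcs2);
    return rc;
}
```

The comparison result is returned through a pointer so that a conversion error
cannot be confused with a legitimate negative result from wcscoll().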

Regards,
Seb

On 11/21/2012 06:07 PM, Tom Lane wrote:
> Andrew Dunstan<andrew@dunslane.net>  writes:
>> On 11/21/2012 11:11 AM, Tom Lane wrote:
>>> I'm not sure that's the only place we're doing this ...
>
>> Oh, Hmm, darn. Where else do you think we might?
>
> Dunno, but grepping for isupper and/or tolower should find any such
> places.
>
>             regards, tom lane
>



