Character Conversions Handling
From | Volkan YAZICI |
---|---|
Subject | Character Conversions Handling |
Date | |
Msg-id | 7104a7370510181229p72d4d340j348d8e5f04a3ea34@mail.gmail.com |
Responses |
Re: Character Conversions Handling
|
List | pgsql-hackers |
Hi,

I'm trying to understand the scheme behind backend/utils/adt/like.c for downcasing letters [1]. Looking at the other tolower() implementations, I see lots of them spread around the tree (in interfaces/libpq, backend/regex, backend/utils/adt/like, etc.). For example, although regc_locale.c already has a pg_wc_tolower() function, iwchareq() in like.c achieves the same thing manually.

I'd appreciate it if somebody could point me to the places where I should start looking to understand character handling with different encodings. Also, I wonder why we don't use any of the btowc/mbsrtowcs/wctomb-like functions. Is this for portability with other compilers?

[1] iwchareq() uses pg_mb2wchar_with_len(), which picks the right mb2wchar function from pg_wchar_table. Looking at backend/mb/wchar.c, I see some other locale-specific mblen and mb2wchar routines. For example, EUC_KR is handled by pg_euc2wchar_with_len(), but LATIN5 is handled by pg_latin12wchar_with_len(). Will we write a new function for LATIN5, such as pg_latin52wchar_with_len(), if we run into a new problem with LATIN5?

Regards.
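To make the dispatch described in [1] concrete, here is a minimal sketch of a table of per-encoding conversion functions. Everything prefixed with demo_ is hypothetical and is not the actual PostgreSQL code; the EUC routine is deliberately simplified and assumes every byte with the high bit set starts a two-byte character. It only illustrates why two single-byte encodings such as LATIN1 and LATIN5 can share one converter while a multibyte encoding needs its own.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's definitions. */
typedef unsigned int demo_wchar;

typedef int (*demo_mb2wchar_fn)(const unsigned char *from,
                                demo_wchar *to, int len);

typedef enum
{
    DEMO_EUC_KR,
    DEMO_LATIN1,
    DEMO_LATIN5,
    DEMO_NUM_ENCODINGS
} demo_encoding;

/* Single-byte encodings: each input byte becomes one wide character. */
static int
demo_single_byte_to_wchar(const unsigned char *from, demo_wchar *to, int len)
{
    int cnt = 0;

    while (len > 0 && *from)
    {
        *to++ = *from++;
        len--;
        cnt++;
    }
    *to = 0;
    return cnt;
}

/*
 * Simplified two-byte EUC (assumption: a byte with the high bit set
 * always starts a two-byte character; real EUC has more cases).
 */
static int
demo_euc_to_wchar(const unsigned char *from, demo_wchar *to, int len)
{
    int cnt = 0;

    while (len > 0 && *from)
    {
        if ((*from & 0x80) && len >= 2)
        {
            *to = (demo_wchar) (*from++) << 8;
            *to |= *from++;
            len -= 2;
        }
        else
        {
            *to = *from++;
            len--;
        }
        to++;
        cnt++;
    }
    *to = 0;
    return cnt;
}

/* One entry per encoding; both Latin encodings reuse the same routine. */
static const demo_mb2wchar_fn demo_wchar_table[DEMO_NUM_ENCODINGS] = {
    [DEMO_EUC_KR] = demo_euc_to_wchar,
    [DEMO_LATIN1] = demo_single_byte_to_wchar,
    [DEMO_LATIN5] = demo_single_byte_to_wchar,
};

/* Look up the converter for the encoding and call it. */
static int
demo_mb2wchar_with_len(demo_encoding enc, const unsigned char *from,
                       demo_wchar *to, int len)
{
    assert(enc < DEMO_NUM_ENCODINGS);
    return demo_wchar_table[enc](from, to, len);
}
```

In this sketch the conversion step only splits the byte string into characters, which is the same work for any single-byte encoding. Whether case folding for a particular Latin encoding would need something more specific is a separate question that the sketch does not answer.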