Re: Latest on CITEXT 2.0
From | Bruce Momjian |
---|---|
Subject | Re: Latest on CITEXT 2.0 |
Date | |
Msg-id | 200807011525.m61FP7221773@momjian.us |
In reply to | Re: Latest on CITEXT 2.0 ("Marko Kreen" <markokr@gmail.com>) |
Responses | Re: Latest on CITEXT 2.0 |
List | pgsql-hackers |
Marko Kreen wrote:
> On 7/1/08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > "Marko Kreen" <markokr@gmail.com> writes:
> > > On 6/26/08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > >> BTW, I don't think you can use that same-length optimization for
> > >> citext. There's no reason to think that upper/lowercase pairs will
> > >> have the same length all the time in multibyte encodings.
> >
> > > What about this code in current str_tolower():
> >
> > > /* Output workspace cannot have more codes than input bytes */
> > > workspace = (wchar_t *) palloc((nbytes + 1) * sizeof(wchar_t));
> >
> >
> > That's working with wchars, not bytes.
>
> Ah, I missed the point of char2wchar() line.
>
> I'm rather unfamiliar with various MB API-s, sorry.
>
> There's another thing I'm probably missing: does current code handle
> multi-wchar codepoints? Or is it guaranteed they don't happen?
> (Wasn't wchar_t usually 16bit value?)

If you want a simple example of wide character use look at
oracle_compat.c::upper() which calls str_toupper() in CVS HEAD.

--
  Bruce Momjian <bruce@momjian.us> http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
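
For readers without the tree at hand, here is a rough standalone sketch of the wide-character pattern discussed above. It is not the PostgreSQL code: upper_via_wchar() is a hypothetical name, and the standard C functions mbstowcs(), towupper() and wcstombs() stand in for char2wchar() and friends, with malloc() in place of palloc(). It assumes the current LC_CTYPE locale describes the input encoding. The point it tries to show is that the wide-char workspace can be sized from the input byte count, while the output buffer has to be sized from the wide-char count, because case pairs need not have the same byte length.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper, not PostgreSQL code: upper-case a multibyte string
 * using the current LC_CTYPE locale.  Returns a malloc'd string that the
 * caller must free, or NULL on invalid input or allocation failure.
 */
char *
upper_via_wchar(const char *src)
{
    size_t   nbytes = strlen(src);
    size_t   nwide, outmax, nout, i;
    wchar_t *workspace;
    char    *result;

    /* Every character occupies at least one byte, so the workspace
     * cannot need more wide chars than there are input bytes. */
    workspace = malloc((nbytes + 1) * sizeof(wchar_t));
    if (workspace == NULL)
        return NULL;

    /* Convert bytes to wide chars; (size_t) -1 means invalid input. */
    nwide = mbstowcs(workspace, src, nbytes + 1);
    if (nwide == (size_t) -1)
    {
        free(workspace);
        return NULL;
    }
    assert(nwide <= nbytes);

    /* Case-map one wide char at a time. */
    for (i = 0; i < nwide; i++)
        workspace[i] = (wchar_t) towupper((wint_t) workspace[i]);

    /* Size the output from the wide-char count, not from nbytes:
     * the upper-case form may take more bytes than the original. */
    outmax = nwide * MB_CUR_MAX + 1;
    result = malloc(outmax);
    if (result == NULL)
    {
        free(workspace);
        return NULL;
    }

    nout = wcstombs(result, workspace, outmax);
    free(workspace);
    if (nout == (size_t) -1)
    {
        free(result);
        return NULL;
    }
    result[nout] = '\0';
    return result;
}
```

In the default "C" locale, upper_via_wchar("citext") yields "CITEXT"; for multibyte input the program would first need setlocale(LC_CTYPE, "") with a suitable locale. Like the code under discussion, this maps one wchar_t at a time, so it does not answer the question about code points that need more than one wchar_t on platforms where wchar_t is 16 bits.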