Re: LIKE optimization in UTF-8 and locale-C
From | Dennis Bjorklund |
---|---|
Subject | Re: LIKE optimization in UTF-8 and locale-C |
Date | |
Msg-id | 460362E6.2040208@zigo.dhs.org |
In response to | Re: LIKE optimization in UTF-8 and locale-C (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>) |
List | pgsql-hackers |
ITAGAKI Takahiro wrote:
>> I guess it works well for % but not for _ , the latter has to know, how
>> many bytes the current (multibyte) character covers.
>
> Yes, % is not used in trailing bytes for all encodings, but _ is
> used in some of them. I think we can use the optimization for all
> of the server encodings except JOHAB.

The problem with the LIKE pattern _ is that it has to know how long the single character is that it should pass over.

Say you have a UTF-8 string with 2 characters encoded in 3 bytes ('ÖA'), where the first character is 2 bytes:

  0xC3 0x96 'A'

and now you want to match that with the LIKE pattern:

  '_A'

How would that work in the C locale?

Maybe one should simply write a special version of LIKE for the UTF-8 encoding, since it's probably the most used encoding today. But I don't think you can use the C locale and expect it to work for UTF-8.

/Dennis
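To make the 'ÖA' case concrete, here is a minimal sketch in Python. It is not PostgreSQL's matcher: like_match is a hypothetical, naive helper written only for this illustration. It walks the input one unit at a time, so passing bytes mimics a byte-wise (C-locale style) match, and passing str mimics a character-aware match.

```python
def like_match(text, pattern, any_one, any_run):
    """Naive LIKE matcher over a sequence of units.

    Hypothetical helper for illustration only. `text` and `pattern` are
    both bytes (units are bytes) or both str (units are code points).
    `any_one` is the single-unit wildcard, `any_run` the zero-or-more one.
    """
    if len(pattern) == 0:
        return len(text) == 0
    head, rest = pattern[:1], pattern[1:]
    if head == any_run:
        # Let the run wildcard absorb 0..len(text) units.
        return any(
            like_match(text[i:], rest, any_one, any_run)
            for i in range(len(text) + 1)
        )
    if len(text) == 0:
        return False
    if head == any_one or head == text[:1]:
        # Consume exactly one unit: one byte for bytes, one code point for str.
        return like_match(text[1:], rest, any_one, any_run)
    return False


text = "\u00d6A"            # 'ÖA': 2 characters
raw = text.encode("utf-8")  # 3 bytes: 0xC3 0x96 0x41

char_result = like_match(text, "_A", "_", "%")    # '_' skips one character
byte_result = like_match(raw, b"_A", b"_", b"%")  # '_' skips one byte only
```

In this sketch the character-aware call matches, while the byte-wise call fails: '_' consumes only 0xC3, and the literal 'A' is then compared against the trailing byte 0x96. The byte-wise pattern '%A' still matches, which is consistent with the point quoted above that % is safe to handle byte by byte while _ is not.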