Re: LIKE optimization in UTF-8 and locale-C
From | ITAGAKI Takahiro |
---|---|
Subject | Re: LIKE optimization in UTF-8 and locale-C |
Date | |
Msg-id | 20070323142444.6368.ITAGAKI.TAKAHIRO@oss.ntt.co.jp |
In reply to | Re: LIKE optimization in UTF-8 and locale-C (ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>) |
List | pgsql-hackers |
Dennis Bjorklund <db@zigo.dhs.org> wrote:

> The problem with the like pattern _ is that it has to know how long the
> single character is that it should pass over. Say you have a UTF-8 string
> with 2 characters encoded in 3 bytes ('ÖA'). Where the first character
> is 2 bytes:
>
> 0xC3 0x96 'A'
>
> and now you want to match that with the LIKE pattern:
>
> '_A'

Thanks, it all made sense to me. My proposal was completely wrong.
The optimization of MBMatchText() seems to be the right way...

> Maybe one should simply write a special version of LIKE for the UTF-8
> encoding since it's probably the most used encoding today. But I don't
> think you can use the C locale and that it would work for UTF-8.

But then, present LIKE matching is not locale aware. We treat multi-byte
characters properly, but always perform a char-by-char comparison.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
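
The '_A' example quoted above can be reproduced with a small sketch. This is not PostgreSQL's MBMatchText(); the names like_match and utf8_char_len are hypothetical, escapes are not handled, and the input is assumed to be valid UTF-8. It only illustrates why '_' has to step over a whole character rather than a single byte.

```python
def utf8_char_len(lead: int) -> int:
    """Length in bytes of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    return 1  # stray continuation or invalid byte: step a single byte


def like_match(text: bytes, pattern: bytes, multibyte: bool) -> bool:
    """Match text against a LIKE pattern using only '_' and '%' (no escapes).

    multibyte=False steps one byte at a time; multibyte=True steps one
    UTF-8 character at a time (hypothetical flag, for comparison only).
    """
    def step(t: int) -> int:
        return utf8_char_len(text[t]) if multibyte else 1

    def match(t: int, p: int) -> bool:
        while p < len(pattern):
            c = pattern[p]
            if c == ord("%"):
                # Try the rest of the pattern at every remaining position.
                while True:
                    if match(t, p + 1):
                        return True
                    if t >= len(text):
                        return False
                    t += step(t)
            if t >= len(text):
                return False
            if c == ord("_"):
                t += step(t)       # skip one byte or one whole character
            elif text[t] == c:
                t += 1             # literal bytes are compared one by one
            else:
                return False
            p += 1
        return t == len(text)

    return match(0, 0)


text = "ÖA".encode("utf-8")                       # b'\xc3\x96A', 3 bytes
print(like_match(text, b"_A", multibyte=False))   # False: '_' ate only 0xC3
print(like_match(text, b"_A", multibyte=True))    # True: '_' skipped both bytes of 'Ö'
```

With byte-wise stepping, the pattern '__A' would match 'ÖA' instead, which is the wrong answer for a two-character string.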