Re: UTF8MatchText
From | Andrew Dunstan |
---|---|
Subject | Re: UTF8MatchText |
Date | |
Msg-id | 464CA0C2.4010700@dunslane.net |
In reply to | Re: UTF8MatchText (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: UTF8MatchText |
List | pgsql-patches |
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>
>> Tom Lane wrote:
>>
>>> Wait a second ... I just thought of a counterexample that destroys the
>>> entire concept. Consider the pattern 'A__B', which clearly is supposed
>>> to match strings of four *characters*. With the proposed patch in
>>> place, it would match strings of four *bytes*. Which is not the correct
>>> behavior.
>
>> From what I can see the code is quite careful about when it calls
>> NextByte vs NextChar, and after _ it calls NextChar.
>
> Except that the entire point of this patch is to dumb down NextChar to
> be the same as NextByte for UTF8 strings.

That's not what I see in (what I think is) the latest submission, which
includes this snippet:

+ /* Set up for utf8 characters */
+ #define CHAREQ(p1, p2) wchareq(p1, p2)
+ #define NextChar(p, plen) \
+     do { int __l = pg_utf_mblen(p); (p) +=__l; (plen) -=__l; } while (0)
+
+ /*
+  * UTF8MatchText -- specialized version of MBMatchText for UTF8
+  */
+ static int
+ UTF8MatchText(char *t, int tlen, char *p, int plen)

Am I looking at the wrong thing? This is from around April 9th I think.

cheers

andrew
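To make the byte-versus-character distinction concrete, here is a minimal sketch of how the two ways of advancing differ on the pattern 'A__B'. It is an illustration only, not code from the patch: utf8_char_len is a hypothetical stand-in for pg_utf_mblen, match_underscore_only is a toy matcher that handles only literal bytes and '_' (no '%', no escapes), and the input is assumed to be valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for pg_utf_mblen: byte length of the UTF-8
 * character starting at s, judged from the lead byte only.
 */
static int
utf8_char_len(const unsigned char *s)
{
    assert(s != NULL);
    if ((*s & 0x80) == 0x00) return 1;   /* 0xxxxxxx: ASCII */
    if ((*s & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
    if ((*s & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
    if ((*s & 0xF8) == 0xF0) return 4;   /* 11110xxx */
    return 1;                            /* invalid lead byte: step one byte */
}

/* Advance by exactly one byte. */
#define NextByte(p, plen) ((p)++, (plen)--)

/* Advance by one whole UTF-8 character, in the style of the quoted snippet. */
#define NextChar(p, plen) \
    do { int l_ = utf8_char_len((const unsigned char *) (p)); \
         (p) += l_; (plen) -= l_; } while (0)

/*
 * Toy matcher (illustrative only): literal bytes and '_' wildcards.
 * by_char != 0: '_' consumes one character (NextChar).
 * by_char == 0: '_' consumes one byte (NextByte).
 * Returns 1 on a full match, 0 otherwise.
 */
static int
match_underscore_only(const char *t, const char *p, int by_char)
{
    int tlen = (int) strlen(t);
    int plen = (int) strlen(p);

    while (tlen > 0 && plen > 0)
    {
        if (*p == '_')
        {
            if (by_char)
            {
                NextChar(t, tlen);
            }
            else
            {
                NextByte(t, tlen);
            }
        }
        else if (*p == *t)
        {
            NextByte(t, tlen);           /* literal bytes compare one at a time */
        }
        else
        {
            return 0;
        }
        NextByte(p, plen);               /* pattern here is pure ASCII */
    }
    return tlen == 0 && plen == 0;
}
```

With the text "A", "é", "é", "B" (six bytes, four characters), the sketch matches 'A__B' only when '_' advances with NextChar. With "A", "é", "B" (four bytes, three characters), it matches only when '_' advances with NextByte, which is the wrong result Tom's counterexample describes.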