Re: BUG #15014: pg_trgm regexp with wchar not good?
From | Tom Lane |
---|---|
Subject | Re: BUG #15014: pg_trgm regexp with wchar not good? |
Date | |
Msg-id | 18067.1516288553@sss.pgh.pa.us |
In reply to | BUG #15014: pg_trgm regexp with wchar not good? (PG Bug reporting form <noreply@postgresql.org>) |
List | pgsql-bugs |
PG Bug reporting form <noreply@postgresql.org> writes:
> when i use pg_trgm's gin index, with wchar search, it's not good for regexp,
> but good for like express.

pg_trgm is going to ignore characters that it doesn't think are letters or digits. Don't know if the characters you are working with are considered letters in en_US locale, but if they aren't, that would likely result in no usable trigrams in this string.

Another issue is that "trigrams" are three *bytes* not three characters, so the useful information per trigram is a lot lower when working with many-byte characters; that could also lead to an index search being much less selective than you'd hope.

You might learn something by looking at the result of show_trgm() for these strings, but I'm thinking there's no bug here, just design limitations of the trigram approach.

regards, tom lane
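
As a rough illustration of the two points in the message above, here is a minimal Python sketch. It is not pg_trgm's implementation. The helper names `char_trigrams` and `utf8_sizes` are hypothetical, and the letter/digit test uses Python's Unicode-aware `str.isalnum()`, whereas pg_trgm's test depends on the database locale and may classify the same characters differently. The sketch assumes only the padding convention described in the pg_trgm documentation (two spaces prefixed and one suffixed to each word) and does not model how pg_trgm stores multi-byte trigrams internally.

```python
def char_trigrams(text):
    """Return the set of character trigrams for the alphanumeric words in text.

    Assumption: str.isalnum() stands in for pg_trgm's locale-dependent
    letter/digit test, so results may differ from show_trgm().
    """
    words, current = [], []
    for ch in text.lower():
        if ch.isalnum():
            current.append(ch)
        elif current:
            # A non-alphanumeric character ends the current word.
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))

    trigrams = set()
    for word in words:
        # Pad each word with two leading spaces and one trailing space.
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            trigrams.add(padded[i:i + 3])
    return trigrams


def utf8_sizes(trigrams):
    """Map each trigram to the number of bytes it occupies in UTF-8."""
    return {t: len(t.encode("utf-8")) for t in trigrams}


if __name__ == "__main__":
    for sample in ("cat", "中文字", "?? --"):
        sizes = utf8_sizes(char_trigrams(sample))
        print(repr(sample), sorted(sizes.items()))
```

A string with no characters that pass the letter/digit test yields an empty set, which corresponds to the "no usable trigrams" case. Three ASCII characters occupy 3 bytes in UTF-8, while three CJK characters such as `中文字` occupy 9. Running `SELECT show_trgm(...)` on the actual strings in the target database remains the authoritative check.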