Re: BUG #15347: Unaccent for greek characters does not work
From | Thomas Munro |
---|---|
Subject | Re: BUG #15347: Unaccent for greek characters does not work |
Date | |
Msg-id | CAEepm=0F3pv9A3_pe=jQMCS9b-iUPjEQjzoftNJjN8FHwXHeKA@mail.gmail.com |
In reply to | Re: BUG #15347: Unaccent for greek characters does not work (Tasos Maschalidis <tas.o.s@hotmail.com>) |
List | pgsql-bugs |
On Fri, Aug 24, 2018 at 10:47 AM, Tasos Maschalidis <tas.o.s@hotmail.com> wrote:
> The results are legit for all vowels.

Cool.

> There is only one thing missing which
> I guess does fall into unaccent functionality. When an "σ" is used as the
> last letter of any word, it changes to "ς" grammatically, unless the whole
> word is capitals, then it stays the same ("Σ"), even at the end of the word.
> In searches it's useful to convert any "ς" to "σ". I had included it in a
> custom unaccent.rules file I was using and it brought the desired results. For
> example searching for "Θωμάς" would not match "ΘΩΜΑΣ", unless such a
> conversion exists. Not sure if that should be taken care of somewhere else,
> but in my case (and also in the gist I sent you, check the last comments) it
> proved useful and made sense.

Hmm, I see. Also described here:

https://en.wikipedia.org/wiki/Sigma

I take it you are making searches case insensitive by converting
everything to lower case. Since you have a distinction that exists in
lower case but not in upper case, wouldn't it make more sense to
convert everything to upper case?

postgres=# select upper('Θωμάς'), upper('Θωμάσ'), upper('Θωμάσ') = upper('Θωμάς');
 upper | upper | ?column?
-------+-------+----------
 ΘΩΜΆΣ | ΘΩΜΆΣ | t
(1 row)

PS On PostgreSQL mailing lists, we try to avoid "top posting" (=
leaving the message we're replying to below our reply), because it
makes the archive of email threads harder to read.

--
Thomas Munro
http://www.enterprisedb.com
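
The upper-casing argument can also be checked outside the database. The following is a minimal Python sketch, not PostgreSQL code: `strip_accents` and `search_key` are hypothetical helper names, and the accent removal is only a rough stand-in for the unaccent extension (it drops Unicode combining marks instead of consulting unaccent.rules).

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Rough stand-in for unaccent: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_key(text: str) -> str:
    """Hypothetical search key: upper-case first, then remove accents.

    Upper-casing maps both sigma forms (U+03C3 and U+03C2) to the single
    capital sigma (U+03A3), so no extra rule for final sigma is needed.
    """
    return strip_accents(text.upper())


final_sigma = "Θωμάς"   # ends in final sigma
medial_sigma = "Θωμάσ"  # ends in ordinary lower-case sigma
all_caps = "ΘΩΜΑΣ"      # no accent, capital sigma

# Lower-casing leaves the two sigma forms distinct.
print(final_sigma.lower() == medial_sigma.lower())

# The upper-cased, unaccented keys are the same for all three spellings.
print(search_key(final_sigma) == search_key(medial_sigma) == search_key(all_caps))
```

Under these assumptions, the distinction between the two lower-case sigma forms never reaches the comparison, which is the same effect the custom "ς" to "σ" rule was meant to achieve.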