Re: BUG #13440: unaccent does not remove all diacritics
From | Peter Eisentraut |
---|---|
Subject | Re: BUG #13440: unaccent does not remove all diacritics |
Date | |
Msg-id | 5589642C.3000201@gmx.net |
In response to | Re: BUG #13440: unaccent does not remove all diacritics (Alvaro Herrera <alvherre@2ndquadrant.com>) |
List | pgsql-bugs |
On 6/18/15 5:17 PM, Alvaro Herrera wrote:
> To me, conceptually what unaccent does is turn whatever junk you have
> into a very basic common alphabet (ascii); then it's very easy to do
> full text searches without having to worry about what accents the people
> did or did not use in their searches. If we say "okay, but that funny
> char is not an accent so let's leave it alone" then the charter doesn't
> sound so useful to me.

I think unaccent is one of those contrib things that are useful but not really fully thought out and therefore won't ever become an official core feature. It is what it is, and we can tweak it slightly, but thinking too hard about what it "should" do won't lead anywhere.

If we wanted to do this "properly", we could do something like: perform Unicode canonical decomposition, then strip out all combining characters. I don't know how useful that is in practice, though. And it won't "solve" issues such as German ß, which probably doesn't have a one-size-fits-all solution.
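
For illustration only, here is a minimal Python sketch of the "decompose, then strip combining characters" idea. It is not how unaccent is implemented; the function name `strip_combining` is hypothetical, and treating every character in a Unicode "Mark" category as a combining character is an assumption of this sketch.

```python
import unicodedata


def strip_combining(text: str) -> str:
    """Canonically decompose text (NFD), then drop all combining marks.

    Hypothetical helper for illustration; "combining character" is taken
    here to mean any code point whose general category starts with "M".
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )


print(strip_combining("crème brûlée"))  # creme brulee
print(strip_combining("Straße"))        # Straße (ß has no decomposition)
```

Characters that have no canonical decomposition, such as ß or Ł, pass through unchanged, so any mapping for them would still need explicit rules.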