Re: BUG #18362: unaccent rules and Old Greek text
From | Laurenz Albe |
---|---|
Subject | Re: BUG #18362: unaccent rules and Old Greek text |
Date | |
Msg-id | 45150ad278a8bbb1b51ec02da991153bd2277e1f.camel@cybertec.at |
In reply to | Re: BUG #18362: unaccent rules and Old Greek text (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-bugs |
On Tue, 2024-05-14 at 10:51 -0400, Robert Haas wrote:
> The question of which mappings we actually ought to be adding seems
> a lot harder, because it's not altogether clear what it means to
> "remove an accent". The proposed patch adds a whole lot of rules that
> turn tiny little characters into full-sized characters, boldfaced
> and/or italicized and/or otherwise-fancily-printed characters into
> full-sized characters. Only a handful of the changes are actually
> adding rules that specifically *remove an accent*, but there are
> similar rules that already exist, like turning ⅐ into the
> four-character sequence " 1/7" and blocky-looking versions of each
> letter into standard versions and ㍱ into the three-character sequence
> "hPa". So my naive guess would be that we want all of these rules,
> even though you would not guess from the unaccent documentation that
> it's supposed to do stuff like this. But my knowledge of languages
> other than English is very limited, and I am not a user of unaccent
> and never have been, so I am reluctant to make grand pronouncements.
> Does anyone more knowledgeable want to opine?

I am not necessarily more knowledgeable, but I'll opine anyway.

As a German speaker, I wouldn't call the dieresis on "ü" an accent
like the French é, è or ê, even though the current implementation of
unaccent() turns it into an "u". And while most people would agree
that the caret on â is an accent in the French language, I am not sure
if it is the same in Vietnamese.

And I cannot see how ⅐ could be considered an accent...

Perhaps if we invent a function called convert_to_ascii() or so instead
of shoving that into unaccent(), it would make more sense.

Yours,
Laurenz Albe
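
Editor's illustration (not part of the original message): the distinction drawn above, between removing diacritics and folding "fancy" characters to plain ones, can be roughly sketched with Unicode normalization. This is only an analogy under stated assumptions. It is not how unaccent is implemented, the function names `strip_accents` and `fold_compat` are hypothetical, and the output for ⅐ differs from the " 1/7" mapping mentioned in the thread.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: remove combining marks only.

    Uses canonical decomposition (NFD), so characters that merely
    look different (fractions, squared units) are left untouched.
    """
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


def fold_compat(text: str) -> str:
    """Hypothetical helper: compatibility folding plus mark removal.

    Uses compatibility decomposition (NFKD), which also rewrites
    characters such as U+2150 and U+3371 into plainer sequences.
    Note: the result is not guaranteed to be pure ASCII.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for sample in ["é è ê", "ü", "â", "⅐", "㍱"]:
        print(repr(sample), repr(strip_accents(sample)), repr(fold_compat(sample)))
```

In this sketch only the second function touches ⅐ and ㍱, which mirrors the suggestion that such conversions belong in a separately named function. Whether "ü" should be handled by the first function at all is the language-dependent question raised in the message, and the code does not settle it.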