Re: BUG #13440: unaccent does not remove all diacritics
From | Curd Reinert |
---|---|
Subject | Re: BUG #13440: unaccent does not remove all diacritics |
Date | |
Msg-id | 55813376.2040204@gmx.de |
In response to | BUG #13440: unaccent does not remove all diacritics (mike@busbud.com) |
List | pgsql-bugs |
Tom Lane <tgl@sss.pgh.pa.us> wrote on 17.06.2015 00:01:48:

> Also, while my German is nearly nonexistent, I had the idea that sharp-S
> to "S" would be considered a case-folding transformation not an accent
> removal. Comments from German speakers welcome of course.

The sharp-s 'ß' is historically a ligature of two different kinds of s, of which the first one looks more like an f and the second one looks either like a normal 's' or a 'z' (that's why it is called 'szlig' in HTML). It is usually considered to be a lower-case-only character, even though an uppercase sharp-s has recently been defined.

If you are using an encoding that doesn't support 'ß', the rule is to substitute it with 'ss'. If you want to capitalize a word containing a 'ß', you substitute it with 'SS'. For sorting purposes, DIN 5007 says that 'ß' should be treated as 'ss'.

That's just the German point of view. Things can be a little bit different in other German-speaking countries, e.g. in Switzerland, where you may always substitute 'ß' with 'ss' (even if your encoding has an 'ß').

In short: I would think that replacing 'ß' with 's' is wrong, and certainly not an accent removal.

Best regards,
Curd
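
For illustration only, here is a minimal Python sketch of the distinction drawn above. It is not how PostgreSQL's unaccent module works. It assumes that "accent removal" means dropping combining marks after Unicode canonical decomposition, and the helper names `strip_accents` and `expand_sharp_s` are hypothetical. Under that assumption 'ß' has no accent to remove, while the 'ss' / 'SS' substitutions are a separate step.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_sharp_s(text: str) -> str:
    """Hypothetical helper: substitute sharp-s as described in the message."""
    # U+00DF is the lowercase sharp-s, U+1E9E the capital sharp-s.
    return text.replace("\u00df", "ss").replace("\u1e9e", "SS")


word = "Stra\u00dfe"  # the German word for "street", spelled with sharp-s

print(strip_accents("caf\u00e9"))  # the acute accent is removed
print(strip_accents(word))         # sharp-s does not decompose, so it stays
print(word.upper())                # Python's upper() maps sharp-s to "SS"
print(expand_sharp_s(word))        # explicit 'ss' substitution
```

Keeping the substitution in its own function makes it an explicit choice, for example for an encoding that lacks 'ß' or for Swiss spelling, rather than a side effect of accent stripping.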