Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8
From | Kyotaro Horiguchi |
---|---|
Subject | Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8 |
Date | |
Msg-id | 20201030.122851.538415294986124838.horikyota.ntt@gmail.com |
In reply to | Re: MINUS SIGN (U+2212) in EUC-JP encoding is mapped to FULLWIDTH HYPHEN-MINUS (U+FF0D) in UTF-8 (Amit Langote <amitlangote09@gmail.com>) |
List | pgsql-hackers |
At Fri, 30 Oct 2020 12:08:51 +0900, Amit Langote <amitlangote09@gmail.com> wrote in
> I noticed that the commit a8bd7e1c6e02 from ages ago removed
> conversions from and to utf-8's e28892, in favor of efbc8d, and that
> change has stuck. (Note though that these maps looked pretty
> different back then.)
>
> --- a/src/backend/utils/mb/Unicode/euc_jp_to_utf8.map
> +++ b/src/backend/utils/mb/Unicode/euc_jp_to_utf8.map
> - {0xa1dd, 0xe28892},
> + {0xa1dd, 0xefbc8d},
>
> --- a/src/backend/utils/mb/Unicode/utf8_to_euc_jp.map
> +++ b/src/backend/utils/mb/Unicode/utf8_to_euc_jp.map
> - {0xe28892, 0xa1dd},
> + {0xefbc8d, 0xa1dd},
>
> Can't tell what reason there was to do that, but there must have been
> some. Maybe the Japanese character sets prefer full-width hyphen
> minus (unicode U+FF0D) over mathematical minus sign (U+2212)?

It's a decision made by Microsoft. Several other characters have
similar issues. I remember many people complained, but in the end that
wasn't "fixed", and it led to the well-known mess of Japanese character
conversion involving Unicode in Java.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
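
For readers matching the hex values in the quoted map entries to the code points named in the subject line, here is a minimal Python sketch. It only decodes the two UTF-8 byte sequences from the diff and looks up their Unicode names; it does not call any EUC-JP codec, and the helper name `describe` is made up for this illustration.

```python
import unicodedata

# UTF-8 byte sequences, written in hex exactly as in the quoted map entries.
CANDIDATES = {
    "e28892": bytes.fromhex("e28892"),
    "efbc8d": bytes.fromhex("efbc8d"),
}


def describe(utf8_bytes):
    """Return (code point, Unicode name) for one UTF-8 encoded character.

    Hypothetical helper for illustration only; not part of PostgreSQL.
    """
    ch = utf8_bytes.decode("utf-8")
    return "U+%04X" % ord(ch), unicodedata.name(ch)


for hex_form, raw in CANDIDATES.items():
    code_point, name = describe(raw)
    print(hex_form, code_point, name)
```

This should print `e28892 U+2212 MINUS SIGN` and `efbc8d U+FF0D FULLWIDTH HYPHEN-MINUS`, i.e. the old and new targets for EUC-JP 0xa1dd in the quoted diff.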