Microsoft harmful extensions to 8859-X charsets (was: Continuing encoding fun....)
From | Marc Herbert |
---|---|
Subject | Microsoft harmful extensions to 8859-X charsets (was: Continuing encoding fun....) |
Date | |
Msg-id | 878xveyw4w.fsf@meije.emic.fr |
In reply to | Re: Continuing encoding fun.... ("Dave Page" <dpage@vale-housing.co.uk>) |
List | pgsql-odbc |
"Dave Page" <dpage@vale-housing.co.uk> writes:

>> By the way 0x8A is not in the range of latin4
>> <http://czyborra.com/charsets/iso8859.html#ISO-8859-4>
>
> http://www.gar.no/home/mats/8859-4.htm says differently, however, I
> can't claim to know enough about encoding issues to refute
> either. I've been forced to learn what I can about the subject to help
> maintain this driver and certainly may have got the wrong end of the
> stick on one or more points!

The page from gar.no is just a dump of the *Microsoft-extended* latin4 charset.

The standards committee carefully left a gap in all LATIN-X charsets between 0x80 and 0x9F, because bytes in that range become (harmful) control characters if their 8th bit is accidentally stripped: 0x8A, for instance, turns into 0x0A, a line feed. You can see the gap very clearly in this table, for instance:

<http://en.wikipedia.org/wiki/ISO_8859-4>

If you follow the links from gar.no itself, you can land here:

<http://en.wikipedia.org/wiki/ISO_8859>

with tons of links (to the ECMA standards, for instance) showing this gap.

Microsoft, being Microsoft, jumped into that gap. Those non-standard Microsoft characters now plague the web, as clearly explained here:

<http://home.earthlink.net/~bobbau/platforms/specialchars/#windows>

or here:

<http://www.cs.tut.fi/~jkorpela/www/windows-chars.html>
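
To make the gap concrete, here is a small Python sketch (not part of the original message). The helper name `strip_high_bit` is made up for illustration, and cp1252 is used only as a well-known example of a Microsoft code page that fills 0x80-0x9F; it is not necessarily the charset dumped on the gar.no page. The sketch relies on Python's `iso8859_4` codec, which maps the gap to the C1 control code points.

```python
import unicodedata


def strip_high_bit(byte: int) -> int:
    """Hypothetical helper: clear the 8th bit, as a 7-bit-only channel would."""
    return byte & 0x7F


# Every byte in the reserved gap 0x80..0x9F lands on an ASCII control
# character (0x00..0x1F) once its 8th bit is lost.
gap = range(0x80, 0xA0)
stripped = [strip_high_bit(b) for b in gap]
assert stripped == list(range(0x00, 0x20))
assert all(unicodedata.category(chr(b)) == "Cc" for b in stripped)

# 0x8A in particular becomes a line feed.
assert strip_high_bit(0x8A) == 0x0A

# Standard latin4 has no printable character at 0x8A: Python decodes it
# to the C1 control code point U+008A.
latin4_char = bytes([0x8A]).decode("iso8859_4")
assert unicodedata.category(latin4_char) == "Cc"

# A Microsoft code page (cp1252, chosen as an illustration) puts a
# printable letter at the same byte value.
cp1252_char = bytes([0x8A]).decode("cp1252")
assert cp1252_char == "\u0160"  # LATIN CAPITAL LETTER S WITH CARON
```

So the same byte is a control code under the standard charset and a letter under the Microsoft one, which is the kind of disagreement the two pages quoted above reflect.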