Re: Unicode is not UTF-8. was :psqlODBC-Driver Test / text
From | Marc Herbert |
---|---|
Subject | Re: Unicode is not UTF-8. was :psqlODBC-Driver Test / text |
Date | |
Msg-id | khjsloyigj5.fsf@meije.emic.fr |
In response to | Re: psqlODBC-Driver Test / text fields ("Dave Page" <dpage@vale-housing.co.uk>) |
List | pgsql-odbc |
Johann Zuschlag <zuschlag2@online.de> writes:

> Hmm..., so Windows XP uses UCS-2 or to be more correct (like Bart
> mentioned) UTF-16 (which is nearly the same, except for the
> surrogates).

It's nearly the same... but that makes a huge difference.

The reason to use a fixed-length character encoding in memory is speed. It saves you a lot of time when computing string lengths, looking for certain characters (isalnum(), ...), collating, etc. If you don't care about all this speed, then you had better stay with a variable-length encoding like UTF-8, which saves you A LOT of space, especially with small occidental alphabets.

I think that by moving from UCS-2 to UTF-16 you lose on BOTH sides [insert some missing benchmarks here].

And you can be sure that it brings a lot of bugs: one bug every time some string code has been "forgotten" and not updated, still assuming UCS-2. Anyway, those bugs only affect far-away and unknown countries outside the BMP, so who cares? :-/

So it really looks like a poor compatibility hack to me (Java does it too).
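To make the "forgotten string code" point concrete, here is a minimal Python sketch (not taken from psqlODBC or any driver). The helper names `ucs2_style_length` and `utf16_code_points` are hypothetical, chosen only for illustration. It compares a length computed the UCS-2 way, one character per 16-bit unit, with a surrogate-aware count, and prints the UTF-8 size for comparison.

```python
def ucs2_style_length(data: bytes) -> int:
    """Length as UCS-2-era code would compute it: one character per 16-bit unit."""
    return len(data) // 2


def utf16_code_points(data: bytes) -> int:
    """Surrogate-aware count over UTF-16-LE data.

    A low (trailing) surrogate is the second half of a pair, so it is
    skipped. This needs a full scan, which is the speed cost of
    giving up a fixed-length encoding.
    """
    count = 0
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], "little")
        if not 0xDC00 <= unit <= 0xDFFF:
            count += 1
    return count


bmp_text = "hello"             # every character is inside the BMP
astral_text = "a\U0001D11Eb"   # U+1D11E (musical G clef) is outside the BMP

for text in (bmp_text, astral_text):
    utf16 = text.encode("utf-16-le")
    utf8 = text.encode("utf-8")
    print(
        f"{text!r}: UTF-8 bytes={len(utf8)}, UTF-16 bytes={len(utf16)}, "
        f"UCS-2-style length={ucs2_style_length(utf16)}, "
        f"code points={utf16_code_points(utf16)}"
    )
```

For the BMP-only string both counts agree and UTF-16 takes twice the space of UTF-8. For the string containing U+1D11E the UCS-2-style length is one too large, which is exactly the kind of bug that only shows up outside the BMP.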