Re: Fixed length data types issue
From | Alvaro Herrera |
---|---|
Subject | Re: Fixed length data types issue |
Date | |
Msg-id | 20060908204209.GH5892@alvh.no-ip.org |
In response to | Re: Fixed length data types issue (mark@mark.mielke.cc) |
Responses | Re: Fixed length data types issue |
List | pgsql-hackers |
mark@mark.mielke.cc wrote:
> On Fri, Sep 08, 2006 at 02:39:03PM -0400, Alvaro Herrera wrote:
> > mark@mark.mielke.cc wrote:
> > > I think I've been involved in a discussion like this in the past. Was
> > > it mentioned in this list before? Yes the UTF-8 vs UTF-16 encoding
> > > means that UTF-8 applications are at a disadvantage when using the
> > > library. UTF-16 is considered more efficient to work with for everybody
> > > except ASCII users. :-)
> > Uh, is it? By whom? And why?
>
> The authors of the library in question? Java? Anybody whose primary
> alphabet isn't LATIN1 based? :-)

Well, for Latin-9 alphabets, Latin-9 is still more space-efficient than
UTF-8. That covers a lot of the world. Forcing those people to change
to UTF-16 does not strike me as a very good idea.

But Martijn already clarified that ICU does not actually force you to
switch everything to UTF-16, so this is not an issue anyway.

> Only ASCII values store more space efficiently in UTF-8. All values
> over 127 store more space efficiently using UTF-16. UTF-16 is easier
> to process. UTF-8 requires too many bit checks with single character
> offsets. I'm not an expert - I had this question before a year or two
> ago, and read up on the ideas of experts.

Well, I was not asking about "UTF-8 vs UTF-16," but rather "anything vs.
UTF-16". I don't much like UTF-8 myself, but that's not a very informed
opinion, just like a feeling of "fly-killing-cannon" (when it's used to
store Latin-9-fitting text).

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
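
For anyone who wants to check the space claims in this thread for themselves, here is a rough sketch in Python that counts encoded bytes for a few strings. The sample strings and the helper name `encoded_sizes` are made up for illustration and are not from the thread; the codec names are the standard Python ones, with `utf_16_le` chosen so that no byte-order mark is counted.

```python
# Compare how many bytes the same text needs in Latin-9, UTF-8 and UTF-16.
# Sample strings and the helper name are illustrative assumptions.

CODECS = ("iso8859_15", "utf_8", "utf_16_le")  # Latin-9, UTF-8, UTF-16 without BOM


def encoded_sizes(text):
    """Return {codec: byte length}, or None where the codec cannot represent the text."""
    sizes = {}
    for codec in CODECS:
        try:
            sizes[codec] = len(text.encode(codec))
        except UnicodeEncodeError:
            # e.g. Greek or CJK text does not fit in Latin-9 at all
            sizes[codec] = None
    return sizes


samples = {
    "ASCII": "hello",
    "Latin-9 text": "señal €5",
    "Greek": "αβγ",
    "CJK": "日本語",
}

for label, text in samples.items():
    print(label, encoded_sizes(text))
```

Text that fits in Latin-9 takes one byte per character there, which is never more than UTF-8 and is half of UTF-16. Note also that code points from 128 through 2047 (Greek, for instance) take two bytes in both UTF-8 and UTF-16, so UTF-16 only starts to win on space from U+0800 upward.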