Re: Fixed length data types issue
From | mark@mark.mielke.cc |
---|---|
Subject | Re: Fixed length data types issue |
Date | |
Msg-id | 20060908220730.GA19800@mark.mielke.cc |
In reply to | Re: Fixed length data types issue (Andrew Dunstan <andrew@dunslane.net>) |
List | pgsql-hackers |
On Fri, Sep 08, 2006 at 04:49:22PM -0400, Andrew Dunstan wrote:
> mark@mark.mielke.cc wrote:
> >Only ASCII values store more space efficiently in UTF-8. All values
> >over 127 store more space efficiently using UTF-16.
>
> This second statement is demonstrably not true. Only values above 0x07ff
> require more than 2 bytes in UTF-8. All chars up to that point are
> stored in UTF-8 with greater or equal efficiency than that of UTF-16.
> See http://www.zvon.org/tmRFC/RFC2279/Output/chapter2.html

You are correct - I should have said "All values over 127 store
at least as space efficiently using UTF-16 as UTF-8."

From the ICU page: "Most of the time, the memory throughput of the
hard drive and RAM is the main performance constraint. UTF-8 is 50%
smaller than UTF-16 for US-ASCII, but UTF-8 is 50% larger than UTF-16
for East and South Asian scripts. There is no memory difference for
Latin extensions, Greek, Cyrillic, Hebrew, and Arabic.

For processing Unicode data, UTF-16 is much easier to handle. You get
a choice between either one or two units per character, not a choice
among four lengths. UTF-16 also does not have illegal 16-bit unit
values, while you might want to check for illegal bytes in
UTF-8. Incomplete character sequences in UTF-16 are less important and
more benign. If you want to quickly convert small strings between the
different UTF encodings or get a UChar32 value, you can use the macros
provided in utf.h and ..."

I didn't think of the iterators for simple uses.

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com
Neighbourhood Coder
Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/
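
The byte counts discussed in this exchange can be checked directly. The following is a minimal sketch in Python, not part of the original thread; the helper name `encoded_sizes` and the sample code points are illustrative choices. It uses the `utf-16-le` codec so that no byte order mark is counted.

```python
def encoded_sizes(ch: str) -> tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 byte count) for one character."""
    # "utf-16-le" is used instead of "utf-16" so no 2-byte BOM is prepended.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


# Sample code points chosen around the boundaries mentioned in the thread.
samples = {
    "U+0041 (ASCII 'A')": "\u0041",
    "U+00E9 (Latin small e with acute)": "\u00e9",
    "U+07FF (last value that fits in 2 UTF-8 bytes)": "\u07ff",
    "U+0800 (first value that needs 3 UTF-8 bytes)": "\u0800",
    "U+4E2D (CJK ideograph)": "\u4e2d",
    "U+1F600 (outside the Basic Multilingual Plane)": "\U0001F600",
}

for label, ch in samples.items():
    utf8, utf16 = encoded_sizes(ch)
    print(f"{label}: UTF-8 {utf8} bytes, UTF-16 {utf16} bytes")
```

Under these assumptions, values from 128 through 0x07FF take two bytes in both encodings, which matches the corrected statement: UTF-16 is at least as compact as UTF-8 above 127, and strictly more compact only from 0x0800 up to the end of the Basic Multilingual Plane.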