Re: BUG #12845: The GB18030 encoding doesn't support Unicode characters over 0xFFFF
From | Heikki Linnakangas |
---|---|
Subject | Re: BUG #12845: The GB18030 encoding doesn't support Unicode characters over 0xFFFF |
Date | |
Msg-id | 55430588.9060402@iki.fi |
In response to | Re: BUG #12845: The GB18030 encoding doesn't support Unicode characters over 0xFFFF (Bruce Momjian <bruce@momjian.us>) |
Responses | Re: BUG #12845: The GB18030 encoding doesn't support Unicode characters over 0xFFFF |
List | pgsql-bugs |
On 04/30/2015 06:13 PM, Bruce Momjian wrote:
> On Tue, Mar 10, 2015 at 11:33:43PM +0100, Heikki Linnakangas wrote:
>>> I can write a "uint32 UTF8toGB18030(uint32)" function, but I don't know
>>> where to put it in the code.
>>
>> The mapping functions are in src/backend/utils/mb/conversion_procs/utf8_and_gb18030/utf8_and_gb18030.c.
>> They currently just consult the mapping table. You'd need to modify
>> them to also check if the codepoint is in one of those linear
>> ranges, and do the mapping for those programmatically.
>>
>>> Else I could also extend the map file. It would double in size if it only
>>> needs to include valid code points.
>>
>> The current mapping table contains about 63000 mappings, but there
>> are over a million valid code points that need to be mapped. If you
>> just add every one-to-one mapping to the table, it's going to blow
>> up in size to over 8 MB. I don't think we want that, handling the
>> ranges with linear mappings programmatically makes a lot more sense.
>
> Should this be a TODO entry?

Yeah, I guess it should.

- Heikki
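
For readers following the thread, here is a minimal sketch of what "doing the mapping programmatically" for a linear range could look like. It is not the PostgreSQL code and not a proposed patch. It covers only the supplementary-plane range, where U+10000..U+10FFFF maps one-to-one onto the four-byte GB18030 codes 0x90308130..0xE3329A35. The function names, the packing of the four bytes into a uint32_t, and the use of 0 as a "no mapping in this range" result are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative sketch only: every name below is hypothetical and none of
 * this is taken from the PostgreSQL source tree.
 *
 * A four-byte GB18030 code b1 b2 b3 b4 has b1, b3 in 0x81..0xFE (126
 * values) and b2, b4 in 0x30..0x39 (10 values), so it can be flattened
 * to a linear index.  U+10000 corresponds to 0x90308130, whose linear
 * index is 189000; the range then runs linearly up to U+10FFFF.
 */
#define GB18030_SUPP_FIRST_LINEAR 189000u

/* Expand a linear index into four GB18030 bytes packed big-endian. */
uint32_t
gb18030_linear_to_4byte(uint32_t lin)
{
    uint32_t b4 = 0x30 + lin % 10;  lin /= 10;
    uint32_t b3 = 0x81 + lin % 126; lin /= 126;
    uint32_t b2 = 0x30 + lin % 10;  lin /= 10;
    uint32_t b1 = 0x81 + lin;

    return (b1 << 24) | (b2 << 16) | (b3 << 8) | b4;
}

/* Map a supplementary code point to GB18030; 0 means "not in this range". */
uint32_t
unicode_supp_to_gb18030(uint32_t cp)
{
    if (cp < 0x10000 || cp > 0x10FFFF)
        return 0;
    return gb18030_linear_to_4byte(GB18030_SUPP_FIRST_LINEAR + (cp - 0x10000));
}

/* Inverse mapping; 0 means "not a valid code in this range". */
uint32_t
gb18030_to_unicode_supp(uint32_t gb)
{
    uint32_t b1 = (gb >> 24) & 0xFF;
    uint32_t b2 = (gb >> 16) & 0xFF;
    uint32_t b3 = (gb >> 8) & 0xFF;
    uint32_t b4 = gb & 0xFF;
    uint32_t lin;

    /* Packed codes sort in the same order as their linear indexes. */
    if (gb < 0x90308130u || gb > 0xE3329A35u)
        return 0;
    /* Reject byte values that are not legal in a four-byte code. */
    if (b2 < 0x30 || b2 > 0x39 || b3 < 0x81 || b3 > 0xFE ||
        b4 < 0x30 || b4 > 0x39)
        return 0;

    lin = (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10
        + (b4 - 0x30);
    return 0x10000 + (lin - GB18030_SUPP_FIRST_LINEAR);
}

/* Round-trip every code point in the range as a sanity check. */
void
gb18030_supp_selfcheck(void)
{
    uint32_t cp;

    for (cp = 0x10000; cp <= 0x10FFFF; cp++)
        assert(gb18030_to_unicode_supp(unicode_supp_to_gb18030(cp)) == cp);
}
```

In a real conversion routine a check like this would presumably run alongside the existing table lookup, so that the million-plus supplementary code points cost a few arithmetic operations instead of table entries. How that would be wired into the existing functions is left open here.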