A Patch for MIC to EUC_TW code converting in mb support
From | Chih-Chang Hsieh |
---|---|
Subject | A Patch for MIC to EUC_TW code converting in mb support |
Date | |
Msg-id | 3A0A07FA.129ED448@cc.kmu.edu.tw |
Responses | Re: A Patch for MIC to EUC_TW code converting in mb support |
List | pgsql-patches |
============================================================================
POSTGRESQL BUG REPORT: MIC to EUC_TW code converting in mb support
============================================================================

System Configuration
---------------------
  Architecture (example: Intel Pentium)         : x86
  Operating System (example: Linux 2.0.26 ELF)  : Linux 2.2.x and FreeBSD 3.5R
  PostgreSQL version (example: PostgreSQL-7.0)  : PostgreSQL-7.0.2
  Compiler used (example: gcc 2.8.0)            : egcs-2.91.66, gcc 2.7.3

A FULL description of the problem:
------------------------------------------------
In PostgreSQL mb (multi-byte) support, there is a bug in the code conversion
from MIC to EUC_TW. The original mic2euc_tw() in conv.c converts
CNS 11643-1992 Plane 2 into the 2-byte EUC_TW encoding, but characters in
CNS 11643-1992 Plane 2 should be converted into the 4-byte EUC_TW encoding
instead.

A way to repeat the problem:
----------------------------------------------------------------------
When you initdb with -E EUC_TW and set PGCLIENTENCODING to BIG5, you will
find that all the characters in CNS 11643-1992 Plane 2 are incorrectly
stored or output.

This problem might be fixed by the solution in the attachment.

*** conv.c      Wed Nov  8 22:44:21 2000
--- conv.c.orig Sat May 20 21:12:26 2000
***************
*** 906,920 ****
          {
              len -= pg_mic_mblen(mic++);

!             if (c1 == LC_CNS11643_1)
              {
-                 *p++ = *mic++;
-                 *p++ = *mic++;
-             }
-             else if (c1 == LC_CNS11643_2)
-             {
-                 *p++ = SS2;
-                 *p++ = 0xa2;
                  *p++ = *mic++;
                  *p++ = *mic++;
              }
--- 906,913 ----
          {
              len -= pg_mic_mblen(mic++);

!             if (c1 == LC_CNS11643_1 || c1 == LC_CNS11643_2)
              {
                  *p++ = *mic++;
                  *p++ = *mic++;
              }
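
For readers who want to see the intended byte layout in isolation, here is a minimal standalone sketch of the per-character conversion described above. It is not the actual mic2euc_tw() from conv.c: the helper name mic_cns_to_euc_tw() is hypothetical, and the numeric values given for SS2, LC_CNS11643_1 and LC_CNS11643_2 are assumptions (only the names appear in the patch). The input is taken to be one MIC character, that is, a leading byte followed by two code bytes.

```c
#include <stddef.h>

/* Assumed values; only the names come from the patch. */
#define SS2            0x8e
#define LC_CNS11643_1  0x95
#define LC_CNS11643_2  0x96

/*
 * Hypothetical helper: convert one MIC character (leading byte plus two
 * code bytes) to EUC_TW. "out" must have room for 4 bytes.
 * Returns the number of bytes written, or 0 if the leading byte is not
 * CNS 11643 Plane 1 or Plane 2.
 */
static size_t
mic_cns_to_euc_tw(const unsigned char *mic, unsigned char *out)
{
    unsigned char lc = mic[0];

    if (lc == LC_CNS11643_1)
    {
        /* Plane 1: plain 2-byte EUC_TW form */
        out[0] = mic[1];
        out[1] = mic[2];
        return 2;
    }
    if (lc == LC_CNS11643_2)
    {
        /* Plane 2: 4-byte form, SS2 + plane byte 0xa2 + two code bytes */
        out[0] = SS2;
        out[1] = 0xa2;
        out[2] = mic[1];
        out[3] = mic[2];
        return 4;
    }
    return 0;                   /* not handled in this sketch */
}
```

With these assumptions, a Plane 2 character always grows by two bytes on output, so the destination buffer has to be sized for the 4-byte case.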