[ psqlodbc-Bugs-1010313 ] ucs2_to_utf8 endianness problem
From | |
---|---|
Subject | [ psqlodbc-Bugs-1010313 ] ucs2_to_utf8 endianness problem |
Date | |
Msg-id | 20080312164337.4F11E17AE666@pgfoundry.org |
List | pgsql-odbc |
Bugs item #1010313, was opened at 2008-03-12 16:43
You can respond by visiting:
http://pgfoundry.org/tracker/?func=detail&atid=538&aid=1010313&group_id=1000125

Category: None
Group: None
Status: Open
Resolution: None
Priority: 3
Submitted By: Ken Robbins (kpr)
Assigned to: Nobody (None)
Summary: ucs2_to_utf8 endianness problem

Initial Comment:
psqlodbc 08.02.0200
PostgreSQL 8.2.3

I'm using SQLBindParameter() (with SQL_WVARCHAR and SQL_C_WCHAR as the types) along with SQLExecute() to execute a prepared statement. In the byte buffer, I'm using UCS-2 encoding. E.g., I'm doing something like this:

    wchar_t c = 0x03C0; // lower case greek letter pi
    char buf[255];
    memcpy(buf, (char *) &c, sizeof(wchar_t));

This eventually gets passed into ucs2_to_utf8():

    char *ucs2_to_utf8(const SQLWCHAR *ucs2str, SQLLEN ilen, SQLLEN *olen, BOOL lower_identifier)

Whenever a UTF-8 byte sequence is more than one byte long, a memcpy is used on either the UInt2 or Int4 type. E.g.,

    memcpy(utf8str + len, (char *) &byte2code, sizeof(byte2code));

However, the ordering of the bytes of byte2code differs depending on whether the platform is little endian or big endian.

When I run my code in an Intel environment (Linux), the code runs fine. However, when I run my code in a PowerPC environment (also Linux), the UTF-8 byte sequence is wrong.

I added mylog() calls to the ucs2_to_utf8() code to see what bytes were present at the memcpy step and in the final byte sequence. The bytes are correct; however, the ordering is flipped (understandably). For the 2-byte sequences, the ordering is just flipped. For the 3-byte sequences, in addition to the ordering being flipped, the wrong 3 bytes are being used.

Whenever I use SQL_VARCHAR and SQL_C_CHAR and put the UTF-8 byte sequence in the byte buffer myself, it works fine on both platforms.

I believe that ucs2_to_utf8() needs to account for the endianness of the platform, so that the right bytes are put in the final returned UTF-8 sequence.

However, if I am not doing something right, please advise me on that also.
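As an illustration of the endianness issue described in the report, here is a minimal sketch of a byte-order-independent conversion. It is not the psqlodbc implementation: the helper name `ucs2_unit_to_utf8` is hypothetical, it handles a single UCS-2 code unit only, and it does not treat surrogate pairs. The idea is to compute each UTF-8 byte with shifts and masks and store it individually, rather than to memcpy a multi-byte integer whose in-memory layout depends on the host.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not psqlodbc code): encode one UCS-2 code unit as
 * UTF-8. Every output byte is computed arithmetically and stored on its
 * own, so the result does not depend on host byte order.
 * Returns the number of bytes written (1 to 3); `out` needs room for 3.
 * Surrogate code units (0xD800 to 0xDFFF) are not given special handling.
 */
static size_t ucs2_unit_to_utf8(uint16_t cu, unsigned char *out)
{
    if (cu < 0x80) {
        /* 0xxxxxxx */
        out[0] = (unsigned char) cu;
        return 1;
    }
    if (cu < 0x800) {
        /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char) (0xC0 | (cu >> 6));
        out[1] = (unsigned char) (0x80 | (cu & 0x3F));
        return 2;
    }
    /* 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char) (0xE0 | (cu >> 12));
    out[1] = (unsigned char) (0x80 | ((cu >> 6) & 0x3F));
    out[2] = (unsigned char) (0x80 | (cu & 0x3F));
    return 3;
}
```

Under these assumptions, the code unit 0x03C0 from the report (Greek small letter pi) yields the two bytes 0xCF 0x80 on both little-endian and big-endian hosts, because no multi-byte integer is ever copied into the output buffer.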