Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings.
From | Jan Urbański |
---|---|
Subject | Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings. |
Date | |
Msg-id | 4FF5FEC4.5090908@wulczer.org |
In reply to | Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings. (Heikki Linnakangas <hlinnaka@iki.fi>) |
Replies | Re: Re: [COMMITTERS] pgsql: Fix mapping of PostgreSQL encodings to Python encodings. |
List | pgsql-hackers |
On 05/07/12 22:37, Heikki Linnakangas wrote:
> On 05.07.2012 23:31, Tom Lane wrote:
>> Heikki Linnakangas<heikki.linnakangas@iki.fi> writes:
>>> Fix mapping of PostgreSQL encodings to Python encodings.
>>
>> The buildfarm doesn't like this --- did you check for side effects on
>> regression test results?
>
> Hmm, I ran the regressions tests, but not with C encoding. With the
> patch, you no longer get the errdetail you used to, when an encoding
> conversion fails:
>
>> ***************
>> *** 41,47 ****
>>
>> SELECT unicode_plan1();
>> ERROR: spiexceptions.InternalError: could not convert Python Unicode
>> object to PostgreSQL server encoding
>> - DETAIL: UnicodeEncodeError: 'ascii' codec can't encode character
>> u'\x80' in position 0: ordinal not in range(128)
>> CONTEXT: Traceback (most recent call last):
>> PL/Python function "unicode_plan1", line 3, in <module>
>> rv = plpy.execute(plan, [u"\x80"], 1)
>> --- 39,44 ----
>
> We could just update the expected output, there's two expected outputs
> for this test case and one of them is now wrong. But it'd actually be
> quite a shame to lose that extra information, that's quite valuable.
> Perhaps we should go back to using PLu_elog() here, and find some other
> way to avoid the recursion.

Seems that the problem is that LC_ALL=C makes Postgres use SQL_ASCII
as the database encoding and, as the comment states, translating PG's
SQL_ASCII to Python's "ascii" is not ideal.

The problem is that PLyUnicode_Bytes is (via an ifdef) used as
PyString_ToString on Python3, which means that there are numerous call
sites and new ones might appear at any moment. I'm not that keen on
invoking the traceback machinery on low-level encoding errors.

Hm, since PyUnicode_Bytes should get a unicode object and return bytes
in the server encoding, we might just say that for SQL_ASCII we
arbitrarily choose UTF-8 to encode the unicode codepoints, so we'd just
set serverenc = "utf-8" in the first switch case.

That doesn't solve the problem of the missing error detail, though.

Jan
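
To illustrate the SQL_ASCII idea, here is a rough Python 3 sketch of the behaviour being discussed. It is not the PL/Python C code: `python_codec_for` and `unicode_to_server_bytes` are hypothetical names made up for this example, and only the two encodings needed for the demonstration are mapped. It shows that encoding u"\x80" with the "ascii" codec raises UnicodeEncodeError (the error seen in the regression output above), while choosing "utf-8" for SQL_ASCII succeeds.

```python
# Hypothetical helpers for illustration only; the real logic lives in
# PL/Python's C source and covers many more encodings.

def python_codec_for(server_encoding, sql_ascii_as_utf8=True):
    """Map a PostgreSQL server encoding name to a Python codec name."""
    if server_encoding == "SQL_ASCII":
        # Proposed behaviour: arbitrarily pick UTF-8 for SQL_ASCII.
        # Old behaviour (sql_ascii_as_utf8=False): use Python's "ascii".
        return "utf-8" if sql_ascii_as_utf8 else "ascii"
    if server_encoding == "UTF8":
        return "utf-8"
    raise ValueError("encoding not covered by this sketch: %s" % server_encoding)


def unicode_to_server_bytes(text, server_encoding, sql_ascii_as_utf8=True):
    """Encode a Python str into bytes for the given server encoding."""
    return text.encode(python_codec_for(server_encoding, sql_ascii_as_utf8))


if __name__ == "__main__":
    # With the "ascii" mapping, a non-ASCII code point cannot be encoded.
    try:
        unicode_to_server_bytes("\x80", "SQL_ASCII", sql_ascii_as_utf8=False)
    except UnicodeEncodeError as exc:
        print("ascii mapping fails:", exc)

    # With the "utf-8" mapping, the same string encodes without error.
    print("utf-8 mapping gives:", unicode_to_server_bytes("\x80", "SQL_ASCII"))
```

As noted above, this only avoids the conversion failure for SQL_ASCII; it says nothing about how to report the error detail when a conversion does fail in some other encoding.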