Discussion: Re: [ADMIN] Encoding problems with migration from 8.0.14 to 8.3.0 on Windows
Meetesh Karia <meetesh.karia@gmail.com> writes:
> Additionally, here's what I get when I run your test below (my server
> encoding is UTF-8):

> ltefull=# create table x (r varchar(255) unique);
> NOTICE:  CREATE TABLE / UNIQUE will create implicit index "x_r_key" for
> table "x"
> CREATE TABLE
> ltefull=#
> ltefull=# set client_encoding=WIN1250;
> SET
> ltefull=# insert into x (r) values ('Daniel Brühl');
> INSERT 0 1
> ltefull=#
> ltefull=# insert into x (r) values ('Daniel Bruehl');
> ERROR:  duplicate key value violates unique constraint "x_r_key"

You said this was on Windows, right?

I was about to say "that should be impossible", until I looked at
varstr_cmp() and realized that whoever put in the WIN32/UTF8 special
case omitted this part:

        /*
         * In some locales strcoll() can claim that nonidentical strings are
         * equal.  Believing that would be bad news for a number of reasons,
         * so we follow Perl's lead and sort "equal" strings according to
         * strcmp().
         */
        if (result == 0)
            result = strcmp(a1p, a2p);

So we behave differently on Windows (with UTF8) than anywhere else.
This is pretty nasty, not least because it means that texteq is
inconsistent with other text comparison operators.

I think this is a "must fix" bug for 8.3.1, anyone disagree?

            regards, tom lane
"Tom Lane" <tgl@sss.pgh.pa.us> writes: > I think this is a "must fix" bug for 8.3.1, anyone disagree? Agreed. It seems we should collect cases like this for the regression tests. The only one I was aware of previously was the Turkish one. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Thanks for the confirmation of this. For now I'll continue to use the C locale and I'll switch with 8.3.1.
Meetesh
Gregory Stark wrote:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> I think this is a "must fix" bug for 8.3.1, anyone disagree?
>
> Agreed. It seems we should collect cases like this for the regression
> tests. The only one I was aware of previously was the Turkish one.