Re: UTF8 national character data type support WIP patch and list of open issues.
From | MauMau |
---|---|
Subject | Re: UTF8 national character data type support WIP patch and list of open issues. |
Date | |
Msg-id | 673E261C589440E3B0D8FDF9A11B1181@maumau |
In response to | Re: UTF8 national character data type support WIP patch and list of open issues. (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: UTF8 national character data type support WIP patch and list of open issues. |
List | pgsql-hackers |
From: "Robert Haas" <robertmhaas@gmail.com>
> On Tue, Nov 5, 2013 at 5:15 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> On 11/5/13, 1:04 AM, Arulappan, Arul Shaji wrote:
>>> Implements NCHAR/NVARCHAR as distinct data types, not as synonyms
>>
>> If, per SQL standard, NCHAR(x) is equivalent to CHAR(x) CHARACTER SET
>> "cs", then for some "cs", NCHAR(x) must be the same as CHAR(x).
>> Therefore, an implementation as separate data types is wrong.
>
> Since the point doesn't seem to be getting through, let me try to be
> more clear: we're not going to accept any form of this patch. A patch
> that makes some progress toward actually coping with multiple
> encodings in the same database would be very much worth considering,
> but adding compatible syntax with incompatible semantics is not of
> interest to the PostgreSQL project. We have had this debate on many
> other topics in the past and will no doubt have it again in the
> future, but the outcome is always the same.

It doesn't seem that there are any semantics incompatible with the SQL
standard, as follows:

- In the first step, "cs" is the database encoding, which is used for
  char/varchar/text.

- In the second (or final) step, where multiple encodings per database
  are supported, "cs" is the national character encoding, which is
  specified with CREATE DATABASE ... NATIONAL CHARACTER ENCODING cs.
  If the NATIONAL CHARACTER ENCODING clause is omitted, "cs" is the
  database encoding, as in step 1.

Let me repeat myself: I think the biggest and most immediate issue is
that PostgreSQL does not support national character types, at least
officially. "Officially" means described in the manual. So I don't have
a strong objection to the current (hidden) implementation of nchar types
in PostgreSQL, which are just synonyms, as long as the official support
is documented. Serious users don't want to depend on hidden features.

However, doesn't the current synonym approach have any problems?
Wouldn't it cause any trouble in the future?
If we treat nchar as char, we lose the fact that the user requested
nchar. Can we lose that fact so easily and produce an irreversible
result, as described below?

--------------------------------------------------
Maybe so. I guess the distinct type for NCHAR is for future extension
and user friendliness. As one user, I expect to get "national character"
instead of "char character set xxx" as output of psql \d and pg_dump
when I specified "national character" in DDL. In addition, that makes it
easy to use the pg_dump output for importing data to other DBMSs for
some reason.
--------------------------------------------------

Regards
MauMau
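
Editor's illustration (not part of the original message): the concern about losing the fact that the user asked for nchar can be sketched with a toy model. Everything below is hypothetical and is not PostgreSQL source code: the `SYNONYMS` table, the `Column` record, and the functions `create_column_synonym`, `create_column_tracking`, and `dump_column` are invented names. The sketch assumes that the synonym approach rewrites a national character type name to its plain equivalent at creation time, and it models the alternative as a simple flag rather than the distinct data types proposed in the patch.

```python
from dataclasses import dataclass

# Hypothetical synonym table: national character spellings and the plain
# type each one is rewritten to. Illustrative only, not a catalog listing.
SYNONYMS = {
    "national character": "character",
    "nchar": "character",
    "national character varying": "character varying",
    "nchar varying": "character varying",
}


def _canonical(type_name: str) -> str:
    """Lowercase the type name and collapse runs of whitespace."""
    return " ".join(type_name.lower().split())


@dataclass
class Column:
    name: str
    stored_type: str          # what the toy catalog actually records
    length: int
    declared_national: bool = False  # only set by the tracking variant


def create_column_synonym(name: str, declared_type: str, length: int) -> Column:
    """Synonym approach: the national spelling is rewritten and forgotten."""
    key = _canonical(declared_type)
    return Column(name, SYNONYMS.get(key, key), length)


def create_column_tracking(name: str, declared_type: str, length: int) -> Column:
    """Alternative: also remember that a national type was requested."""
    key = _canonical(declared_type)
    return Column(name, SYNONYMS.get(key, key), length,
                  declared_national=key in SYNONYMS)


def dump_column(col: Column) -> str:
    """Render the column the way a dump tool might, using only stored data."""
    base = col.stored_type
    if col.declared_national:
        base = "national " + base
    return f"{col.name} {base}({col.length})"


if __name__ == "__main__":
    print(dump_column(create_column_synonym("title", "NATIONAL CHARACTER", 10)))
    print(dump_column(create_column_tracking("title", "NATIONAL CHARACTER", 10)))
```

In this model the synonym variant can only ever dump a plain `character` column, because nothing in the stored record distinguishes it from a column declared as `character` in the first place. That is the irreversibility the message asks about.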