Re: The "char" type versus non-ASCII characters

From	Tom Lane
Subject	Re: The "char" type versus non-ASCII characters
Date	3 December 2021, 19:42:11
Msg-id	2320640.1638560531@sss.pgh.pa.us
In response to	Re: The "char" type versus non-ASCII characters (Andrew Dunstan <andrew@dunslane.net>)
Responses	Re: The "char" type versus non-ASCII characters
List	pgsql-hackers

Andrew Dunstan <andrew@dunslane.net> writes:
> On 12/3/21 14:12, Tom Lane wrote:
>> I can think of at least three ways we might address this:
>> 
>> * Forbid all non-ASCII values for type "char".  This results in
>> simple and portable semantics, but it might break usages that
>> work okay today.
>> 
>> * Allow such values only in single-byte server encodings.  This
>> is a bit messy, but it wouldn't break any cases that are not
>> problematic already.
>> 
>> * Continue to allow non-ASCII values, but change charin/charout,
>> char_text, etc so that the external representation is encoding-safe
>> (perhaps make it an octal or decimal number).

> I don't like #2.

Yeah, it's definitely messy --- for example, maybe é works in
a latin1 database but is rejected when you try to restore into
a DB with utf8 encoding.

> Is #3 going to change the external representation only
> for non-ASCII values? If so, that seems OK.

Right, I envisioned that ASCII behaves the same but we'd use
a numeric representation for high-bit-set values.  These
cases could be told apart fairly easily by charin(), since
the numeric representation would always be three digits.
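
To make that concrete, here is a rough standalone sketch of the idea, not a patch. The names sketch_char_out() and sketch_char_in() are hypothetical stand-ins for charout()/charin(), and it assumes the octal option, so high-bit-set bytes come out as "200" through "377". It also assumes charin() treats a three-character input as numeric only when it is a valid octal number in that range, which would change the meaning of an input like '200' that today yields '2'.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical output routine (not PostgreSQL's charout): write the
 * external form of one "char" byte into buf, which must hold 4 bytes.
 * ASCII bytes are emitted unchanged; high-bit-set bytes become exactly
 * three octal digits, "200" .. "377".
 */
static void
sketch_char_out(unsigned char c, char *buf)
{
    if (c < 0x80)
    {
        buf[0] = (char) c;
        buf[1] = '\0';
    }
    else
        snprintf(buf, 4, "%03o", (unsigned int) c);
}

/*
 * Hypothetical input routine (not PostgreSQL's charin): a string of
 * exactly three octal digits whose value has the high bit set is read
 * as the numeric form; anything else keeps the existing behavior of
 * taking the first byte.
 */
static unsigned char
sketch_char_in(const char *s)
{
    if (strlen(s) == 3 &&
        (s[0] == '2' || s[0] == '3') &&
        s[1] >= '0' && s[1] <= '7' &&
        s[2] >= '0' && s[2] <= '7')
        return (unsigned char) (((s[0] - '0') << 6) |
                                ((s[1] - '0') << 3) |
                                (s[2] - '0'));

    return (unsigned char) s[0];
}
```

With this scheme the latin1 byte for é (0xE9) would be dumped as 351, which is plain ASCII and so restores the same way regardless of the target database's encoding.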

> #1 is the simplest to implement and to understand,
> and I suspect it would break very little in practice, but others might
> disagree with that assessment.

We'd still have to decide what to do with pg_upgrade'd
non-ASCII values, so there's messiness there too.
Having charout() throw an error seems not very nice.

            regards, tom lane
