Re: TM format can mix encodings in to_char()

From	Kyotaro HORIGUCHI
Subject	Re: TM format can mix encodings in to_char()
Date	19 April 2019, 08:30:17
Msg-id	20190419.173017.204258244.horiguchi.kyotaro@lab.ntt.co.jp
In reply to	TM format can mix encodings in to_char() (Juan José Santamaría Flecha <juanjo.santamaria@gmail.com>)
List	pgsql-hackers

Hello.

At Fri, 12 Apr 2019 18:45:51 +0200, Juan José Santamaría Flecha <juanjo.santamaria@gmail.com> wrote in
<CAC+AXB22So5aZm2vZe+MChYXec7gWfr-n-SK-iO091R0P_1Tew@mail.gmail.com>
> Hackers,
>
> I will use as an example the code in the regression test
> 'collate.linux.utf8'.
> There you can find:
>
> SET lc_time TO 'tr_TR';
> SELECT to_char(date '2010-04-01', 'DD TMMON YYYY');
>    to_char
> -------------
>  01 NIS 2010
> (1 row)
>
> The problem is that the locale 'tr_TR' uses the encoding ISO-8859-9
> (LATIN5),
> while the test runs in UTF8. So the following code will raise an error:
>
> SET lc_time TO 'tr_TR';
> SELECT to_char(date '2010-02-01', 'DD TMMON YYYY');
> ERROR:  invalid byte sequence for encoding "UTF8": 0xde 0x75

The same case is handled for lc_numeric. lc_time ought to be
treated the same way.
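
For reference, a minimal C sketch of why the bytes in the quoted error
are rejected. This is not PostgreSQL code, and is_valid_utf8_2byte is a
hypothetical helper written only for illustration. In ISO-8859-9 the
single byte 0xde encodes 'Ş', but in UTF-8 it is the lead byte of a
two-byte sequence and must be followed by a continuation byte of the
form 10xxxxxx, which 0x75 is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (illustration only): returns true if the first two
 * bytes of s form a well-formed two-byte UTF-8 sequence.
 */
static bool
is_valid_utf8_2byte(const unsigned char *s, size_t len)
{
    assert(s != NULL);

    if (len < 2)
        return false;

    /* Two-byte lead bytes are 0xC2..0xDF (0xC0 and 0xC1 would be overlong). */
    if (s[0] < 0xC2 || s[0] > 0xDF)
        return false;

    /* The second byte must be a continuation byte: 10xxxxxx. */
    return (s[1] & 0xC0) == 0x80;
}
```

With the bytes from the error message, {0xde, 0x75}, the helper returns
false; with {0xc5, 0x9e}, the UTF-8 encoding of 'Ş' (U+015E), it returns
true.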

> The problem seems to be in the code touched in the attached patch.

It seems basically correct, but cache_single_time does an extra
strdup when pg_any_to_server has already done a conversion. Maybe
it would be better like this:

> oldcxt = MemoryContextSwitchTo(TopMemoryContext);
> ptr = pg_any_to_server(buf, strlen(buf), encoding);
>
> if (ptr == buf)
> {
>     /* Conversion didn't pstrdup, so we must */
>     ptr = pstrdup(buf);
> }
> MemoryContextSwitchTo(oldcxt);
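
The same idea as a standalone sketch, outside the backend. Everything
here is an assumption made for illustration: toy_convert is a
hypothetical stand-in that only mimics the contract the snippet above
relies on (return the input pointer when no conversion is needed,
otherwise return a new string), and malloc replaces the memory-context
allocation. The point is that the caller copies only when the
conversion did not, so each cached string costs exactly one allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts allocations so the "one copy per string" claim can be checked. */
static int alloc_count = 0;

/* Allocate and return a copy of s. */
static char *
dup_string(const char *s)
{
    size_t  len = strlen(s) + 1;
    char   *copy = malloc(len);

    assert(copy != NULL);
    memcpy(copy, s, len);
    alloc_count++;
    return copy;
}

/*
 * Hypothetical stand-in for a conversion routine: returns src itself when
 * no conversion is needed, otherwise a newly allocated result (here just
 * an unchanged copy, since the real conversion is beside the point).
 */
static char *
toy_convert(char *src, int needs_conversion)
{
    return needs_conversion ? dup_string(src) : src;
}

/* Return a string owned by the caller, never the temporary buffer itself. */
static char *
cache_string(char *buf, int needs_conversion)
{
    char   *ptr = toy_convert(buf, needs_conversion);

    if (ptr == buf)
    {
        /* Conversion didn't allocate, so we must. */
        ptr = dup_string(buf);
    }
    return ptr;
}
```

Calling cache_string with either value of needs_conversion yields a
pointer distinct from buf and increments alloc_count by one, not two.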

-    int            i;
+    int            i,
+                encoding;

The rule is not strictly followed, but (I believe) we don't declare
multiple variables in a single declaration.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
