Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
From | Noah Misch |
---|---|
Subject | Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale. |
Date | |
Msg-id | 20140921051846.GA1565935@tornado.leadboat.com |
In reply to | Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale. (Alon <asimantov@tableausoftware.com>) |
Responses |
Re: [BUGS] Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale.
Re: Re: BUG #11431: Failing to backup and restore a Windows postgres database, with Norwegian Bokmål locale. |
List | pgsql-bugs |
On Fri, Sep 19, 2014 at 03:15:53PM -0700, Alon wrote:
> The pg_dump file contains this command:
> CREATE DATABASE workgroup WITH TEMPLATE = template0 ENCODING = 'UTF8'
> LC_COLLATE = 'Norwegian (Bokmål)_Norway.1252' LC_CTYPE = 'Norwegian
> (Bokmål)_Norway.1252';
>
> The UTF16 encoding for ål) [a-ring l parenthesis] is
> 00e5 006c 0029
>
> In UTF8 this set of characters is encoded as:
> c3 a5 6c 29
>
> The a-ring is converted to two bytes while the others are one.
>
> Based on the ERROR:
> invalid byte sequence for encoding "UTF8": 0xe5 0x6c 0x29
>
> It appears the set of characters is getting passed as:
> e5 6c 29
>
> In UTF8, e5 is always the start of a three byte character, so possibly
> pg_restore, createdb or something else tries to read these bytes as a
> single character.
> However, 6c and 29 can only be single byte characters; they can't be the
> next two bytes in a three byte character. Hence the failure.
> It seems that somewhere in the code, 00e5 is converted to e5 instead of
> 'c3 a5' when passing the LC_COLLATE and LC_CTYPE values.

In WIN1252, "e5 6c 29" is "ål)".  We're likely failing to set
client_encoding at some essential point in the process.
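
The byte values discussed above can be checked with a short script. This is only an illustrative sketch in Python, not part of PostgreSQL or of the original report; the helper name decodes_as_utf8 is hypothetical, and Python's "cp1252" codec is assumed to stand in for WIN1252.

```python
# Illustrative sketch: reproduce the byte sequences quoted in the thread.
# "cp1252" is Python's name for the Windows-1252 code page (WIN1252).
text = "ål)"  # a-ring, l, closing parenthesis

utf16_hex = text.encode("utf-16-be").hex(" ", 2)  # two bytes per code unit
utf8_hex = text.encode("utf-8").hex(" ")
win1252_bytes = text.encode("cp1252")
win1252_hex = win1252_bytes.hex(" ")

print("UTF-16 :", utf16_hex)
print("UTF-8  :", utf8_hex)
print("WIN1252:", win1252_hex)


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if raw is a valid UTF-8 byte sequence (hypothetical helper)."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The WIN1252 bytes are not valid UTF-8: e5 announces a three-byte
# character, but 6c and 29 are not continuation bytes.
print("WIN1252 bytes valid as UTF-8?", decodes_as_utf8(win1252_bytes))

# Read back with the encoding they were actually written in, they are fine.
print("Decoded as WIN1252:", win1252_bytes.decode("cp1252"))
```

The same bytes that a UTF-8 decoder rejects decode cleanly as WIN1252, which is consistent with the locale name being sent in the client's code page while being interpreted as UTF8.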