Re: client_encoding issue with SQL_ASCII on 8.3 to 10 upgrade
From | Adrian Klaver |
---|---|
Subject | Re: client_encoding issue with SQL_ASCII on 8.3 to 10 upgrade |
Date | |
Msg-id | d12e4b06-756e-5a61-f6d0-97d9cfeca991@aklaver.com |
In response to | client_encoding issue with SQL_ASCII on 8.3 to 10 upgrade (Keith Fiske <keith.fiske@crunchydata.com>) |
List | pgsql-general |
On 04/16/2018 08:16 AM, Keith Fiske wrote:
> Running into an issue with helping a client upgrade from 8.3 to 10 (yes,
> I know, please keep the out of support comments to a minimum, thanks :).
>
> The old database was in SQL_ASCII and it needs to stay that way for now
> unfortunately. The dump and restore itself works fine, but we're now
> running into issues with some data returning encoding errors unless we
> specifically set the client_encoding value to SQL_ASCII.
>
> Looking at the 8.3 database, it has the client_encoding value set to
> UTF8 and queries seem to work fine. Is this just a bug in the old 8.3
> not enforcing encoding properly?

AFAIK, SQL_ASCII basically means no encoding:

https://www.postgresql.org/docs/10/static/multibyte.html

"The SQL_ASCII setting behaves considerably differently from the other
settings. When the server character set is SQL_ASCII, the server
interprets byte values 0-127 according to the ASCII standard, while byte
values 128-255 are taken as uninterpreted characters. No encoding
conversion will be done when the setting is SQL_ASCII. Thus, this
setting is not so much a declaration that a specific encoding is in use,
as a declaration of ignorance about the encoding. In most cases, if you
are working with any non-ASCII data, it is unwise to use the SQL_ASCII
setting because PostgreSQL will be unable to help you by converting or
validating non-ASCII characters."

What client are you working with?

If psql then its behavior has changed between 8.3 and 10:

https://www.postgresql.org/docs/10/static/release-9-1.html#id-1.11.6.121.3

"Have psql set the client encoding from the operating system locale by
default (Heikki Linnakangas)

This only happens if the PGCLIENTENCODING environment variable is not set."

https://www.postgresql.org/docs/10/static/app-psql.html

"If both standard input and standard output are a terminal, then psql
sets the client encoding to “auto”, which will detect the appropriate
client encoding from the locale settings (LC_CTYPE environment variable
on Unix systems). If this doesn't work out as expected, the client
encoding can be overridden using the environment variable PGCLIENTENCODING."

> The other thing I noticed on the 10 instance was that, while the LOCALE
> was set to SQL_ASCII, the COLLATE and CTYPE values for the restored
> databases were en_US.UTF-8. Could this be having an effect? Is there any
> way to see what these values were on the old 8.3 database? The
> pg_database catalog does not have these values stored back then.
>
> --
> Keith Fiske
> Senior Database Engineer
> Crunchy Data - http://crunchydata.com

--
Adrian Klaver
adrian.klaver@aklaver.com
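As an illustration of the documentation passage quoted above (not part of the original message): since a SQL_ASCII database stores byte values 128-255 without validation, the stored bytes may or may not form valid UTF-8. A client whose encoding resolves to UTF8 can then hit encoding errors on some rows and not others. The sketch below uses made-up sample bytes and a hypothetical helper, `is_valid_utf8`; it only mimics the idea of a validity check in plain Python and is not PostgreSQL's own code.

```python
# Illustration only: the sample values are hypothetical, not data from the thread.


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Three ways the "same" kind of text could sit in a SQL_ASCII database.
ascii_only = b"plain text"                # bytes 0-127 only
latin1_bytes = "café".encode("latin-1")   # contains the single byte 0xE9
utf8_bytes = "café".encode("utf-8")       # contains the pair 0xC3 0xA9

for label, raw in [
    ("ascii", ascii_only),
    ("latin-1", latin1_bytes),
    ("utf-8", utf8_bytes),
]:
    # Only byte strings that happen to be valid UTF-8 pass the check.
    print(f"{label:8} {raw!r:20} valid as UTF-8: {is_valid_utf8(raw)}")
```

Rows like the Latin-1 sample would be the ones expected to cause trouble when the client encoding is UTF8, which fits with the errors disappearing once client_encoding (or PGCLIENTENCODING) is set to SQL_ASCII.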