Discussion: BUG #15230: "Logical decoding" is not sensitive to client encoding setting
BUG #15230: "Logical decoding" is not sensitive to client encoding setting
From
PG Bug reporting form
Date:
The following bug has been logged on the website:
Bug reference: 15230
Logged by: Hillel Eilat
Email address: hillel.eilat@attunity.com
PostgreSQL version: 9.4.4
Operating system: Windows 7
Description:
Logical Decoding is not sensitive to the client character encoding setting.
My project uses Logical Decoding by interacting with the WAL backend via the
native non-SQL interface.
The plugin used is the common "test_decoding", which is shipped together
with the kit.
There is a Japanese database whose encoding is defined as "EUC_JP".
Ordinarily we process the streamed data in UTF8 client encoding, thus
maintaining a common set of general "consumer" functions.
Consequently, prior to issuing PQconnectdbParams(keywords, values, true), a
{"client_encoding","UTF8"} pair is introduced.
To be on the safe side, a pair of calls, PQclientEncoding(pConn) and
pg_encoding_to_char(iClientEncoding), is issued thereafter
to confirm that UTF8 was properly set.
Despite the above setting, the data that is streamed in does not show up in
UTF8.
It retains the backend server's EUC_JP encoding.
This must be a bug.
One would expect that decoded data that is streamed in should be subject
to the client encoding.
Your assistance will be appreciated.
Regards
Hillel.
Re: BUG #15230: "Logical decoding" is not sensitive to client encoding setting
From
Euler Taveira
Date:
2018-06-05 5:29 GMT-03:00 PG Bug reporting form <noreply@postgresql.org>:
> The plugin used is the common "test_decoding", which is shipped together
> with the kit.
>
What is the test_decoding output mode? By default, it uses textual
mode. Did you set binary mode (force-binary=1)?
> There is a Japanese database for which encoding is defined as ""EUC_JP".
> Ordinarily - we process the streamed data in UTF8 client encoding - thus
> maintaining a common general "consumer" functions.
> Consequently, prior to issuing PQconnectdbParams(keywords, values, true) - a
> {"client_encoding","UTF8"} couple is introduced.
> To be on the safe side - a couple of PQclientEncoding(pConn) /
> pg_encoding_to_char(iClientEncoding) is issued thereafter,
> for approving that UTF8 was properly set.
>
client_encoding should be set in the replication connection because if
you set it later it won't be passed down to libpqwalreceiver.
[1] https://www.postgresql.org/docs/9.4/static/logicaldecoding-output-plugin.html#LOGICALDECODING-OUTPUT-MODE
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consulting, Development, 24x7 Support and Training
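For readers following along, the "force-binary" option mentioned above is passed to the output plugin in the option list of the START_REPLICATION command. The following is only a minimal sketch of how a client might assemble that command string. The helper name build_start_replication, the slot name and the LSN used in the usage note are hypothetical and not taken from the reporter's code, and no escaping of embedded quotes is attempted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build a START_REPLICATION command carrying at most
 * one output-plugin option. All arguments are caller-supplied placeholders.
 * Returns 0 on success, -1 if the result does not fit into buf.
 */
static int
build_start_replication(char *buf, size_t buflen, const char *slot,
                        const char *lsn, const char *opt_name,
                        const char *opt_value)
{
    int n;

    assert(buf != NULL && slot != NULL && lsn != NULL);

    if (opt_name != NULL && opt_value != NULL)
        n = snprintf(buf, buflen,
                     "START_REPLICATION SLOT \"%s\" LOGICAL %s (\"%s\" '%s')",
                     slot, lsn, opt_name, opt_value);
    else
        n = snprintf(buf, buflen,
                     "START_REPLICATION SLOT \"%s\" LOGICAL %s",
                     slot, lsn);

    /* snprintf reports the length it wanted; treat truncation as failure */
    return (n < 0 || (size_t) n >= buflen) ? -1 : 0;
}
```

Calling build_start_replication(cmd, sizeof(cmd), "my_slot", "0/0", "force-binary", "0") would request textual output explicitly. The resulting string would then be sent on the replication connection.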
Thanks.
1. As per your question - the default (textual) decoding mode is used.
2. In fact, client_encoding is set in the replication connection.
The problem is that it does not help.
Data that is streamed in is represented in the server_encoding (Japanese in this case), while we expect UTF8,
which was set as client_encoding.
To be more specific, here is the essence of the piece of C code that is used for establishing the connection
via PQconnectdbParams(keywords, values, true);
This is the REPLICATION connection on which "START_REPLICATION SLOT "XXXXXXX" LOGICAL LLL/SSS" is executed later.
One would expect that data subsequently fetched via PQgetCopyData(...) would show up in the client_encoding
representation.
But this is not the case...
Your clarifications will be appreciated.
Thanks
Hillel.
char *pszClientEncoding = "UTF8";    // Set client encoding

i = 0;    // Initial array index
keywords[i] = "dbname";
values[i] = pszDbName == NULL ? "replication" : pszDbName;
i++;
keywords[i] = "replication";
values[i] = pszDbName == NULL ? "true" : "database";
i++;
keywords[i] = "fallback_application_name";
values[i] = pszProgName;
i++;
if (pszDbHost)
{
    keywords[i] = "host";
    values[i] = pszDbHost;
    i++;
}
if (pszDbUser)
{
    keywords[i] = "user";
    values[i] = pszDbUser;
    i++;
}
if (pszDbPort)
{
    keywords[i] = "port";
    values[i] = pszDbPort;
    i++;
}
if (pszClientEncoding)    // Set client encoding
{
    keywords[i] = "client_encoding";
    values[i] = pszClientEncoding;
    i++;
}

/* Prompting for password here is not a matter of interest (the "-W" command option) */
//need_password = (dbgetpassword == 1 && dbpassword == NULL);
need_password = 0;    // No point in this mechanism here
//do
{
    if (pszDbPassword)
    {
        // Note: PQconnectdbParams() needs NULL-terminated arrays, so
        // keywords[i + 1] must already be NULL on this path.
        keywords[i] = "password";
        values[i] = pszDecryptedPassword;
    }
    else
    {
        keywords[i] = NULL;
        values[i] = NULL;
    }

    tmpconn = PQconnectdbParams(keywords, values, true);
    if (!tmpconn)
    {
        pSetup->config.logger_error((char *)pszLoggingOrg, __LINE__, kPG_LOGGER_SEVERITY_ERROR,
                                    "PQconnectdbParams(...)- Could not connect to the server.");
        return NULL;
    }
    if (PQstatus(tmpconn) == CONNECTION_BAD && PQconnectionNeedsPassword(tmpconn) && dbgetpassword != -1)
    {
        AT_STR->snprintf(szMsg, sizeof(szMsg), "Could not connect to server. Missing or improper password: %s",
                         ar_PQerrorMessage(tmpconn));
        pSetup->config.logger_error((char *)pszLoggingOrg, __LINE__, kPG_LOGGER_SEVERITY_ERROR, szMsg);
        ar_PQfinish(tmpconn);
        return NULL;
    }
}
//while (need_password);
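If the stream really does arrive in the server encoding, as reported above, one possible stopgap is for the consumer to transcode each payload itself. This is only a sketch under stated assumptions and not a statement about how PostgreSQL is meant to behave. It uses POSIX iconv, the helper name transcode is hypothetical, and whether the local iconv installation offers an "EUC-JP" converter whose mapping matches PostgreSQL's EUC_JP exactly is an assumption that would need checking.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: convert inlen bytes from encoding `from` to `to`.
 * Writes at most outcap - 1 bytes plus a terminating NUL into out.
 * Returns the number of bytes written, or -1 on any error (unknown
 * encoding name, invalid input, or output buffer too small).
 */
static long
transcode(const char *from, const char *to,
          const char *in, size_t inlen, char *out, size_t outcap)
{
    iconv_t cd;
    char   *inp = (char *) in;  /* iconv() wants char **, input is not modified */
    char   *outp = out;
    size_t  inleft = inlen;
    size_t  outleft;
    size_t  rc;

    assert(in != NULL && out != NULL && outcap > 0);
    outleft = outcap - 1;       /* reserve room for the terminating NUL */

    cd = iconv_open(to, from);  /* note the argument order: target first */
    if (cd == (iconv_t) -1)
        return -1;

    rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t) -1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);    /* flush shift state */
    iconv_close(cd);

    if (rc == (size_t) -1)
        return -1;

    *outp = '\0';
    return (long) (outp - out);
}
```

A consumer could call transcode("EUC-JP", "UTF-8", payload, len, buf, sizeof(buf)) on each buffer returned by PQgetCopyData(...), where payload, len and buf are placeholders. Because multibyte characters can grow when converted, the output buffer should be sized generously.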
-----Original Message-----
From: Euler Taveira [mailto:euler@timbira.com.br]
Sent: Thursday, June 14, 2018 5:28 PM
To: Hillel Eilat <Hillel.Eilat@attunity.com>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #15230: "Logical decoding" is not sensitive to client encoding setting
2018-06-05 5:29 GMT-03:00 PG Bug reporting form <noreply@postgresql.org>:
> The plugin used is the common "test_decoding", which is shipped
> together with the kit.
>
What is the test_decoding output mode? By default, it uses textual mode. Did you set binary mode (force-binary=1)?
> There is a Japanese database for which encoding is defined as ""EUC_JP".
> Ordinarily - we process the streamed data in UTF8 client encoding -
> thus maintaining a common general "consumer" functions.
> Consequently, prior to issuing PQconnectdbParams(keywords, values,
> true) - a {"client_encoding","UTF8"} couple is introduced.
> To be on the safe side - a couple of PQclientEncoding(pConn) /
> pg_encoding_to_char(iClientEncoding) is issued thereafter, for
> approving that UTF8 was properly set.
>
client_encoding should be set in the replication connection because if you set it later it won't be passed down to
libpqwalreceiver.
[1] https://www.postgresql.org/docs/9.4/static/logicaldecoding-output-plugin.html#LOGICALDECODING-OUTPUT-MODE
--
Euler Taveira Timbira -
http://www.timbira.com.br/
PostgreSQL: Consulting, Development, 24x7 Support and Training