Discussion: BUG #3932: utf-8 and upper()/lower(): PANIC: ERRORDATA_STACK_SIZE exceeded
BUG #3932: utf-8 and upper()/lower(): PANIC: ERRORDATA_STACK_SIZE exceeded
From
"Florian Wunderlich"
Date:
The following bug has been logged online:

Bug reference:      3932
Logged by:          Florian Wunderlich
Email address:      fwunderlich@factor3.de
PostgreSQL version: 8.2.6
Operating system:   Debian unstable
Description:        utf-8 and upper()/lower(): PANIC: ERRORDATA_STACK_SIZE exceeded
Details:

- input file in encoding iso-8859-1:

  set client_encoding='iso-8859-1';
  select upper('ä'), lower('Ä');

  (note: the argument to upper is a lower case a umlaut, while the argument to
  lower is an upper case a umlaut)

- database "iso" with encoding iso-8859-1,
  database "utf" with encoding utf-8,
  both in a cluster with locale=de_DE

The command

  psql iso < input

yields the correct output (upper case a umlaut, lower case a umlaut).

The command

  psql utf < input

yields

  PANIK: ERRORDATA_STACK_SIZE exceeded.
  server closed the connection unexpectedly
  This probably means the server terminated abnormally
  before or while processing the request.
  connection to server was lost

The log shows:

  ERROR: invalid byte sequence for encoding "UTF8": 0xe384
  HINT: This error can also happen if the byte sequence does not match the
  encoding expected by the server, which is controlled by "client_encoding".

then the same error four times but with 0xfc.

Doing the exact same thing with an input file with encoding utf-8 (with
client_encoding replaced accordingly) again works fine with the iso database,
but yields a lower case a umlaut for upper() and nothing for the lower()
function for the utf database.

Thus, it would seem that the upper() and lower() functions do not work at all
for databases with encoding utf-8 and non-US-ASCII input.
Re: BUG #3932: utf-8 and upper()/lower(): PANIC: ERRORDATA_STACK_SIZE exceeded
From
Alvaro Herrera
Date:
Florian Wunderlich wrote:

> - input file in encoding iso-8859-1:
>
> set client_encoding='iso-8859-1';
> select upper('ä'), lower('Ä');
>
> (note: the argument to upper is a lower case a umlaut, while the argument to
> lower is an upper case a umlaut)
>
> - database "iso" with encoding iso-8859-1,
> database "utf" with encoding utf-8,
> both in a cluster with locale=de_DE

I think this is just a case of a misconfigured server. If you choose
locale de_DE, which supports only the iso-8859-1 encoding, it is an
error to build a database with utf8 encoding -- which is why 8.3 rejects
that combination.

-- 
Alvaro Herrera                        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
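
As an illustration of why a single-byte locale and a UTF-8 database do not mix, here is a minimal Python sketch. It is not PostgreSQL's code. It assumes that case mapping under de_DE treats every byte as one ISO-8859-1 character, and the helper names latin1_case_map and is_valid_utf8 are made up for this example. Under that assumption, lower() on the UTF-8 bytes of an upper case a umlaut (0xC3 0x84) produces 0xE3 0x84, which is the invalid sequence 0xe384 from the log in the original report, and upper() on a lower case a umlaut leaves the bytes unchanged, consistent with the reported behaviour for utf-8 input.

```python
def latin1_case_map(data: bytes, to_upper: bool) -> bytes:
    """Case-map each byte as if it were one ISO-8859-1 character.

    Assumption: this mimics what a single-byte locale does to the bytes
    of a UTF-8 string; it is a sketch, not the server's implementation.
    """
    out = bytearray()
    for byte in data:
        ch = chr(byte)  # ISO-8859-1 bytes map 1:1 to U+0000..U+00FF
        mapped = ch.upper() if to_upper else ch.lower()
        # Keep the byte unchanged when the mapping leaves ISO-8859-1
        # (for example the sharp s, whose upper case is two letters).
        if len(mapped) == 1 and ord(mapped) < 256:
            out.append(ord(mapped))
        else:
            out.append(byte)
    return bytes(out)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# U+00C4 is the upper case a umlaut, U+00E4 the lower case one.
lowered = latin1_case_map("\u00c4".encode("utf-8"), to_upper=False)
uppered = latin1_case_map("\u00e4".encode("utf-8"), to_upper=True)

print(lowered.hex(), is_valid_utf8(lowered))
print(uppered.hex(), is_valid_utf8(uppered))
```

The first result is no longer valid UTF-8, so any later encoding check on it fails. The second is still valid but was never upper-cased.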
Re: BUG #3932: utf-8 and upper()/lower(): PANIC: ERRORDATA_STACK_SIZE exceeded
From
Florian Wunderlich
Date:
Alvaro Herrera wrote:

> Florian Wunderlich wrote:
>
>> - input file in encoding iso-8859-1:
>>
>> set client_encoding='iso-8859-1';
>> select upper('ä'), lower('Ä');
>>
>> (note: the argument to upper is a lower case a umlaut, while the argument to
>> lower is an upper case a umlaut)
>>
>> - database "iso" with encoding iso-8859-1,
>> database "utf" with encoding utf-8,
>> both in a cluster with locale=de_DE
>
> I think this is just a case of a misconfigured server. If you choose
> locale de_DE, which supports only the iso-8859-1 encoding, it is an
> error to build a database with utf8 encoding -- which is why 8.3 rejects
> that combination.
>

You are correct; if I use de_DE.UTF-8 for initdb, the database with encoding
utf-8 works fine (and the database with iso-8859-1 doesn't).

Because such an invalid combination cannot happen for 8.3 anymore, the PANIC
cannot occur anymore, and thus the bug can be closed.