Re: pg_dump/restore encoding woes
From | Tom Lane |
---|---|
Subject | Re: pg_dump/restore encoding woes |
Date | |
Msg-id | 66480.1377532742@sss.pgh.pa.us |
In reply to | pg_dump/restore encoding woes (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Responses | Re: pg_dump/restore encoding woes |
List | pgsql-hackers |
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
> When client encoding is not specified explicitly with the -E option, or
> PGCLIENTENCODING env variable, the dump is created in the server encoding.

Yeah, that's intentional as I recall.

> However, pg_dump is special, because client encoding affects not only
> the encoding used to speak to the server, but it also determines how the
> resulting dump is encoded. If you have a UTF-8 server, and a LATIN1
> console, there is no way to get a UTF-8 encoded dump of a single table
> which has non-ASCII characters in its name. There is a good reason to
> want to dump in the server encoding regardless of the encoding of the
> client: that avoids the costly encoding conversion during the dump, and
> very likely another conversion back on restore. (as a convenience, it
> would be nice if you could specify "-E server" to mean "same as server
> encoding")

There's a considerably more compelling reason than speed to default to
avoiding a conversion: doing a conversion carries significant risk of
outright failure, due to not being able to convert some data character
to the client character set.

> The pg_dump -E option just sets client_encoding, but I think it would be
> better for -E to only set the encoding used in the dump, and
> PGCLIENTENCODING env variable (if set) was used to determine the
> encoding of the command-line arguments. Opinions?

I think this is going to be a lot easier said than done, but feel free
to see if you can make it work. (As you point out, we don't have any
client-side encoding conversion infrastructure, but I don't see how
you're going to make this work without it.)

A second issue is whether we should divorce -E and PGCLIENTENCODING
like that, when they have always meant the same thing. You mentioned
the alternative of looking at pg_dump's locale environment to determine
the command line encoding --- would that be better?

regards, tom lane
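
Illustration of the conversion-failure risk raised above: a minimal Python sketch, not PostgreSQL code, that checks whether a string can be represented in a target character set. The helper name `can_convert` and the sample strings are hypothetical, and Python's `latin-1` codec stands in here for a LATIN1 client encoding. The point is only that data stored in a UTF-8 database may contain characters with no equivalent in a narrower client encoding, in which case a conversion cannot succeed.

```python
def can_convert(text: str, target_encoding: str) -> bool:
    """Return True if every character of text exists in target_encoding."""
    try:
        text.encode(target_encoding)
    except UnicodeEncodeError:
        # At least one character has no representation in the target set.
        return False
    return True


if __name__ == "__main__":
    # Hypothetical sample values, as they might appear in a UTF-8 database.
    for value in ["naïve", "Ελληνικά", "€"]:
        print(repr(value), "fits in LATIN1:", can_convert(value, "latin-1"))
```

Leaving the data in its original encoding sidesteps this check entirely, which matches the argument that avoiding conversion is about more than speed.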