Discussion: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3


Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
Nanina Tron
Date:

Hi,

I am pretty new to PostgreSQL so I might just be missing something basic here.

My problem is that I cannot import or export some of the tables in my db with pgAdmin4, as it raises the error “ERROR: invalid byte sequence for encoding "UTF8": 0xdf 0x67”. The table was originally created with Excel and imported via pgAdmin3. The strange thing is that it can still be imported and exported with pgAdmin3 but not with pgAdmin4. The db was created with encoding UTF-8, the .csv files were created with encoding UTF-8, and the import/export dialog is also set to UTF-8. Queries on these tables are no problem either, so it seems to me that this could be a client problem.

I am running PostgreSQL 11.1 on a server (I don’t know the OS, maintained with pgAdmin4). Locally I am working on a Windows 7 Professional (Service Pack 1) 64-bit system with pgAdmin4 3.6 and pgAdmin3.

I did not find any hint of the same problem on my Google or archive search, so I would be very grateful for any idea what I am doing wrong here.

Best,

Nanina
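
A note on the bytes in the error message: 0xdf followed by 0x67 cannot be UTF-8, because 0xdf opens a two-byte sequence and 0x67 (the letter g) is not a continuation byte. In Latin-1 and Windows-1252, 0xdf is the German ß, so one plausible reading, and it is only an assumption, is that the file was written in a Windows code page rather than in UTF-8. A minimal Python sketch of that check:

```python
# The two bytes reported by the server in the error message.
raw = bytes([0xDF, 0x67])

# 0xDF announces a two-byte UTF-8 sequence, but 0x67 is not a
# continuation byte (0x80-0xBF), so strict UTF-8 decoding fails.
try:
    raw.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

# Assumption: the data was really written in Latin-1 / Windows-1252,
# where 0xDF is U+00DF (the German sharp s).
as_text = raw.decode("latin-1")

# The same two characters, correctly encoded as UTF-8.
as_utf8 = as_text.encode("utf-8")

print(is_valid_utf8)          # False
print(as_text == "\u00dfg")   # True
print(as_utf8.hex())          # c39f67
```

If the bytes in the file do turn out to be Windows-1252, selecting that encoding in the import dialog, or re-saving the file as real UTF-8, would be the things to try.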

Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
richard coleman
Date:
Nanina,

Welcome to the wonderful world of pgAdmin4.  I have been bitten often by this particular shortcoming in pgAdmin4. :(  My issue seems to stem from the fact that I use ASCII tables as a back end for a .Net Windows application, and perfectly valid Windows (Word/Excel) characters cause pgAdmin4 no end of issues.

My solution (with the help of some fine people on the postgres IRC channel) is to run a couple of functions on my tables/fields to locate and clean out the offending characters.  Of course, if you need those characters, then this won't actually help.  Here they are in the event that they might prove helpful/adaptable to your situation.

Finds what pgAdmin4 considers bad UTF8:
CREATE OR REPLACE FUNCTION live.is_utf8(
text)
    RETURNS boolean
    LANGUAGE 'sql'

    COST 100
    VOLATILE 
AS $BODY$
    select encode(convert_to($1,'SQL_ASCII'),'hex')
           ~ $r$(?x)
                ^(?:(?:[0-7][0-9a-f])
                   |(?:(?:c[2-9a-f]|d[0-9a-f])
                      |e0[ab][0-9a-f]
                      |ed[89][0-9a-f]
                      |(?:(?:e[1-9abcef])
                         |f0[9ab][0-9a-f]
                         |f[1-3][89ab][0-9a-f]
                         |f48[0-9a-f]
                       )[89ab][0-9a-f]
                    )[89ab][0-9a-f]
                 )*$
             $r$;
$BODY$;

ALTER FUNCTION live.is_utf8(text)
    OWNER TO postgres;


Fixes what pgAdmin4 considers to be bad UTF8:
CREATE OR REPLACE FUNCTION live.badutf8(
text)
    RETURNS text
    LANGUAGE 'sql'
    COST 100
    VOLATILE 
AS $BODY$
    select regexp_replace(encode(convert_to($1,'SQL_ASCII'),'hex'),
             $r$(?x)
                 (?:(?:[0-7][0-9a-f])
                   |(?:(?:c[2-9a-f]|d[0-9a-f])
                      |e0[ab][0-9a-f]
                      |ed[89][0-9a-f]
                      |(?:(?:e[1-9abcef])
                         |f0[9ab][0-9a-f]
                         |f[1-3][89ab][0-9a-f]
                         |f48[0-9a-f]
                       )[89ab][0-9a-f]
                    )[89ab][0-9a-f]
                 )*(..)?
             $r$, '-\1-', 'g')
$BODY$;
ALTER FUNCTION live.badutf8(text)
    OWNER TO postgres;
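
For comparison, the same kind of scan can be done outside the database. The sketch below is not a translation of the functions above; it is a hypothetical helper (find_invalid_utf8 is a made-up name) that leans on Python's strict UTF-8 decoder to report the offset and bytes of every invalid sequence in a byte string, for instance the raw contents of an exported file.

```python
def find_invalid_utf8(data: bytes) -> list:
    """Return (offset, offending_bytes) for each invalid UTF-8 sequence."""
    bad = []
    pos = 0
    while pos < len(data):
        try:
            # Strict decoding raises at the first invalid sequence.
            data[pos:].decode("utf-8")
            break  # the remainder is valid UTF-8
        except UnicodeDecodeError as exc:
            # exc.start / exc.end are relative to the slice we decoded.
            start = pos + exc.start
            end = pos + exc.end
            bad.append((start, data[start:end]))
            pos = end  # resume scanning after the bad bytes
    return bad


# Windows-1252 smart quotes (0x93, 0x94) inside otherwise plain ASCII.
print(find_invalid_utf8(b"ok \x93hi\x94"))
# Properly encoded UTF-8 yields no findings.
print(find_invalid_utf8("Fu\u00dfg\u00e4nger".encode("utf-8")))
```

Re-decoding the remainder after each error keeps the sketch short; for large files a streaming decoder would be the better choice.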



Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
Dave Page
Date:
On Mon, Jan 7, 2019 at 9:11 PM richard coleman
<rcoleman.ascentgl@gmail.com> wrote:
>
> Nanina,
>
> Welcome to the wonderful world of pgAdmin4.  I have been bitten often by this particular shortcoming in pgAdmin4. :(  My issue seems to stem from the fact that I use ASCII tables as a back end for a .Net Windows application, and perfectly valid Windows (Word/Excel) characters cause pgAdmin4 no end of issues.

pgAdmin has nothing to do with this. It is simply calling PostgreSQL's
psql utility, and telling it to import or export the file. The
database server is then throwing the error seen.

> My solution (with the help of some fine people on the postgres IRC channel) is to run a couple of functions on my tables/fields to locate and clean out the offending characters.  Of course, if you need those characters, then this won't actually help.  Here they are in the event that they might prove helpful/adaptable to your situation.

The problem with that is that you're trying to fix something that's
basically broken to begin with. From the PostgreSQL docs
(https://www.postgresql.org/docs/current/multibyte.html):

----
The SQL_ASCII setting behaves considerably differently from the other
settings. When the server character set is SQL_ASCII, the server
interprets byte values 0-127 according to the ASCII standard, while
byte values 128-255 are taken as uninterpreted characters. No encoding
conversion will be done when the setting is SQL_ASCII. Thus, this
setting is not so much a declaration that a specific encoding is in
use, as a declaration of ignorance about the encoding. In most cases,
if you are working with any non-ASCII data, it is unwise to use the
SQL_ASCII setting because PostgreSQL will be unable to help you by
converting or validating non-ASCII characters.
----





--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
richard coleman
Date:
Dave, 

I can't speak to Nanina's specific issue, but I believe it's a pgAdmin4-specific problem, at least insofar as SQL_ASCII is concerned.  I say this because I can usually work with the data just fine from the psql prompt, but not through pgAdmin4 (or other PostgreSQL GUIs like DBeaver that rely on a JDBC connection).  .Net/Windows ODBC drivers and the psql command prompt: no problem (nor was pgAdmin3, assuming you don't do too much with it beyond select/update/insert).  pgAdmin4: SELECT, export, etc., BOOM! At least until you clean up the offending bytes.

Just my $0.02.

rik.






Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
Dave Page
Date:
Hi

On Mon, Jan 7, 2019 at 11:30 PM richard coleman
<rcoleman.ascentgl@gmail.com> wrote:
>
> Dave,
>
> I can't speak to Nanina's specific issue, but I believe it's a pgAdmin4-specific problem, at least insofar as SQL_ASCII is concerned.  I say this because I can usually work with the data just fine from the psql prompt, but not through pgAdmin4 (or other PostgreSQL GUIs like DBeaver that rely on a JDBC connection).  .Net/Windows ODBC drivers and the psql command prompt: no problem (nor was pgAdmin3, assuming you don't do too much with it beyond select/update/insert).  pgAdmin4: SELECT, export, etc., BOOM! At least until you clean up the offending bytes.
>
> Just my $0.02.

I'm afraid the fundamental problem is that you're using PostgreSQL in
a way that the docs specifically recommend against doing, and you're
seeing the reason why.

pgAdmin 3 and 4 are completely different. In the import/export utility
that Nanina reported the issue in, pgAdmin doesn't look at the data *at
all*. It simply executes \copy in psql, which does all the work. All
pgAdmin does is provide connection info and options to psql, based on
the selections made in the import/export dialogue, and execute it.

In other areas of pgAdmin, like the query tool, it is possible to see
similar issues with the same underlying cause, though we've spent a
significant amount of time trying to work around all the possible edge
cases.

pgAdmin 3 implemented import/export itself, using underlying libraries
that were far less strict about encoding rules than Python is. That
may have been more convenient for this particular issue, but it's a
lot worse in many others.

As a general thought (and do bear in mind, we've spent significant
time and resources on these issues in the past), I'd far rather spend
time on new features and actual bugs, than further time on workarounds
for things the PostgreSQL docs specifically advise against doing.
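
To make the division of labour described above concrete, here is a rough sketch of what "shelling out to psql" can look like. This is not pgAdmin's actual code; build_copy_command and its parameters are hypothetical names used only for illustration. It assembles the argument list for a psql call that runs a single \copy meta-command and never inspects the file's bytes, which is the point being made: by this account, any encoding error is raised by the server when it validates the incoming data.

```python
import shlex


def build_copy_command(table, csv_path, encoding="UTF8",
                       host="localhost", port=5432,
                       user="postgres", dbname="postgres"):
    """Return (without running it) a psql argument list that imports a CSV.

    All names and defaults here are illustrative assumptions.
    """
    # \copy reads the file on the client side and streams it to the server;
    # the ENCODING option states which encoding the file is written in.
    copy_cmd = (
        f"\\copy {table} FROM '{csv_path}' "
        f"WITH (FORMAT csv, HEADER, ENCODING '{encoding}')"
    )
    return [
        "psql",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--dbname", dbname,
        "--command", copy_cmd,
    ]


# Show the command line that would be executed (nothing is run here).
print(shlex.join(build_copy_command("public.my_table", "data.csv")))
```

Running the printed command by hand against the same file is a quick way to see whether the error appears without pgAdmin involved at all.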



Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
richard coleman
Date:
Dave, 

Thanks for taking the time to respond, but I don't see anywhere that using SQL_ASCII is recommended against. Here's the documentation listing the supported encodings: https://www.postgresql.org/docs/current/multibyte.html

The only caveats listed for SQL_ASCII are:
In most cases, if you are working with any non-ASCII data, it is unwise to use the SQL_ASCII setting because PostgreSQL will be unable to help you by converting or validating non-ASCII characters.

Or, a reminder that postgreSQL can't help with any conversions you might want to do.  

Then there's this:
PostgreSQL will allow superusers to create databases with SQL_ASCII encoding even when LC_CTYPE is not C or POSIX. As noted above, SQL_ASCII does not enforce that the data stored in the database has any particular encoding, and so this choice poses risks of locale-dependent misbehavior. Using this combination of settings is deprecated and may someday be forbidden altogether.  

A note that you can currently choose incompatible settings, but probably can't in the future.

And finally there's this bit of advice:
If the client character set is defined as SQL_ASCII, encoding conversion is disabled, regardless of the server's character set. Just as for the server, use of SQL_ASCII is unwise unless you are working with all-ASCII data[emphasis mine].  

Which is just a reiteration of the first caveat, that if you are using SQL_ASCII the database won't perform any conversions on your behalf.

That is hardly a recommendation against using that supported encoding.  The fact that the psql command prompt, among others, works with it without issue is an indication that the problem lies in pgAdmin4 (and, I would guess, in Python's reliance on UTF8) rather than in the database itself.  pgAdmin4 needs to check for, and more gracefully handle, valid PostgreSQL data that happens not to be UTF8 compliant.

Until then, I will have to periodically scan and clean for bad UTF8 data to keep pgAdmin4 (and other JDBC dependent code) happy.  The legacy enterprise .Net applications that depend on it prohibit converting it to UTF8 (or anything else for that matter).

Just my $0.02,

rik. 
 


Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
Dave Page
Date:
Hi

On Tue, Jan 8, 2019 at 12:47 AM richard coleman
<rcoleman.ascentgl@gmail.com> wrote:
>
> Dave,
>
> Thanks for taking the time to respond, but I don't see anywhere that using SQL_ASCII is recommended against. Here's the documentation listing the supported encodings: https://www.postgresql.org/docs/current/multibyte.html .
>
> The only caveats listed for SQL_ASCII are:
>>
>> In most cases, if you are working with any non-ASCII data, it is unwise to use the SQL_ASCII setting because PostgreSQL will be unable to help you by converting or validating non-ASCII characters.

You highlighted it below: "If the client character set is defined as
SQL_ASCII, encoding conversion is disabled, regardless of the server's
character set. Just as for the server, use of SQL_ASCII is unwise
unless you are working with all-ASCII data"

You're using UTF-8 data, not ASCII, which it says is unwise because
conversion won't take place (and consequently, neither will
validation). I don't see how one could read that and not take it as

You are running into exactly that problem, and it's visible when
working with technologies that are strict about following encoding
rules - in this case, psql when pgAdmin shells out to it.

I did think of one possible quick fix this morning which I'll look
into, but as I noted before, it's a workaround, and the real problem
is storing un-validated UTF-8 data in a SQL_ASCII database.






Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
richard coleman
Date:
Dave, 

Thanks for continuing this discussion, but I think you misunderstand the situation.  I am storing valid non-UTF8 data in a SQL_ASCII encoded postgreSQL database (please re-read what I had previously written). This is why psql has NO problem dealing with it.  This is also why Windows ODBC and .Net applications have NO problem dealing with it.  In fact the most common character that pgAdmin4 crashes on is the Windows smart quote.  So to reiterate, I am using valid non-UTF8 characters in a SQL_ASCII database.  This is a supported configuration for postgreSQL.  The issue seems to be that pgAdmin4 is assuming  UTF8 data and crashing/failing/throwing errors when it encounters invalid UTF8 characters.

I hope I have made the situation a little bit clearer.

Thanks again, 

rik. 
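
The disagreement here comes down to who knows what the bytes mean. A SQL_ASCII database hands back high bytes untouched, so a client that assumes UTF-8 fails on them, while a client that knows they are Windows-1252 reads them without trouble. A small Python sketch of that difference, using the smart quotes mentioned above (the sample string is made up, and Windows-1252 as the source encoding is an assumption about the .Net application's data):

```python
# Sample bytes as a Windows application might store them: the text
# "quoted" wrapped in Windows-1252 smart quotes (0x93 and 0x94).
stored = b"\x93quoted\x94"


def decode_strict_utf8(data: bytes):
    """Decode as UTF-8 the way a strict client would; None on failure."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


# A client that assumes UTF-8 cannot decode the value at all.
print(decode_strict_utf8(stored))      # None

# A client that knows the real encoding gets the intended text:
# U+201C and U+201D are the left and right double quotation marks.
text = stored.decode("cp1252")
print(text == "\u201cquoted\u201d")    # True

# Re-encoded as UTF-8, each smart quote becomes three bytes.
print(text.encode("utf-8").hex())      # e2809c71756f746564e2809d
```

Nothing in the stored bytes says which code page they belong to, so the choice of cp1252 has to come from knowledge of the application that wrote them.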


Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
Dave Page
Date:
Hi Rik

On Tue, Jan 8, 2019 at 6:53 PM richard coleman
<rcoleman.ascentgl@gmail.com> wrote:
>
> Dave,
>
> Thanks for continuing this discussion, but I think you misunderstand the situation.  I am storing valid non-UTF8 data in a SQL_ASCII encoded postgreSQL database (please re-read what I had previously written). This is why psql has NO problem dealing with it.  This is also why Windows ODBC and .Net applications have NO problem dealing with it.  In fact the most common character that pgAdmin4 crashes on is the Windows smart quote.  So to reiterate, I am using valid non-UTF8 characters in a SQL_ASCII database.  This is a supported configuration for postgreSQL.  The issue seems to be that pgAdmin4 is assuming UTF8 data and crashing/failing/throwing errors when it encounters invalid UTF8 characters.
>
> I hope I have made the situation a little bit clearer.

Well psql is failing to deal with it *in this case*, as that's what is
doing the \copy in the import/export tool.

In other cases (i.e. the ones where pgAdmin sees the data, such as
results in the query tool), the issue arises because Python and/or
Javascript (and by extension pgAdmin) may barf on data encoded in a
way they don't recognise. That's why the PostgreSQL docs say to only
use ASCII data in SQL_ASCII databases - the behaviour is undefined,
and as a result may either not render properly or may crash or error
on non-ASCII data.
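
To make the strictness described above concrete, here is a minimal Python sketch (an editorial illustration, not code from pgAdmin). It takes the two bytes from the error reported at the top of this thread, 0xdf 0x67, and decodes them under two encodings. The helper name `try_decode` is hypothetical.

```python
# The byte sequence from the reported error message: 0xdf 0x67.
raw = bytes([0xDF, 0x67])


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a short description of the failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"error: {exc.reason}"


# Python's UTF-8 decoder rejects the sequence: 0xdf announces a two-byte
# character, but 0x67 is not a valid continuation byte.
utf8_result = try_decode(raw, "utf-8")

# Read as Windows-1252 the same two bytes are ordinary text.
cp1252_result = try_decode(raw, "cp1252")

print("utf-8 :", utf8_result)
print("cp1252:", cp1252_result)
```

The bytes are only "invalid" relative to the encoding the reader assumes, which is why a tool that assumes UTF-8 fails where a tool that passes bytes through untouched does not.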

Anyhoo, I expect to have a little time after dinner shortly so I'll
try out the workaround I thought of earlier to see if it helps (I
doubt it'll be a panacea, but it may help in some cases).

By any chance do you have a test case you can share with me that
refuses to export from pgAdmin (using the Import/Export tool)? If so,
I'd appreciate a copy of it to play with.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
richard coleman
Date:
Dave, 

I would imagine Nanina would be in a better position to provide you with problematic import/export data in the short term.  I don't tend to import/export that often these days, preferring to use SQL statements for most things short of a full backup/restore (in my case I've found it to be much less picky). As mentioned previously, in my experience the characters that tend to trip up pgAdmin4 are Windows special characters.  I would imagine the upper Windows-1252 character set is particularly problematic for pgAdmin4 if it is expecting proper UTF-8 (i.e. ŒœŠšŸŽžƒˆ˜–—‘’‚“”„†‡•…‰‹›€™).  This would explain why Windows ODBC, .Net, and psql have no problems dealing with the data.  I would imagine that if the database were set up with ENCODING = 'WIN1252', then postgreSQL would do the translation into UTF-8 for pgAdmin4, but since it isn't, postgreSQL can't provide pgAdmin4 with any help.  It's up to pgAdmin4 to deal with the otherwise valid data appropriately.
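
A small Python sketch of why those particular characters cause trouble (an editorial illustration; the helper `is_valid_utf8` is a hypothetical name). It assumes the stored bytes really are Windows-1252, which the thread suggests but does not prove.

```python
# The upper Windows-1252 characters listed in the message above.
upper = "ŒœŠšŸŽžƒˆ˜–—‘’‚“”„†‡•…‰‹›€™"


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# In Windows-1252 each of these characters is a single byte in 0x80-0x9F.
as_cp1252 = upper.encode("cp1252")

# None of those bytes is valid UTF-8 on its own: in UTF-8 the range
# 0x80-0xBF is reserved for continuation bytes.
invalid_alone = [b for b in as_cp1252 if not is_valid_utf8(bytes([b]))]

# Transcoding (decode as cp1252, encode as UTF-8) is the kind of translation
# a correctly declared database encoding allows the server to perform.
as_utf8 = as_cp1252.decode("cp1252").encode("utf-8")

print(len(as_cp1252), "cp1252 bytes;", len(invalid_alone), "invalid as UTF-8")
print(len(as_utf8), "bytes after transcoding to UTF-8")
```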

I hope your workaround pans out, until then I will spend my time at the psql prompt, or if the data is needed elsewhere run the two functions I had included previously to identify and remove the offensive characters.

Here's the create database script for one of my databases, perhaps it can shed some light (it was originally an 8.3 postgreSQL database {long before my time here, currently running under postgreSQL 10.x} and apparently back then it defaulted to creating SQL_ASCII encoded databases on Windows).

CREATE DATABASE tms_production
    WITH 
    OWNER = local_user
    ENCODING = 'SQL_ASCII'
    LC_COLLATE = 'English_United States.1252'
    LC_CTYPE = 'English_United States.1252'
    TABLESPACE = pg_default
    CONNECTION LIMIT = -1;
ALTER DATABASE tms_production
    SET default_transaction_read_only TO off;
ALTER DATABASE tms_production
    SET client_encoding TO sql_ascii;
ALTER DATABASE tms_production
    SET standard_conforming_strings TO off;  

rik. 


Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
Dave Page
Date:
Hi

On Tue, Jan 8, 2019 at 7:32 PM richard coleman
<rcoleman.ascentgl@gmail.com> wrote:
>
> Dave,
>
> I would imagine Nanina would be in a better position to provide you with problematic import/export data in the short term. I don't tend to import/export that often these days, preferring to use SQL statements for most things short of a full backup/restore (in my case I've found it to be much less picky). As mentioned previously, in my experience the characters that tend to trip up pgAdmin4 are Windows special characters.  I would imagine the upper Windows-1252 character set is particularly problematic for pgAdmin4 if it is expecting proper UTF-8 (i.e. ŒœŠšŸŽžƒˆ˜–—‘’‚“”„†‡•…‰‹›€™). This would explain why Windows ODBC, .Net, and psql have no problems dealing with the data. I would imagine that if the database were set up with ENCODING = 'WIN1252', then postgreSQL would do the translation into UTF-8 for pgAdmin4, but since it isn't, postgreSQL can't provide pgAdmin4 with any help.

Right - and that's kinda the point. PostgreSQL is a database that is
designed to enforce integrity rules on your data, whether those be
around encoding, or table constraints, strong typing, foreign keys
etc. Those strengths are amongst the reasons most of us chose it in
the first place, rather than one of the NoSQL databases that are
usually much more forgiving in many respects.

> It's up to pgAdmin4 to deal with the otherwise valid data appropriately.

How can it? If there is no encoding specified (because you're using
SQL_ASCII, with values > 127) the behaviour is undefined by
definition. Any attempts to deal with such data will be hit and miss
because there is no possible way for pgAdmin to know how the data is
supposed to be interpreted. You know your data is Win1252 encoded, but
for all pgAdmin knows, it could be Win1253.
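
The ambiguity is easy to demonstrate. In this Python sketch (an editorial illustration, not pgAdmin code) a single byte above 127 decodes without error under both Windows-1252 and Windows-1253 and yields a different character each time, so nothing in the data itself says which reading is intended.

```python
# One byte with a value above 127, as might sit in a SQL_ASCII database.
raw = b"\xe1"

# Both code pages accept the byte, but they disagree about what it means.
decoded = {enc: raw.decode(enc) for enc in ("cp1252", "cp1253")}

for encoding, text in decoded.items():
    print(f"{encoding}: {text!r}")
```

Windows-1252 gives a Latin "á" and Windows-1253 gives a Greek "α". A client can only pick correctly if the encoding is declared somewhere, which is exactly the information SQL_ASCII does not carry.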

The best option is to use the correct encoding for the database, or if
you have data that really doesn't conform to any encoding standard,
use the bytea datatype.

Anyway, I've said my piece. I'll go investigate the workaround in a
moment and report back.

> I hope your workaround pans out, until then I will spend my time at the psql prompt, or if the data is needed elsewhere run the two functions I had included previously to identify and remove the offensive characters.
>
> Here's the create database script for one of my databases, perhaps it can shed some light (it was originally an 8.3 postgreSQL database {long before my time here, currently running under postgreSQL 10.x} and apparently back then it defaulted to creating SQL_ASCII encoded databases on Windows).
>
>> CREATE DATABASE tms_production
>>     WITH
>>     OWNER = local_user
>>     ENCODING = 'SQL_ASCII'
>>     LC_COLLATE = 'English_United States.1252'
>>     LC_CTYPE = 'English_United States.1252'
>>     TABLESPACE = pg_default
>>     CONNECTION LIMIT = -1;
>> ALTER DATABASE tms_production
>>     SET default_transaction_read_only TO off;
>> ALTER DATABASE tms_production
>>     SET client_encoding TO sql_ascii;
>> ALTER DATABASE tms_production
>>     SET standard_conforming_strings TO off;

Thanks!





--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
Doug Easterbrook
Date:
https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html

Smart quotes are not ASCII. They are Unicode, strictly speaking.

We’ve Postgres reject  putting such stuff into the database using anything .



Sent from my iPad

On Jan 8, 2019, at 9:23 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:

Dave, 

Thanks for continuing this discussion, but I think you misunderstand the situation.  I am storing valid non-UTF8 data in a SQL_ASCII encoded postgreSQL database (please re-read what I had previously written). This is why psql has NO problem dealing with it.  This is also why Windows ODBC and .Net applications have NO problem dealing with it.  In fact the most common character that pgAdmin4 crashes on is the Windows smart quote.  So to reiterate, I am using valid non-UTF8 characters in a SQL_ASCII database.  This is a supported configuration for postgreSQL.  The issue seems to be that pgAdmin4 is assuming  UTF8 data and crashing/failing/throwing errors when it encounters invalid UTF8 characters.

I hope I have made the situation a little bit clearer.

Thanks again, 

rik. 

On Tue, Jan 8, 2019 at 12:29 AM Dave Page <dpage@pgadmin.org> wrote:
Hi

On Tue, Jan 8, 2019 at 12:47 AM richard coleman
<rcoleman.ascentgl@gmail.com> wrote:
>
> Dave,
>
> Thanks for taking the time to respond, but I don't see anywhere that SQL_ASCII is recommended against doing. Here's the documentation listing the supported encoding schemas: https://www.postgresql.org/docs/current/multibyte.html .
>
> The only caveats listed for SQL_ASCII are:
>>
>> In most cases, if you are working with any non-ASCII data, it is unwise to use the SQL_ASCII setting because PostgreSQL will be unable to help you by converting or validating non-ASCII characters.

You highlighted it below: "If the client character set is defined as
SQL_ASCII, encoding conversion is disabled, regardless of the server's
character set. Just as for the server, use of SQL_ASCII is unwise
unless you are working with all-ASCII data"

You're using UTF-8 data, not ASCII, which it says is unwise because
conversion won't take place (and consequently, neither will
validation). I don't see how one could read that and not take it as

You are running into exactly that problem; and it's visible when
working with technologies that are strict about following encoding
rules - in this case, psql when pgAdmin shells out to it.

I did think of one possible quick fix this morning which I'll look
into, but as I noted before; it's a workaround, and the real problem
is storing un-validated UTF-8 data in a SQL_ASCII database.

> Or, a reminder that postgreSQL can't help with any conversions you might want to do.
>
> Then there's this:
>>
>> PostgreSQL will allow superusers to create databases with SQL_ASCII encoding even when LC_CTYPE is not C or POSIX. As noted above, SQL_ASCII does not enforce that the data stored in the database has any particular encoding, and so this choice poses risks of locale-dependent misbehavior. Using this combination of settings is deprecated and may someday be forbidden altogether.
>
>
> A note that you can currently choose incompatible settings, but probably can't in the future.
>
> And finally there's this bit of advice:
>>
>> If the client character set is defined as SQL_ASCII, encoding conversion is disabled, regardless of the server's character set. Just as for the server, use of SQL_ASCII is unwise unless you are working with all-ASCII data[emphasis mine].
>
>
> Which is just a reiteration of the first caveat, that if you are using SQL_ASCII the database won't perform any conversions on your behalf.
>
> That is hardly a recommendation against using that supported encoding scheme.  The fact that the psql command prompt, among others, works with it without issue, is an indication that the problem lies in pgAdmin4 (and I would guess the reliance of python on UTF8) than an issue with the database itself.  pgAdmin4 needs to check for and more gracefully handle valid postgreSQL data that might happen to be not UTF8 compliant.
>
> Until then, I will have to periodically scan and clean for bad UTF8 data to keep pgAdmin4 (and other JDBC dependent code) happy.  The legacy enterprise .Net applications that depend on it prohibit converting it to UTF8 (or anything else for that matter).
>
> Just my $0.02,
>
> rik.
>
>
> On Mon, Jan 7, 2019 at 1:27 PM Dave Page <dpage@pgadmin.org> wrote:
>>
>> Hi
>>
>> On Mon, Jan 7, 2019 at 11:30 PM richard coleman
>> <rcoleman.ascentgl@gmail.com> wrote:
>> >
>> > Dave,
>> >
>> > I can't speak to Nania's specific issue, but I believe it's a pgAdmin4 specific problem, at least in so far as SQL_ASCII is concerned.  I say this because I can usually work with the data just fine from the psql prompt, but not through pgAdmin4 (or other postgreSQL GUI's like dBeaver that rely on the JDBC connection).  .Net/Windows ODBC drivers and psql command prompt, no problem (as was pgAdmin3 assuming you don't do too much with it beyond select/update/insert).  pgAdmin4, SELECT, export, etc. BOOM! At least until you cleaned  up the offending bytes.
>> >
>> > Just my $0.02.
>>
>> I'm afraid the fundamental problem is that you're using PostgreSQL in
>> a way that the docs specifically recommend against doing, and you're
>> seeing the reason why.
>>
>> pgAdmin 3 and 4 are completely different. In the import/export utility
>> that Nania reported the issue in, pgAdmin doesn't look at the data *at
>> all*. It simply executes \copy in psql, which does all the work. All
>> pgAdmin does is provide connection info and options to psql, based on
>> the selections made in the import/export dialogue, and executes it.
>>
>> In other areas of pgAdmin, like the query tool, it is possible to see
>> similar issues with the same underlying cause, though we've spent a
>> significant amount of time trying to work around all the possible edge
>> cases.
>>
>> pgAdmin 3 implemented import/export itself, using underlying libraries
>> that were far less strict about encoding rules than Python is. That
>> may have been more convenient for this particular issue, but it's a
>> lot worse in many others.
>>
>> As a general thought (and do bear in mind, we've spent significant
>> time and resources on these issues in the past), I'd far rather spend
>> time on new features and actual bugs, than further time on workarounds
>> for things the PostgreSQL docs specifically advise against doing.
>>
>> --
>> Dave Page
>> Blog: http://pgsnake.blogspot.com
>> Twitter: @pgsnake
>>
>> EnterpriseDB UK: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company



--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
richard coleman
Date:
Doug, 

Hi.  Is this a typo?

We’ve Postgres reject  putting such stuff into the database using anything .

PostgreSQL allows any text into a column if the database has been set to use SQL_ASCII as the encoding. This was the default encoding on Windows machines, a long long time ago.  All that setting the encoding to SQL_ASCII means is that postgreSQL won't do any translation between character encodings for you.  We are storing what I believe are Windows-1252 characters in the database.  The immediate problem (at least mine in so far as pgAdmin4 is concerned) is that non-UTF8 characters (like smart quotes) are handled without a problem by psql, .Net, and Windows ODBC (and under pgAdmin3) but cause pgAdmin4 to crash/fail/throw errors.  The only way to get pgAdmin4 to behave is to locate and remove the offending characters.  My guess is that this is due to its python underpinnings and the expectation that the database would correctly translate the data into valid UTF8.  It is my belief that it should handle such valid postgreSQL data better than it currently does.  At least as well as its predecessor pgAdmin3 did.
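
The "locate and remove" approach can be sketched outside the database as well. This Python version is an editorial illustration only; it is not the pair of database functions posted earlier in the thread, and the names `find_invalid_utf8` and `strip_non_ascii` are hypothetical.

```python
from typing import Iterable, List, Tuple


def find_invalid_utf8(values: Iterable[bytes]) -> List[Tuple[int, int]]:
    """Return (row index, byte offset) for each value that is not valid UTF-8."""
    bad = []
    for index, value in enumerate(values):
        try:
            value.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first offending byte.
            bad.append((index, exc.start))
    return bad


def strip_non_ascii(value: bytes) -> bytes:
    """Drop every byte above 127. This is lossy: the characters are gone."""
    return bytes(b for b in value if b < 0x80)


rows = [
    b"plain ascii",
    b"\x93quoted\x94",  # Windows-1252 smart quotes
    "valid utf-8 \u201cquote\u201d".encode("utf-8"),
]

offenders = find_invalid_utf8(rows)
cleaned = [strip_non_ascii(rows[i]) for i, _ in offenders]
print(offenders, cleaned)
```

Stripping keeps strict clients working but discards the smart quotes themselves, so it only suits cases where those characters are not needed.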

Just my $0.02, 

rik.



Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
Doug Easterbrook
Date:
Had a rethink of this. We had SQL_ASCII databases with non-ASCII data in them. When converting that data to UTF-8 and loading it into Postgres, the Windows extended characters could not be used unless they were converted to UTF-8 beforehand.
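
A conversion step of that kind might look like the following Python sketch (an editorial illustration, not the procedure actually used). It assumes that any value which is not already valid UTF-8 is Windows-1252; that assumption has to come from knowledge of the data, since the bytes cannot reveal it. The function name `to_utf8` is hypothetical.

```python
def to_utf8(raw: bytes, fallback: str = "cp1252") -> bytes:
    """Return raw unchanged if it is valid UTF-8, else transcode from fallback."""
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Assumption: anything that is not UTF-8 is in the fallback encoding.
        return raw.decode(fallback).encode("utf-8")


legacy = b"\x93hi\x94"            # smart quotes as Windows-1252 bytes
already = "caf\u00e9".encode("utf-8")

converted = to_utf8(legacy)
untouched = to_utf8(already)
print(converted.decode("utf-8"), untouched.decode("utf-8"))
```

Checking for valid UTF-8 first avoids double-encoding values that were already converted. Bytes that the fallback encoding does not define would still raise an error and need manual attention.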

It may be that your case depends on the character encoding issued by the client, as it overrides the database encoding to some extent.

So you may want to just give it up and go UTF-8 anyway. It sets you up for the future.

Or go get pgAdmin 3 from BigSQL... it will talk to Postgres 10 for sure, and I think it talks to Postgres 11. I use both pgAdmin 3 and 4 since there are some things that each does well.

Sent from my iPad

On Jan 8, 2019, at 5:44 PM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:

Doug, 

Hi.  Is this a typo?

We’ve Postgres reject  putting such stuff into the database using anything .

PostgreSQL allow any text into a column if the database has been set to use SQL_ASCII as the encoding. This was the default encoding on Windows machines, a long long time ago.  All that setting the encoding to SQ:_ASCII means is that postgreSQL won't do any translations between character encoding for you.  We are storing what I believe is Windows-1252 characters in the database.  The immediate problem (at least mine in so far as pgAdmin4 is concerned) is that non-UTF8 characters (like smart quotes) are handled without a problem by psql, .Net, and Windows ODBC (and under pgAdmin3) but cause pgAdmin4 to crash/fail/thow errors.  The only way to get pgAdmin4 to behave is to locate and remove the offending characters.  My guess is that this is due to it's python underpinnings and the expectation that the database would correctly translate the data into valid UTF8.  It is my belief that it should handle such valid postgreSQL data better than it currently does.  At least as well as it's predecessor pgAdmin3 did.

Just my $0.02, 

rik.


On Tue, Jan 8, 2019 at 3:56 PM Doug Easterbrook <doug@artsman.com> wrote:
https://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html

Smart quotes are not ascii.   They are Unicode, strictly speaking .

We’ve Postgres reject  putting such stuff into the database using anything .



Sent from my iPad

On Jan 8, 2019, at 9:23 AM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:

Dave, 

Thanks for continuing this discussion, but I think you misunderstand the situation.  I am storing valid non-UTF8 data in a SQL_ASCII encoded postgreSQL database (please re-read what I had previously written). This is why psql has NO problem dealing with it.  This is also why Windows ODBC and .Net applications have NO problem dealing with it.  In fact the most common character that pgAdmin4 crashes on is the Windows smart quote.  So to reiterate, I am using valid non-UTF8 characters in a SQL_ASCII database.  This is a supported configuration for postgreSQL.  The issue seems to be that pgAdmin4 is assuming  UTF8 data and crashing/failing/throwing errors when it encounters invalid UTF8 characters.

I hope I have made the situation a little bit clearer.

Thanks again, 

rik. 

On Tue, Jan 8, 2019 at 12:29 AM Dave Page <dpage@pgadmin.org> wrote:
Hi

On Tue, Jan 8, 2019 at 12:47 AM richard coleman
<rcoleman.ascentgl@gmail.com> wrote:
>
> Dave,
>
> Thanks for taking the time to respond, but I don't see anywhere that SQL_ASCII is recommended against doing. Here's the documentation listing the supported encoding schemas: https://www.postgresql.org/docs/current/multibyte.html .
>
> The only caveats listed for SQL_ASCII are:
>>
>> In most cases, if you are working with any non-ASCII data, it is unwise to use the SQL_ASCII setting because PostgreSQL will be unable to help you by converting or validating non-ASCII characters.

You highlighted it below: "If the client character set is defined as
SQL_ASCII, encoding conversion is disabled, regardless of the server's
character set. Just as for the server, use of SQL_ASCII is unwise
unless you are working with all-ASCII data"

You're using UTF-8 data, not ASCII, which it says is unwise because
conversion won't take place (and consequently, neither will
validation). I don't see how one could read that and not take it as

You are running into exactly that problem; and it's visible when
working with technologies that are strict about following encoding
rules - in this case, psql when pgAdmin shells out to it.

I did think of one possible quick fix this morning which I'll look
into, but as I noted before; it's a workaround, and the real problem
is storing un-validated UTF-8 data in a SQL_ASCII database.
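
Since a SQL_ASCII database performs no validation, any check has to happen on the client side. Below is a rough Python sketch of such a check, assuming the data is available as raw bytes (for example, read from an exported file). The function name `find_invalid_utf8` is hypothetical and is not part of pgAdmin or psql.

```python
from typing import List, Tuple


def find_invalid_utf8(data: bytes) -> List[Tuple[int, int]]:
    """Return (offset, byte value) for each byte that breaks UTF-8 decoding."""
    bad = []
    pos = 0
    while pos < len(data):
        try:
            # If the remainder decodes cleanly, there is nothing left to report.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as exc:
            offset = pos + exc.start
            bad.append((offset, data[offset]))
            # Skip the offending byte and keep scanning the rest.
            pos = offset + 1
    return bad


# Sample with two Windows-1252 smart quotes and a Latin-1 sharp s.
sample = b"ok \x93hi\x94 Stra\xdfe"
for offset, value in find_invalid_utf8(sample):
    print(f"offset {offset}: 0x{value:02x}")
```

The scan restarts decoding after each bad byte, which is simple but quadratic in the worst case; it is meant for spot checks rather than large dumps.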

> Or, a reminder that postgreSQL can't help with any conversions you might want to do.
>
> Then there's this:
>>
>> PostgreSQL will allow superusers to create databases with SQL_ASCII encoding even when LC_CTYPE is not C or POSIX. As noted above, SQL_ASCII does not enforce that the data stored in the database has any particular encoding, and so this choice poses risks of locale-dependent misbehavior. Using this combination of settings is deprecated and may someday be forbidden altogether.
>
>
> A note that you can currently choose incompatible settings, but probably can't in the future.
>
> And finally there's this bit of advice:
>>
>> If the client character set is defined as SQL_ASCII, encoding conversion is disabled, regardless of the server's character set. Just as for the server, use of SQL_ASCII is unwise unless you are working with all-ASCII data[emphasis mine].
>
>
> Which is just a reiteration of the first caveat, that if you are using SQL_ASCII the database won't perform any conversions on your behalf.
>
> That is hardly a recommendation against using that supported encoding scheme.  The fact that the psql command prompt, among others, works with it without issue is an indication that the problem lies in pgAdmin4 (and, I would guess, in Python's reliance on UTF8) rather than in the database itself.  pgAdmin4 needs to check for and more gracefully handle valid PostgreSQL data that happens not to be UTF8 compliant.
>
> Until then, I will have to periodically scan and clean for bad UTF8 data to keep pgAdmin4 (and other JDBC dependent code) happy.  The legacy enterprise .Net applications that depend on it prohibit converting it to UTF8 (or anything else for that matter).
>
> Just my $0.02,
>
> rik.
>
>
> On Mon, Jan 7, 2019 at 1:27 PM Dave Page <dpage@pgadmin.org> wrote:
>>
>> Hi
>>
>> On Mon, Jan 7, 2019 at 11:30 PM richard coleman
>> <rcoleman.ascentgl@gmail.com> wrote:
>> >
>> > Dave,
>> >
>> > I can't speak to Nania's specific issue, but I believe it's a pgAdmin4-specific problem, at least insofar as SQL_ASCII is concerned.  I say this because I can usually work with the data just fine from the psql prompt, but not through pgAdmin4 (or other PostgreSQL GUIs like DBeaver that rely on the JDBC connection).  .Net/Windows ODBC drivers and the psql command prompt: no problem (the same went for pgAdmin3, assuming you didn't do too much with it beyond select/update/insert).  pgAdmin4: SELECT, export, etc. BOOM! At least until you clean up the offending bytes.
>> >
>> > Just my $0.02.
>>
>> I'm afraid the fundamental problem is that you're using PostgreSQL in
>> a way that the docs specifically recommend against doing, and you're
>> seeing the reason why.
>>
>> pgAdmin 3 and 4 are completely different. In the import/export utility
>> that Nania reported the issue in, pgAdmin doesn't look at the data *at
>> all*. It simply executes \copy in psql, which does all the work. All
>> pgAdmin does is provide connection info and options to psql, based on
>> the selections made in the import/export dialogue, and executes it.
>>
>> In other areas of pgAdmin, like the query tool, it is possible to see
>> similar issues with the same underlying cause, though we've spent a
>> significant amount of time trying to work around all the possible edge
>> cases.
>>
>> pgAdmin 3 implemented import/export itself, using underlying libraries
>> that were far less strict about encoding rules than Python is. That
>> may have been more convenient for this particular issue, but it's a
>> lot worse in many others.
>>
>> As a general thought (and do bear in mind, we've spent significant
>> time and resources on these issues in the past), I'd far rather spend
>> time on new features and actual bugs, than further time on workarounds
>> for things the PostgreSQL docs specifically advise against doing.
>>
>> --
>> Dave Page
>> Blog: http://pgsnake.blogspot.com
>> Twitter: @pgsnake
>>
>> EnterpriseDB UK: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company



--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Import/Export failure due to UTF-8 error in pgAdmin4 but not in pgAdmin3

From
Linus Hicks
Date:
I'm not a PostgreSQL expert, but especially since Python plays a role here, the system language must be taken into account. If you want no conversion to be done on your "ASCII" data, then you will probably have to jump through special hoops.

On Tue, Jan 8, 2019, 5:19 PM Doug Easterbrook <doug@artsman.com> wrote:
Had a rethink of this.  We had SQL_ASCII databases with non-ASCII data in them.  When converting that data to UTF8 and sticking it into Postgres, all those Windows extended characters could not be used unless converted to UTF8 beforehand.

It may be that your case depends on the character encoding issued by the client, as it overrides the database encoding to some extent.

So, you may want to just give it up and go UTF8 anyway.  It sets you up for the future.
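
For anyone going the UTF8 route, the "convert beforehand" step could look roughly like the Python sketch below. It assumes the legacy bytes really are Windows-1252, which is a guess that has to be confirmed for your own data; the function name `to_utf8` is made up for this example.

```python
def to_utf8(data: bytes, assumed_encoding: str = "cp1252") -> bytes:
    """Return data as UTF-8 bytes, re-encoding only when it is not valid UTF-8."""
    try:
        # Already valid UTF-8 (plain ASCII included): leave it untouched.
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        # Assumption: anything else is Windows-1252. Python's cp1252 codec
        # raises UnicodeDecodeError on the few bytes that code page leaves
        # undefined (for example 0x81), so those would still need manual review.
        return data.decode(assumed_encoding).encode("utf-8")


print(to_utf8(b"\x93hi\x94"))   # smart quotes become three-byte UTF-8 sequences
print(to_utf8(b"plain ascii"))  # unchanged
```

Checking for valid UTF-8 first avoids double-encoding rows that were already converted, at the cost of misreading the rare Windows-1252 string that happens to also be valid UTF-8.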

Or go get pgAdmin 3 from BigSQL; it will talk to Postgres 10 for sure and I think it talks to Postgres 11.  I use both pgAdmin 3 and 4 since there are some things that both do well.

Sent from my iPad

On Jan 8, 2019, at 5:44 PM, richard coleman <rcoleman.ascentgl@gmail.com> wrote:

Doug, 

Hi.  Is this a typo?

We’ve Postgres reject  putting such stuff into the database using anything .

PostgreSQL allows any text into a column if the database has been set to use SQL_ASCII as the encoding. This was the default encoding on Windows machines, a long, long time ago.  All that setting the encoding to SQL_ASCII means is that PostgreSQL won't do any translation between character encodings for you.  We are storing what I believe are Windows-1252 characters in the database.  The immediate problem (at least mine, insofar as pgAdmin4 is concerned) is that non-UTF8 characters (like smart quotes) are handled without a problem by psql, .Net, and Windows ODBC (and under pgAdmin3) but cause pgAdmin4 to crash/fail/throw errors.  The only way to get pgAdmin4 to behave is to locate and remove the offending characters.  My guess is that this is due to its Python underpinnings and the expectation that the database would correctly translate the data into valid UTF8.  It is my belief that it should handle such valid PostgreSQL data better than it currently does, at least as well as its predecessor pgAdmin3 did.

Just my $0.02, 

rik.

