UTF8 conversion revisited
From | Geoffrey Myers |
---|---|
Subject | UTF8 conversion revisited |
Date | |
Msg-id | 4D922641.2030102@serioustechnology.com |
List | pgsql-general |
So, we are still having an issue with this, and I thought I'd throw it out to the list to see if I'm missing something.

Basically, we have identified the tables/fields we need to convert. I'm running the following Perl code against the fields and re-inserting the 'fixed' data into the field:

    $data =~ s/(.)/((ord($1) >= 0) && (ord($1) <= 8)) || (ord($1) == 11) || ((ord($1) >= 13) && (ord($1) <= 31)) || ((ord($1) >= 127)) ?"": $1/egs;

This appears to be working, as a large number of records are cleaned. The problem is that somehow it's not fixing data that contains the hex value 0xbd: when I attempt to dump this database and create a new one with the UTF8 encoding, I get the following error:

    pg_restore: [archiver (db)] Error while PROCESSING TOC:
    pg_restore: [archiver (db)] Error from TOC entry 5246; 0 4978675 TABLE DATA cust postgres
    pg_restore: [archiver (db)] COPY failed: ERROR: invalid byte sequence for encoding "UTF8": 0xbd

As I see it, the Perl code above should catch this '0xbd' character, but somehow it is finding its way through. Any insights would be greatly appreciated.

-- 
Until later, Geoffrey

"I predict future happiness for America if they can prevent the government from wasting the labors of the people under the pretense of taking care of them." - Thomas Jefferson
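For anyone wanting to reproduce the check outside the database, the filter described above can be restated as a small sketch. It is written in Python rather than the Perl of the original post, purely for illustration, and it assumes the field values are available as raw bytes. The function names `strip_control_and_high_bytes` and `first_invalid_utf8_offset` are hypothetical and not part of any library. The first function keeps the same byte ranges as the substitution in the post (tab, line feed, form feed, and printable ASCII 32 to 126); the second reports where a value stops being valid UTF-8, which may help locate rows that still carry a byte such as 0xbd.

```python
from typing import Optional


def strip_control_and_high_bytes(data: bytes) -> bytes:
    """Drop bytes 0-8, 11, 13-31 and 127-255, mirroring the ranges in the post.

    What survives: tab (9), line feed (10), form feed (12), and 32..126.
    """
    return bytes(b for b in data if b in (9, 10, 12) or 32 <= b <= 126)


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that breaks UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


# Made-up sample value containing a stray 0xbd byte and a NUL byte.
sample = b"caf\xbd 1\x002\ttab"

print(first_invalid_utf8_offset(sample))   # offset of the 0xbd byte
cleaned = strip_control_and_high_bytes(sample)
print(cleaned)
print(first_invalid_utf8_offset(cleaned))  # None once only ASCII remains
```

If a value that was supposedly cleaned still reports an offset, one possibility (among others) is that it never passed through the filter at all, for example because it lives in a table or column that was not on the conversion list.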