Re: encoding confusion
From | Richard Huxton |
---|---|
Subject | Re: encoding confusion |
Date | |
Msg-id | 484E9A2A.40705@archonet.com |
In reply to | encoding confusion (Sim Zacks <sim@compulab.co.il>) |
Responses | Re: encoding confusion |
List | pgsql-general |
Sim Zacks wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> [BACKGROUND]
> I am testing dbmail for our corporate email solution.
>
> We originally tested it on mysql and now we are migrating it to postgresql.
>
> The messages are stored in a longblob field on mysql and a bytea field
> in postgresql.
>
> I set the database up as UTF-8

Not relevant if you're using bytea. Encoding is for character-based types (varchar, text) not byte-based types.

[snip]

> When I used pygresql's escape_bytea function to copy the data, it went
> smoothly, but the data was corrupt.
> When I tried the escape_string function it died because the data it was
> moving was not UTF-8.

Your Python script seems to think it's dealing with text rather than a stream of bytes too. I'm not a Python programmer, but I'd guess it's treating one of the database fields (either MySQL or PostgreSQL) as text not bytes. You'll need to check the docs for binary-data handling in your Python libraries.

I'm puzzled as to how the data was corrupted with escape_bytea() - I can't imagine it would be that difficult for the library to get right. I'd be suspicious that the source data wasn't what I thought it was.

> [CONFUSION]
> What I don't understand, is that if that database can't handle the non
> UTF characters, how does it allow them in when I receive an email
> (tested, it works) and when I restored the backup.
> I also don't understand why the data transfer didn't work to a UTF
> database, but it worked to an ASCII database, if the data can go into a
> UTF database from an ascii database.

Whatever is going on, it's nothing to do with the bytea type.

> Lastly, I wanted to know if anybody has experience moving data from
> mysql to postgresql and if what I did is considered normal, or if there
> is a better way of doing this.

I think that something in the process is trying to be clever and treating the blobs as text.

--
  Richard Huxton
  Archonet Ltd
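For anyone who wants to see the text-versus-bytes distinction concretely, here is a minimal sketch using only the Python standard library. It does not touch MySQL, PostgreSQL or PyGreSQL; the sample blob and the helper names `is_valid_utf8` and `digest` are made up for illustration. It shows that a message containing a non-UTF-8 byte cannot be handled as text, and that comparing checksums is one way to check whether a copy preserved the bytes.

```python
import hashlib

# Hypothetical message body: ASCII plus one Latin-1 byte (0xE9),
# which is not a valid UTF-8 sequence on its own.
blob = b"Subject: caf\xe9\r\n\r\nbody"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8 text."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def digest(data: bytes) -> str:
    """Checksum used to compare a blob before and after a copy."""
    return hashlib.sha256(data).hexdigest()


# Treating the blob as UTF-8 text fails; a text-oriented code path
# would hit this kind of error on such data.
text_ok = is_valid_utf8(blob)

# Stand-in for a copy that keeps the data as bytes the whole way.
copied = bytes(bytearray(blob))
same = digest(copied) == digest(blob)

# Stand-in for a copy that passed through a lossy text conversion:
# the invalid byte is replaced, so the checksum no longer matches.
mangled = blob.decode("utf-8", errors="replace").encode("utf-8")
corrupt = digest(mangled) != digest(blob)

print(text_ok, same, corrupt)
```

Running the same kind of checksum over a blob as read from the source database and again after writing it to the destination would show at which step the bytes change, assuming both libraries hand the data back as raw bytes.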