Re: ENCODING (Unicode)
From | Jean-Michel POURE |
---|---|
Subject | Re: ENCODING (Unicode) |
Date | |
Msg-id | 200305210950.52815.jm.poure@freesurf.fr |
In reply to | Re: ENCODING (Unicode) (Reshat Sabiq <sabiq@purdue.edu>) |
List | pgadmin-support |
On Wednesday 21 May 2003 09:10, Reshat Sabiq wrote:
> Given that i can insert and retrieve Unicode values into either ASCII-based
> or Unicode-based DB, is Unicode-based DB less efficient? I remember reading
> something about it a while ago. I don't see immediately why that would be
> the case though, because special characters are 2 bytes either way,
> assuming we are not simplifying Unicode characters into ASCII.

Dear Reshat,

In Unicode (UTF-8), characters are encoded on 1 byte (US-English letters), 2 bytes (Western and Eastern European languages) and 3 bytes (most other languages, including Asian and Indian languages). Characters outside the Basic Multilingual Plane take 4 bytes.

Technically, you can store UTF-8 values in an ASCII-based database. However, storing UTF-8 in an ASCII database is not recommended, for several reasons:

- the query parser might not work well with text values (because it will not know whether one UTF-8 letter is made of 1, 2 or 3 bytes).
- server-side languages are multi-byte safe only when the database encoding is UTF-8. If you calculate the length of a UTF-8 string in PL/pgSQL in an ASCII database, the result will probably be wrong for special characters.

So, the answer is:

1) If you need to search and display multilingual text, you need a UTF-8 database. You will be able to combine all languages in a single database: Arabic, Polish, Japanese, etc. But be aware that you will also need a full UTF-8 chain behind the database. Not all web servers are UTF-8 compliant. Your web pages will also need to be saved in UTF-8. Take PHP, for example: you will need to enable the mbstring extension at compile time. The recommended way is to design your pages under GNU/Linux, as it supports UTF-8 encoding very well.

2) If you need to search and display English or Western languages only, an ASCII-based database is enough.

Stay tuned. The team will soon test pgAdmin3 UTF-8 compliance. So far, I have been able to browse UTF-8 data in pgAdmin3.

Cheers,
Jean-Michel
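
To illustrate the byte widths and the string-length pitfall discussed in the message, here is a minimal Python sketch. Python is not used anywhere in this thread; the sample characters and the helper name `utf8_width` are assumptions chosen purely for illustration. Counting bytes only approximates what a database unaware of multi-byte encodings would report, so treat it as a sketch and not as a description of PostgreSQL internals.

```python
# Illustrative sample characters (hypothetical, not taken from the thread).
# Escapes are used so the code points are unambiguous.
SAMPLES = {
    "a": "US-English letter",
    "\u00e9": "Western European letter (e with acute accent)",
    "\u0142": "Polish letter (l with stroke)",
    "\u0639": "Arabic letter (ain)",
    "\u65e5": "Japanese character (sun/day)",
}


def utf8_width(ch: str) -> int:
    """Return the number of bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


for ch, label in SAMPLES.items():
    print(f"{label}: {utf8_width(ch)} byte(s)")

# A Polish city name: 4 characters, 3 of which need 2 bytes each.
text = "\u0141\u00f3d\u017a"

# Length counted in characters, as a multi-byte-aware system would see it.
char_length = len(text)

# Length counted in bytes, as a system treating the data as single-byte
# characters would see it.
byte_length = len(text.encode("utf-8"))

print(f"characters: {char_length}, bytes: {byte_length}")
```

The gap between the two lengths is the kind of discrepancy to expect when string functions run against UTF-8 data that the server believes is single-byte text.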