Bytea network traffic: binary vs text result format
| От | Miha D. Puc |
|---|---|
| Тема | Bytea network traffic: binary vs text result format |
| Дата | |
| Msg-id | 466299A7.20403@eba.si обсуждение исходный текст |
| Ответы |
Re: Bytea network traffic: binary vs text result format
Re: Bytea network traffic: binary vs text result format |
| Список | pgsql-interfaces |
Hi! There was some debate recently about using text or binary format. There's people who would like to use it but have trouble converting binary encoded results into native types and there's people that say there's not much performance difference. I'd like to stress that performance is very different over slow network. The biggest difference is for byte where the text format performance is about 3.5 times worse at inserts and updates and about 2.9 times worse at selects . Here's the reasoning: In text format bytea are escaped using PQescapeBytea. In an average binary stream about 2/3 would be escaped. Each escaped byte becomes of form \\ooo at upload and of form \ooo for download, so the size of the escaped stream is 1/3 + 2/3 * 5 = 11/3 = 3.6 and 1/3 + 2/3 * 4 = 3 respectively. Here are the results of my test. I inserted and selected an OpenOffice document of size 2Mb over a 2M/512K cable. text format: insert: 120.1s select: 24.9s binary format: insert: 33.5s select: 8.6s factor: insert: 3.6 select: 2.9 The difference between the test and the above calculation comes from the estimate that 2/3 of bytes are escaped where in fact 95 out of 256 are escaped (63%). So there is a need (people asking) and reason (performance) to use binary format. But there's a huge drawback - the conversions. It's easy for varchar, not too bad for basic types (int, float, bool), effort is needed for timestamp, date, time and numeric is a pain. So with all the above there should be a utility for conversion between binary format and native types and/or string format in libpq. Regards, Miha Puc
В списке pgsql-interfaces по дате отправления: