Re: Unicode database + JDBC driver performance
From | Barry Lind |
---|---|
Subject | Re: Unicode database + JDBC driver performance |
Date | |
Msg-id | 3E074D5C.2050608@xythos.com |
In reply to | Unicode database + JDBC driver performance (Jan Ploski <jpljpl@gmx.de>) |
List | pgsql-general |
Jan,

You say you are using 7.2.1, is that for both server and jdbc driver?
There is a performance patch in the 7.3 driver that bypasses the built in
java routines for converting to/from utf8 with a custom one. The built in
java routines are very slow on some jdks (although on jdk1.4 they are
pretty good). Can you try the 7.3 drivers?

thanks,
--Barry

Jan Ploski wrote:
> Hello,
>
> I have some questions regarding PostgreSQL handling of Unicode databases
> and their performance. I am using version 7.2.1 and running two benchmarks
> against a database set up with LATIN1 encoding and the same database
> with UNICODE. The database consists of a single "test" table:
>
>  Column |  Type   | Modifiers
> --------+---------+-----------
>  id     | integer | not null
>  txt    | text    | not null
> Primary key: test_pkey
>
> The client is written in Java, it relies on the official JDBC driver,
> and is being run on the same machine as the database.
>
> Benchmark 1:
>
> Insert 10,000 rows (in 10 transactions, 1000 rows per transaction)
> into table "test". Each row contains 674 characters, most of which
> are ASCII.
>
> Benchmark 2:
>
> select * from test, repeated 10 times in a loop
>
>
> I am measuring the disk space taken by the database in each case
> (LATIN1 vs UNICODE) and the time it takes to run the benchmarks.
> I don't understand the results:
>
> Disk space change (after inserts and vacuumdb -f):
>     LATIN1    UNICODE
>     764K      640K
>
> I would rather assume that the Unicode database takes more space,
> even 2 times as more.. Apparently not (and that's nice).
>
> Avg. Benchmark execution times (obtained with the 'time' command, repeatedly):
> Benchmark 1:
>     LATIN1    UNICODE
>     11.5s     14.5s
>
> Benchmark 2:
>     LATIN1    UNICODE
>     4.7s      8.6s
>
> The Unicode database is slower both on INSERTs and especially on
> SELECTs. I am wondering why. Since Java uses Unicode internally,
> shouldn't it actually be more efficient to store/retrieve character
> data in that format, with no recoding? Maybe it is an issue with the
> JDBC driver? Or is handling Unicode inherently much slower on the
> backend side?
>
> Take care -
> JPL
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly
>
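
For anyone who wants to look at the conversion cost in isolation, here is a minimal sketch of the client-side step discussed above. It is not the driver's code and it does not reproduce Jan's benchmark: the class name Utf8Sketch, the sample row contents and the iteration count are all made up for illustration, and it assumes Java 7 or later for StandardCharsets. It builds a 674-character, mostly-ASCII string, compares its encoded size in LATIN1 and UTF-8, and times the JDK's built-in UTF-8 round trip. Java strings are UTF-16 in memory, so talking to a UNICODE (UTF-8) database still means recoding on every send and receive.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the PostgreSQL JDBC driver.
class Utf8Sketch {

    // Build a 674-character string that is mostly ASCII, mirroring the row
    // size in the benchmark description (the actual contents are an assumption).
    static String sampleRow() {
        StringBuilder sb = new StringBuilder();
        while (sb.length() < 670) {
            sb.append("abcdefghij");
        }
        sb.append("\u00e4\u00f6\u00fc\u00df"); // four non-ASCII Latin-1 characters
        return sb.toString();
    }

    // Bytes needed to encode the string as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed to encode the string as ISO-8859-1 (LATIN1).
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    // Time n encode/decode round trips through the JDK's built-in UTF-8
    // routines; returns elapsed nanoseconds.
    static long timeRoundTrips(String s, int n) {
        long decodedChars = 0;
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
            decodedChars += new String(bytes, StandardCharsets.UTF_8).length();
        }
        long elapsed = System.nanoTime() - start;
        // Sanity check that also keeps the loop from being optimized away.
        if (decodedChars != (long) n * s.length()) {
            throw new IllegalStateException("UTF-8 round trip changed the string length");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        String row = sampleRow();
        System.out.println("characters:            " + row.length());
        System.out.println("LATIN1 bytes:          " + latin1Length(row));
        System.out.println("UTF-8 bytes:           " + utf8Length(row));
        System.out.println("UTF-16 bytes (in JVM): " + row.length() * 2);

        int iterations = 100000; // arbitrary; large enough to get past JIT warm-up
        timeRoundTrips(row, iterations); // warm-up pass, result discarded
        long nanos = timeRoundTrips(row, iterations);
        System.out.println("avg UTF-8 round trip:  " + (nanos / iterations) + " ns");
    }
}
```

The size figures show why a UNICODE database need not be anywhere near twice as large for this kind of data: UTF-8 stores each ASCII character in one byte, so only the few non-ASCII characters cost extra. The timing figure depends heavily on the JDK in use, which matches Barry's remark that the built-in routines are slow on some JDKs, so it is only meaningful when compared across JDK versions on the same machine.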