Re: Bug in UTF8-Validation Code?
From | Andrew Dunstan |
---|---|
Subject | Re: Bug in UTF8-Validation Code? |
Date | |
Msg-id | 45FC7513.8040206@dunslane.net |
In reply to | Re: Bug in UTF8-Validation Code? (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>
>> Here are some timing tests in 1m rows of random utf8 encoded 100 char
>> data. It doesn't look to me like the saving you're suggesting is worth
>> the trouble.
>
> Hmm ... not sure I believe your numbers. Using a test file of 1m lines
> of 100 random latin1 characters converted to utf8 (thus, about half and
> half 7-bit ASCII and 2-byte utf8 characters), I get this in SQL_ASCII
> encoding:
>
> regression=# \timing
> Timing is on.
> regression=# create temp table test(f1 text);
> CREATE TABLE
> Time: 5.047 ms
> regression=# copy test from '/home/tgl/zzz1m';
> COPY 1000000
> Time: 4337.089 ms
>
> and this in UTF8 encoding:
>
> utf8=# \timing
> Timing is on.
> utf8=# create temp table test(f1 text);
> CREATE TABLE
> Time: 5.108 ms
> utf8=# copy test from '/home/tgl/zzz1m';
> COPY 1000000
> Time: 7776.583 ms
>
> The numbers aren't super repeatable, but it sure looks to me like the
> encoding check adds at least 50% to the runtime in this example; so
> doing it twice seems unpleasant.
>
> (This is CVS HEAD, compiled without assert checking, on an x86_64
> Fedora Core 6 box.)

Are you comparing apples with apples? The db is utf8 in both of my cases.

cheers

andrew