Re: Inefficient bytea escaping?
От | Andreas Pflug |
---|---|
Тема | Re: Inefficient bytea escaping? |
Дата | |
Msg-id | 4475F95C.8080307@pse-consulting.de обсуждение исходный текст |
Ответ на | Re: Inefficient bytea escaping? (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Inefficient bytea escaping?
|
Список | pgsql-hackers |
Tom Lane wrote: > Andreas Pflug <pgadmin@pse-consulting.de> writes: > >>When dumping the table with psql \copy (non-binary), the resulting file >>would be 6.6GB of size, taking about 5.5 minutes. Using psql \copy WITH >>BINARY (modified psql as posted to -patches), the time was cut down to >>21-22 seconds (filesize 1.4GB as expected), which is near the physical >>throughput of the target disk. If server based COPY to file is used, The >>same factor 12 can be observed, CPU is up to 100 % (single P4 3GHz 2MB >>Cache HT disabled, 1GB main mem). > > > This is with an 8.0.x server, right? I've tested both 8.0.5 and 8.1.4, no difference observed. > Testing a similar case with CVS HEAD, I see about a 5x speed difference, > which is right in line with the difference in the physical amount of > data written. That's what I would have expected, apparently the data is near worst case. (I was testing a case where all the bytes were emitted as > '\nnn', so it's the worst case.) oprofile says the time is being spent > in CopyAttributeOutText() and fwrite(). So I don't think there's > anything to be optimized here, as far as bytea goes: its binary > representation is just inherently a lot smaller. Unfortunately, binary isn't the cure for all, since copying normal data with binary option might bloat that by factor two or so. I wish there was a third option that's fine for both kinds of data. That's not only a question of dump file sizes, but also of network throughput (an online compression in the line protocol would be desirable for this). > Looking at CopySendData, I wonder whether any traction could be gained > by trying not to call fwrite() once per character. I'm not sure how > much per-call overhead there is in that function. We've done a lot of > work trying to optimize the COPY IN path since 8.0, but nothing much > on COPY OUT ... Hm, I'll see whether I can manage to check CVS head too, and see what's happening, not a production alternative though. Regards, Andreas
В списке pgsql-hackers по дате отправления: