Re: Compression and on-disk sorting
From: Jim C. Nasby
Subject: Re: Compression and on-disk sorting
Date:
Msg-id: 20060526163107.GD59464@pervasive.com
In reply to: Re: Compression and on-disk sorting ("Jim C. Nasby" <jnasby@pervasive.com>)
Replies: Re: Compression and on-disk sorting
List: pgsql-hackers
I've done some more testing with Tom's recently committed changes to tuplesort.c, which remove the tuple headers from the sort data. It does about 10% better than compression alone does. What's interesting is that the gain is about 10% regardless of compression, which means compression isn't helping very much with all the redundant header data, which I find very odd. And the header data is very redundant:

bench=# select xmin,xmax,cmin,cmax,aid from accounts order by aid limit 1;
  xmin  | xmax | cmin | cmax | aid
--------+------+------+------+-----
 280779 |    0 |    0 |    0 |   1
(1 row)

bench=# select xmin,xmax,cmin,cmax,aid from accounts order by aid desc limit 1;
  xmin  | xmax | cmin | cmax |    aid
--------+------+------+------+-----------
 310778 |    0 |    0 |    0 | 300000000
(1 row)

That makes sense, since pgbench loads the database via a string of COPY commands, each of which loads 10000 rows.

Something else worth mentioning: sort performance is worse with larger work_mem in all cases except the old HEAD, prior to the tuplesort.c changes. It looks like whatever was done to fix that will need to be adjusted or rethought pending the outcome of using compression.

In any case, compression certainly seems to be a clear win, at least in this case. If there's interest, I can test this on some larger hardware, or if someone wants to produce a patch for pgbench that will load some kind of real data into accounts.filler, I can test that as well.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
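PS: As a stopgap before anyone writes a real pgbench patch, something along these lines after a pgbench -i run would at least put non-constant data in accounts.filler. This is only a sketch; it assumes semi-random hex text is "real enough", and since md5() output is far less compressible than typical real-world strings, it would probably understate the win from compression:

    -- fill the char(84) filler column with 64 chars of pseudo-random hex
    -- (the column gets space-padded to 84); md5() is in core since 7.4
    UPDATE accounts SET filler = md5(random()::text) || md5(random()::text);
    -- the full-table UPDATE doubles the heap, so reclaim the dead space
    VACUUM FULL accounts;

On a 300M-row table you'd really want the filler generated during the initial COPY rather than in a follow-up UPDATE, but for a one-off test this avoids touching pgbench itself.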