Tuplesort merge pre-reading
От | Heikki Linnakangas |
---|---|
Тема | Tuplesort merge pre-reading |
Дата | |
Msg-id | f298f77a-cf06-e70c-d5a4-a20b472b4627@iki.fi обсуждение исходный текст |
Ответы |
Re: Tuplesort merge pre-reading
Re: [HACKERS] Tuplesort merge pre-reading |
Список | pgsql-hackers |
While reviewing Peter's latest round of sorting patches, and trying to understand the new "batch allocation" mechanism, I started to wonder how useful the pre-reading in the merge stage is in the first place. I'm talking about the code that reads a bunch of from each tape, loading them into the memtuples array. That code was added by Tom Lane, back in 1999: commit cf627ab41ab9f6038a29ddd04dd0ff0ccdca714e Author: Tom Lane <tgl@sss.pgh.pa.us> Date: Sat Oct 30 17:27:15 1999 +0000 Further performance improvements in sorting: reduce number of comparisons during initial run formation by keeping both current run and next-run tuples in the same heap (yup, Knuth is smarter than I am). And, during merge passes, make use of available sort memory to load multiple tuples from any one input 'tape' at a time, thereby improving locality of access to the temp file. So apparently there was a benefit back then, but is it still worthwhile? The LogicalTape buffers one block at a time, anyway, how much gain are we getting from parsing the tuples into SortTuple format in batches? I wrote a quick patch to test that, attached. It seems to improve performance, at least in this small test case: create table lotsofints(i integer); insert into lotsofints select random() * 1000000000.0 from generate_series(1, 10000000); vacuum freeze; select count(*) FROM (select * from lotsofints order by i) t; On my laptop, with default work_mem=4MB, that select takes 7.8 s on unpatched master, and 6.2 s with the attached patch. So, at least in some cases, the pre-loading hurts. I think we should get rid of it. This patch probably needs some polishing: I replaced the batch allocations with a simpler scheme with a buffer to hold just a single tuple for each tape, and that might need some more work to allow downsizing those buffers if you have a few very large tuples in an otherwise narrow table. And perhaps we should free and reallocate a smaller memtuples array for the merging, now that we're not making use of the whole of it. And perhaps we should teach LogicalTape to use larger buffers, if we can't rely on the OS to do the buffering for us. But overall, this seems to make the code both simpler and faster. Am I missing something? - Heikki
Вложения
В списке pgsql-hackers по дате отправления: