Re: Parallel tuplesort (for parallel B-Tree index creation)
From | Claudio Freire
Subject | Re: Parallel tuplesort (for parallel B-Tree index creation)
Date |
Msg-id | CAGTBQpYZVdfvu_xO_Jwe0Jz8EiSsJAWVa8Qks05GNbKidw8yVw@mail.gmail.com
In reply to | Re: Parallel tuplesort (for parallel B-Tree index creation) (Claudio Freire <klaussfreire@gmail.com>)
Responses |
Re: Parallel tuplesort (for parallel B-Tree index creation)
Re: Parallel tuplesort (for parallel B-Tree index creation)
Re: Parallel tuplesort (for parallel B-Tree index creation)
List | pgsql-hackers
On Thu, Sep 8, 2016 at 2:18 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Thu, Sep 8, 2016 at 2:13 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> On Thu, Sep 8, 2016 at 8:53 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
>>> setup:
>>>
>>> create table lotsofitext(i text, j text, w text, z integer, z2 bigint);
>>> insert into lotsofitext select cast(random() * 1000000000.0 as text)
>>> || 'blablablawiiiiblabla', cast(random() * 1000000000.0 as text) ||
>>> 'blablablawjjjblabla', cast(random() * 1000000000.0 as text) ||
>>> 'blablablawwwabla', random() * 1000000000.0, random() * 1000000000000.0 from
>>> generate_series(1, 10000000);
>>>
>>> timed:
>>>
>>> select count(*) FROM (select * from lotsofitext order by i, j, w, z, z2) t;
>>>
>>> Unpatched Time: 100351.251 ms
>>> Patched Time: 75180.787 ms
>>>
>>> That's like a 25% speedup on random input. As we say over here, rather
>>> badly translated, not a turkey's boogers (meaning "nice!")
>>
>> Cool! What work_mem setting were you using here?
>
> The script iterates over a few variations of string patterns (easy
> comparisons vs hard comparisons), work_mem (4MB, 64MB, 256MB, 1GB,
> 4GB), and table sizes (~350M, ~650M, ~1.5G).
>
> That particular case, I believe, uses work_mem=4MB, easy strings, and the 1.5GB table.

Well, the worst regression I see is under the noise for this test (which seems rather high at 5%, but that's to be expected since these are mostly big queries). Most samples show an improvement, either marginal or significant. The biggest improvements are, naturally, at low work_mem settings.

I don't see a significant slowdown at work_mem settings that should result in just a few tapes being merged, but I didn't instrument the runs to check how many tapes were actually being merged in each case.

Attached are the results in ods, csv and raw formats. I think these are good results.
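For context, the speedup measured above comes from replacing the old "sift up to delete the root, then insert the next tuple" pair of heap operations with a single root-displace that writes the incoming tuple at the root and sifts it down once. A minimal sketch of that idea follows; it uses a plain int min-heap rather than PostgreSQL's SortTuple array, and the names (`heap_root_displace`) are illustrative, not the actual tuplesort code:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Replace the root of a binary min-heap with newval and restore the
 * heap property with a single sift-down pass.  During a merge, this
 * stands in for "pop the smallest tuple, push the next tuple from the
 * same tape", doing roughly half the comparisons of a separate
 * delete-root plus insert.
 */
static void
heap_root_displace(int *heap, size_t n, int newval)
{
    size_t  i = 0;          /* the "hole" left by the popped root */

    assert(n > 0);
    for (;;)
    {
        size_t  child = 2 * i + 1;

        if (child >= n)
            break;
        /* pick the smaller of the two children */
        if (child + 1 < n && heap[child + 1] < heap[child])
            child++;
        /* stop once both children are >= the incoming value */
        if (heap[child] >= newval)
            break;
        heap[i] = heap[child];      /* pull the child up into the hole */
        i = child;
    }
    heap[i] = newval;
}
```

A sift-up deletion followed by an insertion touches the heap twice (up to 2 log n comparisons plus swaps); the displace does one root-to-leaf pass, which is why the gain concentrates at low work_mem, where many more merge steps occur.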
So, to summarize the review:

- Patch seems to follow the coding conventions of the surrounding code
- Applies cleanly on top of 25794e841e5b86a0f90fac7f7f851e5d950e51e2, plus patches 1 and 2
- Builds without warnings
- Passes regression tests
- IMO has sufficient coverage from existing tests (none added)
- Does not introduce any significant performance regression
- Best improvement of 67% (reduction of runtime to 59%)
- Average improvement of 30% (reduction of runtime to 77%)
- Worst regression of 5% (increase of runtime to 105%), which is under the noise for control queries, so not significant
- The performance improvement is highly desirable in this merge step, as it's a big bottleneck in parallel sort (and, it seems, in regular sort as well)
- All testing was done on random input; presorted input *will* show more pronounced improvements

I suggested changing a few asserts in tuplesort_heap_root_displace to make the debug code stricter in checking its assumptions, but they're not blockers:

+ Assert(state->memtupcount > 1 || imin == 0);
+ memtuples[imin] = *newtup;

into

+ Assert(imin < state->memtupcount);
+ memtuples[imin] = *newtup;

And, perhaps as well,

+ Assert(memtuples[0].tupindex == newtup->tupindex);
+
+ CHECK_FOR_INTERRUPTS();

into

+ Assert(state->memtupcount > 0 && memtuples[0].tupindex == newtup->tupindex);
+
+ CHECK_FOR_INTERRUPTS();

It was suggested that both tuplesort_heap_siftup and tuplesort_heap_root_displace could be wrappers around a common "siftup" implementation, since the underlying operation is very similar. While it is true that doing so would make it impossible to keep the asserts about tupindex in tuplesort_heap_root_displace, I guess it depends on how useful those asserts are (i.e.: how likely it is that those conditions could be violated, and how damaging it could be if they were).
If it is decided that the refactor is desirable, I'd suggest making the common siftup procedure static inline, to allow tuplesort_heap_root_displace to inline and specialize it, since it will be called with checkIndex = false and that simplifies the resulting code considerably.

Peter also mentioned that there were some other changes going on in the surrounding code that could impact this patch, so I'm marking the patch Waiting on Author.

Overall, however, I believe the patch is in good shape. Only minor form issues remain; the functionality seems both desirable and ready.
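The suggested refactor can be sketched as follows. This is a hedged illustration, not PostgreSQL's actual code: the `DemoTuple` struct, function names, and int keys are stand-ins for SortTuple and the real comparators. The point is that a `static inline` common routine with a compile-time-constant `checkIndex` argument lets the compiler fold the branch away and emit a specialized body per caller, while the merge-path wrapper can still keep the tupindex sanity check:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* stand-in for SortTuple: a sort key plus the source-tape index */
typedef struct DemoTuple
{
    int     key;
    int     tupindex;
} DemoTuple;

/*
 * Common sift-down over a min-heap of DemoTuples.  checkIndex is a
 * compile-time constant at every call site, so each wrapper below
 * gets a specialized copy with the branch eliminated.
 */
static inline void
sift_down_common(DemoTuple *heap, size_t n, DemoTuple newtup,
                 bool checkIndex)
{
    size_t  i = 0;

    assert(n > 0);
    if (checkIndex)
    {
        /* merge invariant: the replacement comes from the same
         * tape as the root it displaces */
        assert(heap[0].tupindex == newtup.tupindex);
    }
    for (;;)
    {
        size_t  child = 2 * i + 1;

        if (child >= n)
            break;
        if (child + 1 < n && heap[child + 1].key < heap[child].key)
            child++;
        if (heap[child].key >= newtup.key)
            break;
        heap[i] = heap[child];
        i = child;
    }
    heap[i] = newtup;
}

/* merge path: keeps the tupindex assertion */
static void
demo_root_displace(DemoTuple *heap, size_t n, DemoTuple newtup)
{
    sift_down_common(heap, n, newtup, true);
}

/* non-merge path: the checkIndex branch folds away entirely */
static void
demo_root_displace_nocheck(DemoTuple *heap, size_t n, DemoTuple newtup)
{
    sift_down_common(heap, n, newtup, false);
}
```

With `-O2`, both wrappers compile to straight-line heap code; only the `true` variant retains the assertion in assert-enabled builds, which is one way to keep the tupindex checks even after unifying the implementations.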
Attachments