Re: Tuplesort merge pre-reading
From | Heikki Linnakangas |
---|---|
Subject | Re: Tuplesort merge pre-reading |
Date | |
Msg-id | 0c0b80fc-9dea-c031-ce51-2781edefad4d@iki.fi |
In reply to | Re: Tuplesort merge pre-reading (Peter Geoghegan <pg@heroku.com>) |
Responses | Re: Tuplesort merge pre-reading |
List | pgsql-hackers |
On 09/28/2016 06:05 PM, Peter Geoghegan wrote:
> On Thu, Sep 15, 2016 at 9:51 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> I don't think it makes much difference in practice, because most merge
>> passes use all, or almost all, of the available tapes. BTW, I think the
>> polyphase algorithm prefers to do all the merges that don't use all tapes
>> upfront, so that the last final merge always uses all the tapes. I'm not
>> 100% sure about that, but that's my understanding of the algorithm, and
>> that's what I've seen in my testing.
>
> Not sure that I understand. I agree that each merge pass tends to use
> roughly the same number of tapes, but the distribution of real runs on
> tapes is quite unbalanced in earlier merge passes (due to dummy runs).
> It looks like you're always using batch memory, even for non-final
> merges. Won't that fail to be in balance much of the time because of
> the lopsided distribution of runs? Tapes have an uneven amount of real
> data in earlier merge passes.

How does the distribution of the runs on the tapes matter?

>> +    usedBlocks = 0;
>> +    for (tapenum = 0; tapenum < state->maxTapes; tapenum++)
>> +    {
>> +        int64 numBlocks = blocksPerTape + (tapenum < remainder ? 1 : 0);
>> +
>> +        if (numBlocks > MaxAllocSize / BLCKSZ)
>> +            numBlocks = MaxAllocSize / BLCKSZ;
>> +        LogicalTapeAssignReadBufferSize(state->tapeset, tapenum,
>> +                                        numBlocks * BLCKSZ);
>> +        usedBlocks += numBlocks;
>> +    }
>> +    USEMEM(state, usedBlocks * BLCKSZ);
>
> I'm basically repeating myself here, but: I think it's incorrect that
> LogicalTapeAssignReadBufferSize() is called so indiscriminately (more
> generally, it is questionable that it is called in such a high level
> routine, rather than the start of a specific merge pass -- I said so a
> couple of times already).

You can't release the tape buffer at the end of a pass, because the
buffer of a tape will already be filled with data from the next run on
the same tape.

- Heikki
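
For readers following the quoted hunk, here is a minimal standalone sketch of the arithmetic it performs: an even split of a block budget across tapes, with the first few tapes taking one extra block and each share capped. It is not the patch itself. The function name, the `SKETCH_` constants, and the way `blocks_per_tape` and `remainder` are derived (integer division and modulo of the budget) are assumptions made for illustration, since the quoted fragment does not show where `blocksPerTape` and `remainder` come from. The constants mirror PostgreSQL's default `BLCKSZ` (8192) and `MaxAllocSize` (1 GB minus 1 byte).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL constants named in the quoted patch. */
#define SKETCH_BLCKSZ        8192                   /* default block size */
#define SKETCH_MAXALLOCSIZE  ((size_t) 0x3fffffff)  /* 1 GB - 1 byte */

/*
 * Split avail_blocks across ntapes (hypothetical helper, not a PostgreSQL
 * function). The first (avail_blocks % ntapes) tapes get one extra block,
 * and each tape's share is capped at SKETCH_MAXALLOCSIZE / SKETCH_BLCKSZ
 * blocks. Per-tape block counts are written to out[]; the return value is
 * the total number of blocks actually handed out.
 */
int64_t
distribute_read_buffers(int64_t avail_blocks, int ntapes, int64_t *out)
{
    const int64_t cap = (int64_t) (SKETCH_MAXALLOCSIZE / SKETCH_BLCKSZ);
    int64_t     blocks_per_tape;
    int64_t     remainder;
    int64_t     used = 0;

    assert(ntapes > 0 && avail_blocks >= 0);

    /* Assumed derivation: even split plus leftover blocks. */
    blocks_per_tape = avail_blocks / ntapes;
    remainder = avail_blocks % ntapes;

    for (int tapenum = 0; tapenum < ntapes; tapenum++)
    {
        int64_t     num_blocks = blocks_per_tape + (tapenum < remainder ? 1 : 0);

        if (num_blocks > cap)
            num_blocks = cap;
        out[tapenum] = num_blocks;
        used += num_blocks;
    }
    return used;
}
```

Note that the split depends only on the budget and the number of tapes, not on how many real runs each tape holds, so in this sketch a tape's share is the same whether it carries many runs or mostly dummy runs.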