Re: Tuplesort merge pre-reading
From | Heikki Linnakangas |
---|---|
Subject | Re: Tuplesort merge pre-reading |
Date | |
Msg-id | de211a24-edda-d8b3-567e-a1610eb721c6@iki.fi |
In reply to | Re: Tuplesort merge pre-reading (Peter Geoghegan <pg@heroku.com>) |
List | pgsql-hackers |
On 09/28/2016 07:11 PM, Peter Geoghegan wrote:
> On Wed, Sep 28, 2016 at 5:04 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>> Not sure that I understand. I agree that each merge pass tends to use
>>> roughly the same number of tapes, but the distribution of real runs on
>>> tapes is quite unbalanced in earlier merge passes (due to dummy runs).
>>> It looks like you're always using batch memory, even for non-final
>>> merges. Won't that fail to be in balance much of the time because of
>>> the lopsided distribution of runs? Tapes have an uneven amount of real
>>> data in earlier merge passes.
>>
>>
>> How does the distribution of the runs on the tapes matter?
>
> The exact details are not really relevant to this discussion (I think
> it's confusing that we simply say "Target Fibonacci run counts",
> FWIW), but the simple fact that it can be quite uneven is.

Well, I claim that the fact that the distribution of runs is uneven,
does not matter. Can you explain why you think it does?

> This is why I never pursued batch memory for non-final merges. Isn't
> that what you're doing here? You're pretty much always setting
> "state->batchUsed = true".

Yep. As the patch stands, we wouldn't really need batchUsed, as we know
that it's always true when merging, and false otherwise. But I kept it,
as it seems like that might not always be true - we might use batch
memory when building the initial runs, for example - and because it
seems nice to have an explicit flag for it, for readability and
debugging purposes.

>>> I'm basically repeating myself here, but: I think it's incorrect that
>>> LogicalTapeAssignReadBufferSize() is called so indiscriminately (more
>>> generally, it is questionable that it is called in such a high level
>>> routine, rather than the start of a specific merge pass -- I said so a
>>> couple of times already).
>>
>>
>> You can't release the tape buffer at the end of a pass, because the buffer
>> of a tape will already be filled with data from the next run on the same
>> tape.
>
> Okay, but can't you just not use batch memory for non-final merges,
> per my initial approach? That seems far cleaner.

Why? I don't see why the final merge should behave differently from the
non-final ones.

- Heikki
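
Editorial illustration (not part of the original message): the two observations made in the thread, that run counts per tape are uneven while each merge pass still reads from about the same number of tapes, can be seen in a toy model of a polyphase merge. This is only a sketch under simplifying assumptions. It is not code from tuplesort.c, the function name `polyphase_passes` is hypothetical, and it assumes a perfect Fibonacci starting distribution with no dummy runs.

```python
def polyphase_passes(runs):
    """Simulate an idealized polyphase merge (toy model, not tuplesort.c).

    `runs` lists the number of runs on each tape; exactly one tape must be
    empty (the output tape). Returns the run distribution after each pass.
    """
    runs = list(runs)
    history = [tuple(runs)]
    while sum(runs) > 1:
        out = runs.index(0)  # the empty tape receives the merged runs
        inputs = [i for i in range(len(runs)) if i != out]
        # Merge until the input tape with the fewest runs is exhausted.
        k = min(runs[i] for i in inputs)
        if k == 0:
            raise ValueError("not a perfect polyphase distribution")
        for i in inputs:
            runs[i] -= k
        runs[out] += k
        history.append(tuple(runs))
    return history


# Three tapes, 13 initial runs split 8/5 (a Fibonacci distribution).
history = polyphase_passes([8, 5, 0])
```

In this model the run counts differ from tape to tape in nearly every pass, yet every pass reads from exactly two input tapes. Whether that unevenness should affect how merge memory is divided among tapes is the question under debate above, and the sketch does not settle it.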