Re: Question about maxTapes & selectnewtape & dumptuples
От | Heikki Linnakangas |
---|---|
Тема | Re: Question about maxTapes & selectnewtape & dumptuples |
Дата | |
Msg-id | 72fe610e-ecfb-48d4-b966-083c5bd20c9e@iki.fi обсуждение исходный текст |
Ответ на | Question about maxTapes & selectnewtape & dumptuples (Andy Fan <zhihuifan1213@163.com>) |
Ответы |
Re: Question about maxTapes & selectnewtape & dumptuples
|
Список | pgsql-hackers |
On 30/06/2024 12:48, Andy Fan wrote: > merge sorts requires all the tuples in each input are pre-sorted (1), > and in tuplesort.c, when the workmem is full, we dumptuples into > destTape. (2). it looks to me that we need a unlimited number of Tapes > if the number of input tuples is big enough. However we set a maxTapes > in 'inittapes' and we may use a pre-used tapes for storing new tuples in > 'selectnewtape'. Wouldn't this break the prerequisite (1)? > > selectnewtape(Tuplesortstate *state) > { > if (state->nOutputTapes < state->maxTapes) > {..} > else > { > /* > * We have reached the max number of tapes. Append to an existing > * tape. > */ > state->destTape = state->outputTapes[state->nOutputRuns % state->nOutputTapes]; > state->nOutputRuns++; > } > } > > > for example, at the first use of outputTapes[x], it stores (1, 3, 5, 7), > and later (2, 4, 6, 8) are put into it. so the overall of (1, 3, 5, 7, > 2, 4, 6, 8) are not sorted? Where did I go wrong? There's a distinction between "runs" and "tapes". Each "run" is sorted, but there can be multiple runs on a tape. In that case, multiple merge passes are needed. Note the call to markrunend() between the runs. In your example, the tape would look like (1, 3, 5, 7, <end marker>, 2, 4, 6, 8). -- Heikki Linnakangas Neon (https://neon.tech)
В списке pgsql-hackers по дате отправления: