On 2017-Nov-21, Peter Geoghegan wrote:
> On Mon, Oct 2, 2017 at 6:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > Progress reporting on sorts seems like a tricky problem to me, as I
> > said before. In most cases, a sort is going to involve an initial
> > stage where it reads all the input tuples and writes out quicksorted
> > runs, and then a merge phase where it merges all the output tapes into
> > a sorted result. There are some complexities; for example, if the
> > number of tapes is really large, then we might need multiple merge
> > phases, only the last of which will produce tuples.
>
> This would ordinarily be the point at which I'd say "but you're very
> unlikely to require multiple passes for an external sort these days".
> But I won't say that on this thread, because CLUSTER generally has
> unusually wide tuples, and so is much more likely to be I/O bound, to
> require multiple passes, etc. (I bet the v10 enhancements
> disproportionately improved CLUSTER performance.)

When the seqscan-and-sort strategy is used, we feed tuplesort with every
tuple from the scan. Once that's completed, we call `performsort`, then
retrieve tuples.

If we look at this in terms of tapes and merges, we can report how many
of each we have completed so far. As far as I understand, we write one
tape to completion, and only then start another one, right? Since
there's no way to know how many tapes/merges are needed in total, it's
not possible to compute a percentage of completion. That seems okay --
we're just telling the user that progress is being made, and we only
report facts, not theory. Perhaps we can (also?) indicate disk I/O
utilization, in terms of the number of blocks written by tuplesort.

I suppose that in order to have tuplesort.c report progress, we would
have to have some kind of API that tuplesort would invoke internally to
indicate events such as "tape started/completed", "merge started/completed".

One idea is to use a callback system; each tuplesort caller could
optionally pass a callback to the "begin" function, for progress
reporting purposes. Initially only cluster.c would use it, but I
suppose eventually every tuplesort caller would want that.
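
To make the callback idea a bit more concrete, here is a rough standalone
sketch in C. Nothing in it is existing tuplesort.c code: the event enum,
the callback signature, the `SketchSortState` struct and all of the
`sketch_sort_*` functions are hypothetical names invented for
illustration, and the "sort" merely counts blocks rather than sorting
anything. It only shows the shape of the thing: an optional callback
passed at "begin" time, invoked on tape and merge events, with the caller
accumulating counts rather than a percentage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds; these do not exist in tuplesort.c. */
typedef enum SortProgressEvent
{
    SORT_EVENT_TAPE_STARTED,
    SORT_EVENT_TAPE_COMPLETED,
    SORT_EVENT_MERGE_STARTED,
    SORT_EVENT_MERGE_COMPLETED
} SortProgressEvent;

/* Hypothetical callback type: the event, blocks written so far, caller data. */
typedef void (*SortProgressCallback) (SortProgressEvent event,
                                      long blocks_written,
                                      void *arg);

/* Stand-in for the sort state; only the progress-related fields. */
typedef struct SketchSortState
{
    SortProgressCallback progress_cb;   /* NULL means "no reporting" */
    void       *progress_arg;           /* passed back to the callback */
    long        blocks_written;         /* running total of blocks written */
} SketchSortState;

/* The "begin" function takes the optional callback. */
static void
sketch_sort_begin(SketchSortState *state, SortProgressCallback cb, void *arg)
{
    assert(state != NULL);
    state->progress_cb = cb;
    state->progress_arg = arg;
    state->blocks_written = 0;
}

/* Internal helper: report an event only if the caller asked for reports. */
static void
sketch_sort_report(SketchSortState *state, SortProgressEvent event)
{
    if (state->progress_cb != NULL)
        state->progress_cb(event, state->blocks_written, state->progress_arg);
}

/* Pretend to write one run to a tape, nblocks long. */
static void
sketch_sort_write_run(SketchSortState *state, long nblocks)
{
    sketch_sort_report(state, SORT_EVENT_TAPE_STARTED);
    state->blocks_written += nblocks;
    sketch_sort_report(state, SORT_EVENT_TAPE_COMPLETED);
}

/* Pretend to run one merge pass that writes nblocks. */
static void
sketch_sort_merge_pass(SketchSortState *state, long nblocks)
{
    sketch_sort_report(state, SORT_EVENT_MERGE_STARTED);
    state->blocks_written += nblocks;
    sketch_sort_report(state, SORT_EVENT_MERGE_COMPLETED);
}

/* Caller side (think cluster.c): counts of completed work, no percentage. */
typedef struct ClusterProgress
{
    int         tapes_completed;
    int         merges_completed;
    long        blocks_written;
} ClusterProgress;

static void
cluster_progress_cb(SortProgressEvent event, long blocks_written, void *arg)
{
    ClusterProgress *progress = (ClusterProgress *) arg;

    if (event == SORT_EVENT_TAPE_COMPLETED)
        progress->tapes_completed++;
    else if (event == SORT_EVENT_MERGE_COMPLETED)
        progress->merges_completed++;
    progress->blocks_written = blocks_written;
}
```

A caller that wants reports would pass `cluster_progress_cb` and a
`ClusterProgress` to the begin function; a caller that doesn't care
passes NULL and pays only for the NULL check. Whether the real thing
should be a callback or direct calls into the progress-reporting
machinery is a separate question that this sketch doesn't try to answer.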

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services