Re: [HACKERS] CLUSTER command progress monitor
From | Tatsuro Yamada |
---|---|
Subject | Re: [HACKERS] CLUSTER command progress monitor |
Date | |
Msg-id | 59B7D119.2000101@lab.ntt.co.jp |
In reply to | Re: [HACKERS] CLUSTER command progress monitor (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [HACKERS] CLUSTER command progress monitor
Re: [HACKERS] CLUSTER command progress monitor |
List | pgsql-hackers |
On 2017/09/11 23:38, Robert Haas wrote:
> On Sun, Sep 10, 2017 at 10:36 PM, Tatsuro Yamada
> <yamada.tatsuro@lab.ntt.co.jp> wrote:
>> Thanks for the comment.
>>
>> As you know, CLUSTER command uses SEQ SCAN or INDEX SCAN as a scan method by
>> cost estimation. In the case of SEQ SCAN, these two phases not overlap.
>> However, in INDEX SCAN, it overlaps. Therefore I created the phase of "scan
>> heap and write new heap" when INDEX SCAN was selected.
>>
>> I agree that progress reporting for sort is difficult. So it only reports
>> the phase ("sorting tuples") in the current design of progress monitor of
>> cluster.
>> It doesn't report counter of sort.
>
> Doesn't that make it almost useless? I would guess that scanning the
> heap and writing the new heap would ordinarily account for most of the
> runtime, or at least enough that you're going to want something more
> than just knowing that's the phase you're in.

Hmm, should I add a counter in tuplesort.c (in tuplesort_performsort())?
I know that an external merge sort takes more time than a quicksort.
I'll try investigating how to get a counter from the external merge sort
processing. Is this the right way?

>>> The patch is getting the value reported as heap_tuples_total from
>>> OldHeap->rd_rel->reltuples. I think this is pointless: the user can
>>> see that value anyway if they wish. The point of the progress
>>> counters is to expose things the user couldn't otherwise see. It's
>>> also not necessarily accurate: it's only an estimate in the best case,
>>> and may be way off if the relation has recently be extended by a large
>>> amount. I think it's pretty important that we try hard to only report
>>> values that are known to be accurate, because users hate (and mock)
>>> inaccurate progress reports.
>>
>> Do you mean to use the number of rows by using below calculation instead
>> OldHeap->rd_rel->reltuples?
>>
>> estimate rows = physical table size / average row length
>
> No, I mean don't report it at all. The caller can do that calculation
> if they wish, without any help from the progress reporting machinery.

I see. I'll remove that column in the next patch.

Regards,
Tatsuro Yamada
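
For reference, the caller-side calculation quoted above (estimate rows = physical table size / average row length) could look roughly like the following. This is only a sketch under stated assumptions: the function name `estimate_rows` and the sample numbers are hypothetical and are not taken from the patch, and the result ignores page headers, free space, and dead tuples, so it is a rough estimate at best.

```python
def estimate_rows(table_size_bytes: int, avg_row_bytes: float) -> int:
    """Rough row-count estimate: physical table size / average row length.

    Hypothetical helper for illustration; not part of PostgreSQL or the patch.
    """
    if avg_row_bytes <= 0:
        raise ValueError("avg_row_bytes must be positive")
    # Floor division: a partial row at the end does not count as a row.
    return int(table_size_bytes // avg_row_bytes)


if __name__ == "__main__":
    # Illustrative numbers only: 8192 pages of 8192 bytes, about 64 bytes per row.
    table_size_bytes = 8192 * 8192
    print(estimate_rows(table_size_bytes, 64))
```

The inputs would have to come from wherever the caller already gets table statistics; as noted in the thread, nothing here needs help from the progress reporting machinery.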