Re: [HACKERS] CLUSTER command progress monitor

From	Tatsuro Yamada
Subject	Re: [HACKERS] CLUSTER command progress monitor
Date	12 September 2017, 15:20:41
Msg-id	59B7D119.2000101@lab.ntt.co.jp
In reply to	Re: [HACKERS] CLUSTER command progress monitor (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: [HACKERS] CLUSTER command progress monitor; Re: [HACKERS] CLUSTER command progress monitor
List	pgsql-hackers

On 2017/09/11 23:38, Robert Haas wrote:
> On Sun, Sep 10, 2017 at 10:36 PM, Tatsuro Yamada
> <yamada.tatsuro@lab.ntt.co.jp> wrote:
>> Thanks for the comment.
>>
>> As you know, the CLUSTER command chooses SEQ SCAN or INDEX SCAN as its scan
>> method based on cost estimation. With SEQ SCAN, these two phases do not
>> overlap. With INDEX SCAN, however, they do. Therefore I created the "scan
>> heap and write new heap" phase for the case where INDEX SCAN is selected.
>>
>> I agree that progress reporting for sort is difficult. So the current design
>> of the progress monitor for CLUSTER only reports the phase ("sorting
>> tuples"). It doesn't report a counter for the sort.
>
> Doesn't that make it almost useless?  I would guess that scanning the
> heap and writing the new heap would ordinarily account for most of the
> runtime, or at least enough that you're going to want something more
> than just knowing that's the phase you're in.

Hmm, should I add a counter in tuplesort.c (in tuplesort_performsort())?
I know that an external merge sort takes more time than a quicksort.
I'll try investigating how to get a counter from the external merge sort processing.
Is this the right approach?
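
To make the idea of a merge-phase counter concrete, here is a toy sketch in
Python. It is not PostgreSQL's tuplesort code and does not use any of its
APIs; the function name merge_runs_with_progress, the report callback, and
the every parameter are all hypothetical. It assumes the sorted runs already
exist, so the total is an exact count rather than an estimate.

```python
import heapq
from typing import Callable, Iterator, List


def merge_runs_with_progress(
    runs: List[List[int]],
    report: Callable[[int, int], None],
    every: int = 1,
) -> Iterator[int]:
    """Merge pre-sorted runs, calling report(tuples_merged, tuples_total).

    Hypothetical illustration only: a real external sort would read runs
    from tapes/files instead of in-memory lists.
    """
    # The total is exact because every run has already been written out.
    total = sum(len(run) for run in runs)
    merged = 0
    for item in heapq.merge(*runs):
        merged += 1
        # Report every `every` tuples, and always on the last tuple.
        if merged % every == 0 or merged == total:
            report(merged, total)
        yield item
```

In this sketch the counter only advances during the final merge, so it says
nothing about progress while the initial runs are being built.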


>>> The patch is getting the value reported as heap_tuples_total from
>>> OldHeap->rd_rel->reltuples.  I think this is pointless: the user can
>>> see that value anyway if they wish.  The point of the progress
>>> counters is to expose things the user couldn't otherwise see.  It's
>>> also not necessarily accurate: it's only an estimate in the best case,
>>> and may be way off if the relation has recently been extended by a large
>>> amount.  I think it's pretty important that we try hard to only report
>>> values that are known to be accurate, because users hate (and mock)
>>> inaccurate progress reports.
>>
>> Do you mean to use the number of rows obtained by the calculation below
>> instead of OldHeap->rd_rel->reltuples?
>>
>>   estimate rows = physical table size / average row length
>
> No, I mean don't report it at all.  The caller can do that calculation
> if they wish, without any help from the progress reporting machinery.

I see. I'll remove that column in the next patch.
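
For reference, the caller-side calculation quoted above (physical table size
divided by average row length) could be sketched as follows. This is only a
rough illustration in Python; the function name estimate_rows and its
parameters are hypothetical, and it ignores page headers, fill factor, and
dead tuples, so the result is an estimate at best.

```python
def estimate_rows(table_size_bytes: int, avg_row_len_bytes: float) -> int:
    """Rough row estimate: physical table size / average row length.

    Hypothetical helper: the caller has to supply both inputs, for example
    from the table's on-disk size and a sampled average row width.
    """
    if table_size_bytes < 0:
        raise ValueError("table_size_bytes must be non-negative")
    if avg_row_len_bytes <= 0:
        raise ValueError("avg_row_len_bytes must be positive")
    # Truncate toward zero: a partial row does not count as a row.
    return int(table_size_bytes / avg_row_len_bytes)
```

For example, a table of 1000 pages of 8192 bytes with an average row length
of 64 bytes gives estimate_rows(8192 * 1000, 64), which is 128000 rows.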


Regards,
Tatsuro Yamada



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
