Re: [HACKERS] ANALYZE command progress checker
От | vinayak |
---|---|
Тема | Re: [HACKERS] ANALYZE command progress checker |
Дата | |
Msg-id | fc9354f1-87bb-67c3-9fc2-1fd8c5c6cdf0@lab.ntt.co.jp обсуждение исходный текст |
Ответ на | Re: [HACKERS] ANALYZE command progress checker (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: [HACKERS] ANALYZE command progress checker
(Ashutosh Sharma <ashu.coek88@gmail.com>)
|
Список | pgsql-hackers |
On 2017/03/17 10:38, Robert Haas wrote: > On Fri, Mar 10, 2017 at 2:46 AM, vinayak > <Pokale_Vinayak_q3@lab.ntt.co.jp> wrote: >> Thank you for reviewing the patch. >> >> The attached patch incorporated Michael and Amit comments also. > I reviewed this tonight. Thank you for reviewing the patch. > + /* Report compute index stats phase */ > + pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE, > + > PROGRESS_ANALYZE_PHASE_COMPUTE_INDEX_STATS); > > Hmm, is there really any point to this? And is it even accurate? It > doesn't look to me like we are computing any index statistics here; > we're only allocating a few in-memory data structures here, I think, > which is not a statistics computation and probably doesn't take any > significant time. > > + /* Report compute heap stats phase */ > + pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE, > + PROGRESS_ANALYZE_PHASE_COMPUTE_HEAP_STATS); > > The phase you label as computing heap statistics also includes the > computation of index statistics. I wouldn't bother separating the > computation of heap and index statistics; I'd just have a "compute > statistics" phase that begins right after the comment that starts > "Compute the statistics." Understood. Fixed in the attached patch. > > + /* Report that we are now doing index cleanup */ > + pgstat_progress_update_param(PROGRESS_ANALYZE_PHASE, > + PROGRESS_ANALYZE_PHASE_INDEX_CLEANUP); > > OK, this doesn't make any sense either. We are not cleaning up the > indexes here. We are just closing them and releasing resources. I > don't see why you need this phase at all; it can't last more than some > tiny fraction of a second. It seems like you're trying to copy the > exact same phases that exist for vacuum instead of thinking about what > makes sense for ANALYZE. Understood. I have removed this phase. > > + /* Report number of heap blocks scaned so far*/ > + pgstat_progress_update_param(PROGRESS_ANALYZE_HEAP_BLKS_SCANNED, > targblock); > > While I don't think this is necessarily a bad thing to report, I find > it very surprising that you're not reporting what seems to me like the > most useful thing here - namely, the number of blocks that will be > sampled in total and the number of those that you've selected. > Instead, you're just reporting the length of the relation and the > last-sampled block, which is a less-accurate guide to total progress. > There are enough counters to report both things, so maybe we should. > > + /* Report total number of sample rows*/ > + pgstat_progress_update_param(PROGRESS_ANALYZE_TOTAL_SAMPLE_ROWS, numrows); > > As an alternative to the previous paragraph, yet another thing you > could report is number of rows sampled out of total rows to be > sampled. But this isn't the way to do it: you are reporting the > number of rows you sampled only once you've finished sampling. The > point of progress reporting is to report progress -- that is, to > report this value as it's updated, not just to report it when ANALYZE > is practically done and the value has reached its maximum. Understood. I have reported number of rows sampled out of total rows to be sampled. I have not reported the number of blocks that will be sampled in total. So currect pg_stat_progress_analyze view looks like: =# \d+ pg_stat_progress_analyze View "pg_catalog.pg_stat_progress_analyze" Column | Type | Collation | Nullable | Default | Storage | Descripti on ------------------------+---------+-----------+----------+---------+----------+---------- --- pid | integer | | | | plain | datid | oid | | | | plain | datname | name | | | | plain | relid | oid | | | | plain | phase | text | | | | extended | num_target_sample_rows | bigint | | | | plain | num_rows_sampled | bigint | | | | plain | View definition: SELECT s.pid, s.datid, d.datname, s.relid, CASE s.param1 WHEN 0 THEN 'initializing'::text WHEN 1 THEN 'collecting sample rows'::text WHEN 2 THEN 'computing statistics'::text ELSE NULL::text END AS phase, s.param2 AS num_target_sample_rows, s.param3 AS num_rows_sampled FROM pg_stat_get_progress_info('ANALYZE'::text) s(pid, datid, relid, param1, param2, p aram3, param4, param5, param6, param7, param8, param9, param10) LEFT JOIN pg_database d ON s.datid = d.oid; Comment? > The documentation for this patch isn't very good, either. You forgot > to update the part of the documentation that says that VACUUM is the > only command that currently supports progress reporting, and the > descriptions of the phases and progress counters are brief and not > entirely accurate - e.g. heap_blks_scanned is not actually the number > of heap blocks scanned, because we are sampling, not reading all the > blocks. Understood. I have updated the documentation. I will also try to improve documentation. > When adding calls to the progress reporting functions, please consider > whether a blank line should be added before or after the new code, or > both, or neither. I'd say some blank lines are needed in a few places > where you didn't add them. +1. I have added blank lines in a few places. Regards, Vinayak Pokale -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Heikki LinnakangasДата:
Сообщение: [HACKERS] Re: Authentication tests, and plain 'password' authentication with aSCRAM verifier