On 22/01/2023 18:23, Justin Pryzby wrote:
> pg_stat_progress_analyze was added in v13 (a166d408e).
>
> For tables with inheritance children, do_analyze_rel() and
> acquire_sample_rows() are called twice. The first time through,
> pgstat_progress_start_command() has memset() the progress array to zero.
>
> But the 2nd time, ANALYZE_BLOCKS_DONE is already set from the previous
> call, and BLOCKS_TOTAL can be set to some lower value (and in any case a
> value unrelated to the pre-existing value of BLOCKS_DONE). So the
> progress report briefly shows a bogus combination of values and, with
> these assertions, fails regression tests in master and v13, unless
> BLOCKS_DONE is first zeroed.
Good catch!
I think the counts need do be reset even earlier, in
acquire_inherited_sample_rows(), at the same time that we update
PROGRESS_ANALYZE_CURRENT_CHILD_TABLE_RELID. See attached patch.
Otherwise, there's a brief moment where we have already updated the
child table ID, but the PROGRESS_ANALYZE_BLOCKS_TOTAL
PROGRESS_ANALYZE_BLOCKS_DONE still show the counts from the previous
child table. And if it's a foreign table, the FDW's sampling function
might not update the progress report at all, in which case the old
values will be displayed until the table is fully processed.
I appreciate the assertions you added, that made it easy to reproduce
the problem. I'm inclined to not commit that though. It seems like a
modularity violation for the code in backend_progress.c to have such
intimate knowledge of what the different counters mean.
--
Heikki Linnakangas
Neon (https://neon.tech)