Re: More stable query plans via more predictable column statistics

Поиск
Список
Период
Сортировка
От Shulgin, Oleksandr
Тема Re: More stable query plans via more predictable column statistics
Дата
Msg-id CACACo5RA9D+GZ3AjNLqTLRJtO4_L+Z8vAoiTXD60T2_DjLjePQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: More stable query plans via more predictable column statistics  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: More stable query plans via more predictable column statistics  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Wed, Mar 9, 2016 at 1:33 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
Hi,

On Wed, 2016-03-09 at 11:23 +0100, Shulgin, Oleksandr wrote:
> On Tue, Mar 8, 2016 at 8:16 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
>
>         Also, I can't quite figure out why the "else" now in line 2131
>         is now "else if track_cnt != 0".  What happens if track_cnt is
>         zero?
>         The comment above the "if" block doesn't provide any guidance.
>
>
> It is there only to avoid potentially dividing zero by zero when
> calculating avgcount (which will not be used after that anyway).  I
> agree it deserves a comment.

That definitely deserves a comment. It's not immediately clear why
(track_cnt != 0) would prevent division by zero in the code. The only
way such error could happen is if ndistinct==0, because that's the
actual denominator. Which means this

    ndistinct = ndistinct * sample_cnt

would have to evaluate to 0. But ndistinct==0 can't happen as we're in
the (nonnull_cnt > 0) branch, and that guarantees (standistinct != 0).

Thus the only possibility seems to be (nonnull_cnt==toowide_cnt). Why
not to use this condition instead?

Yes, I now recall that my actual concern was that sample_cnt may calculate to 0 due to the latest condition above, but that also implies track_cnt == 0, and then we have a for loop there which will not run at all due to this, so I figured we can avoid calculating avgcount and running the loop altogether with that check.  I'm not opposed to changing the condition if that makes the code easier to understand (or dropping it altogether if calculating 0/0 is believed to be harmless anyway).

--
Alex

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: More stable query plans via more predictable column statistics
Следующее
От: Tom Lane
Дата:
Сообщение: Re: More stable query plans via more predictable column statistics