Re: More stable query plans via more predictable column statistics

From	Tomas Vondra
Subject	Re: More stable query plans via more predictable column statistics
Date	9 March 2016, 12:34:08
Msg-id	1457526834.24545.53.camel@2ndquadrant.com
In reply to	Re: More stable query plans via more predictable column statistics ("Shulgin, Oleksandr" <oleksandr.shulgin@zalando.de>)
Responses	Re: More stable query plans via more predictable column statistics Re: More stable query plans via more predictable column statistics
List	pgsql-hackers

Hi,

On Wed, 2016-03-09 at 11:23 +0100, Shulgin, Oleksandr wrote:
> On Tue, Mar 8, 2016 at 8:16 PM, Alvaro Herrera
> <alvherre@2ndquadrant.com> wrote:
> > Shulgin, Oleksandr wrote:
> >
> > > Alright.  I'm attaching the latest version of this patch split in two
> > > parts: the first one is NULLs-related bugfix and the second is the
> > > "improvement" part, which applies on top of the first one.
> >
> > I went over patch 0001 and it seems pretty reasonable.  It's missing
> > some comment updates -- at least the large comments that talk about Duj1
> > should be modified to indicate why the code is now subtracting the null
> > count.
>
> Good point.
>
> > Also, I can't quite figure out why the "else" now in line 2131
> > is now "else if track_cnt != 0".  What happens if track_cnt is zero?
> > The comment above the "if" block doesn't provide any guidance.
> 
> 
> It is there only to avoid potentially dividing zero by zero when
> calculating avgcount (which will not be used after that anyway).  I
> agree it deserves a comment.

That definitely deserves a comment. It's not immediately clear why
(track_cnt != 0) would prevent division by zero in the code. The only
way such an error could happen is if ndistinct==0, because that's the
actual denominator. Which means this
   ndistinct = ndistinct * sample_cnt

would have to evaluate to 0. But ndistinct==0 can't happen as we're in
the (nonnull_cnt > 0) branch, and that guarantees (stadistinct != 0).

Thus the only possibility seems to be (nonnull_cnt==toowide_cnt). Why
not use this condition instead?
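
For illustration only, a minimal sketch of a guard written in terms of
that condition. This is not the code in analyze.c: the helper name
compute_avgcount, its signature, and the assumption that sample_cnt is
nonnull_cnt minus toowide_cnt are all hypothetical and would need to be
checked against the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, NOT the code in analyze.c.  Stores the average
 * number of sampled rows per distinct value in *avgcount and returns
 * true, or returns false when the value cannot be computed.
 *
 * Assumption: sample_cnt = nonnull_cnt - toowide_cnt.
 */
bool
compute_avgcount(int nonnull_cnt, int toowide_cnt, double ndistinct,
                 double *avgcount)
{
    int sample_cnt = nonnull_cnt - toowide_cnt;

    assert(avgcount != NULL);

    /* Every non-null value was too wide, so nothing was tracked. */
    if (nonnull_cnt == toowide_cnt)
        return false;

    /* Defensive check on the actual denominator. */
    if (ndistinct <= 0.0)
        return false;

    *avgcount = (double) sample_cnt / ndistinct;
    return true;
}
```

Written this way, the test names the situation it protects against
(all non-null values were too wide) rather than relying on track_cnt
as an indirect signal, and a caller simply skips any use of avgcount
when the function returns false.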

FWIW while looking at the code I noticed that we skip wide varlena
values but not cstrings. Seems a bit suspicious.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
