Re: improving GROUP BY estimation

From Tom Lane
Subject Re: improving GROUP BY estimation
Date
Msg-id 20518.1459451903@sss.pgh.pa.us
In reply to Re: improving GROUP BY estimation  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Responses Re: improving GROUP BY estimation  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: improving GROUP BY estimation  (Dean Rasheed <dean.a.rasheed@gmail.com>)
List pgsql-hackers
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On 30 March 2016 at 14:03, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>> Attached is v4 of the patch

> Thanks, I think this is good to go, except that I think we need to use
> pow() rather than powl() because AIUI powl() is new to C99, and so
> won't necessarily be available on all supported platforms. I don't
> think we need worry about loss of precision, since that would only be
> an issue if rel->rows / rel->tuples were smaller than maybe 10^-14 or
> so, and it seems unlikely we'll get anywhere near that any time soon.

I took a quick look.  I concur with using pow() not powl(); the latter
is not in SUS v2 which is our baseline portability expectation, and in
fact there is *noplace* where we expect long double to work.  Moreover,
I don't believe that any of the estimates we're working with are so
accurate that a double-width power result would be a useful improvement.
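
To illustrate the precision argument, here is a minimal sketch. It assumes that the first argument of pow() has the form 1 - rel->rows / rel->tuples, which this message does not spell out. The helper name base_rounding_error is hypothetical. It measures how much of the selectivity is lost when that base is rounded to a plain double.

```c
#include <float.h>
#include <math.h>

/*
 * Hypothetical helper, not PostgreSQL code.  Returns the relative error
 * introduced into "selectivity" (assumed to stand for rows / tuples, and
 * required to be > 0) when 1 - selectivity is stored in a double.
 */
static double
base_rounding_error(double selectivity)
{
    double base = 1.0 - selectivity;    /* assumed first argument of pow() */
    double recovered = 1.0 - base;      /* the selectivity the double kept */

    return fabs(recovered - selectivity) / selectivity;
}
```

With IEEE 754 doubles, a selectivity of 1e-14 should survive with a relative error under one percent, which is consistent with the threshold quoted above. The error grows as the selectivity approaches DBL_EPSILON.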

Also, I wonder if it'd be a good idea to provide a guard against division
by zero --- we know rel->tuples > 0 at this point, but I'm less sure that
reldistinct can't be zero.  In the same vein, I'm worried about the first
argument of pow() being slightly negative due to roundoff error, leading
to a NaN result.

Maybe we should also consider clamping the final reldistinct estimate to
an integer with clamp_row_est().  The existing code doesn't do that but
it seems like a good idea on general principles.

Another minor gripe is the use of a random URL as justification.  This
code will still be around when that URL exists nowhere but the Wayback
Machine.  Can't we find a more formal citation to use?
        regards, tom lane


