Re: planner support functions: handle GROUP BY estimates ?
От | Justin Pryzby |
---|---|
Тема | Re: planner support functions: handle GROUP BY estimates ? |
Дата | |
Msg-id | 20191226213250.GB12890@telsasoft.com обсуждение исходный текст |
Ответ на | Re: planner support functions: handle GROUP BY estimates ? (Justin Pryzby <pryzby@telsasoft.com>) |
Ответы |
Re: planner support functions: handle GROUP BY estimates ?
|
Список | pgsql-hackers |
On Sun, Dec 22, 2019 at 06:16:48PM -0600, Justin Pryzby wrote: > On Tue, Nov 19, 2019 at 01:34:21PM -0600, Justin Pryzby wrote: > > Tom implemented "Planner support functions": > > https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a391ff3c3d418e404a2c6e4ff0865a107752827b > > https://www.postgresql.org/docs/12/xfunc-optimization.html > > > > I wondered whether there was any consideration to extend that to allow > > providing improved estimates of "group by". That currently requires manually > > by creating an expression index, if the function is IMMUTABLE (which is not > > true for eg. date_trunc of timestamptz). > > I didn't hear back so tried implementing this for date_trunc(). Currently, the > I currently assume that the input data has 1 second granularity: ... > If the input timestamps have (say) hourly granularity, rowcount will be > *underestimated* by 3600x, which is worse than the behavior in master of > overestimating by (for "day") 24x. > > I'm trying to think of ways to address that: In the attached, I handled that by using histogram and variable's initial ndistinct estimate, giving good estimates even for intermediate granularities of input timestamps. |postgres=# DROP TABLE IF EXISTS t; CREATE TABLE t(i) AS SELECT a FROM generate_series(now(), now()+'11 day'::interval, '15minutes')a,generate_series(1,9)b; ANALYZE t; | |postgres=# explain analyze SELECT date_trunc('hour',i) i FROM t GROUP BY 1; | HashAggregate (cost=185.69..188.99 rows=264 width=8) (actual time=42.110..42.317 rows=265 loops=1) | |postgres=# explain analyze SELECT date_trunc('minute',i) i FROM t GROUP BY 1; | HashAggregate (cost=185.69..198.91 rows=1057 width=8) (actual time=41.685..42.264 rows=1057 loops=1) | |postgres=# explain analyze SELECT date_trunc('day',i) i FROM t GROUP BY 1; | HashAggregate (cost=185.69..185.83 rows=11 width=8) (actual time=46.672..46.681 rows=12 loops=1) | |postgres=# explain analyze SELECT date_trunc('second',i) i FROM t GROUP BY 1; | HashAggregate (cost=185.69..198.91 rows=1057 width=8) (actual time=41.816..42.435 rows=1057 loops=1)
Вложения
В списке pgsql-hackers по дате отправления: