Re: proposal : cross-column stats

From	Robert Haas
Subject	Re: proposal : cross-column stats
Date	December 17, 2010, 14:58:12
Msg-id	AANLkTin5h1hLoO_yjxjFiievipYMfL_aVFSRAkYV77_Q@mail.gmail.com
In reply to	Re: proposal : cross-column stats (Tomas Vondra <tv@fuzzy.cz>)
Responses	Re: proposal : cross-column stats
List	pgsql-hackers

On Fri, Dec 17, 2010 at 12:58 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> In the end, all they need to compute an estimate is number of distinct
> values for each of the columns (we already have that in pg_stats) and a
> number of distinct values for the group of columns in a query. They
> really don't need any multidimensional histogram or something like that.

I haven't read the paper yet (sorry) but just off the top of my head,
one possible problem here is that our n_distinct estimates aren't
always very accurate, especially for large tables.  As we've discussed
before, making them accurate requires sampling a significant
percentage of the table, whereas all of our other statistics can be
computed reasonably accurately by sampling a fixed amount of an
arbitrarily large table.  So it's possible that relying more heavily
on n_distinct could turn out worse overall even if the algorithm is
better.  Not sure if that's an issue here, just throwing it out
there...
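
[Editorial illustration, not part of the original message.] The concern
about sample-based n_distinct can be sketched with a small simulation.
It uses the Haas-Stokes "Duj1" estimator, n*d / (n - f1 + f1*n/N), which
to the best of my knowledge is the form ANALYZE applies; treat that as an
assumption. The table layout (10 heavy values plus a long tail of values
that each occur twice), the 30,000-row sample size, and the function
names are all hypothetical choices made for the example.

```python
import random
from collections import Counter


def duj1_estimate(sample, total_rows):
    """Haas-Stokes Duj1 estimate of the number of distinct values.

    n  = sample size, d = distinct values seen in the sample,
    f1 = values seen exactly once in the sample, N = total_rows.
    """
    n = len(sample)
    counts = Counter(sample)
    d = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    return n * d / (n - f1 + f1 * n / total_rows)


def row_value(i, heavy_rows=900_000, heavy_values=10):
    """Hypothetical skewed column: the first 900,000 rows cycle through
    10 values; every later value occurs in exactly two rows."""
    if i < heavy_rows:
        return i % heavy_values
    return heavy_values + (i - heavy_rows) // 2


def demo(total_rows=1_000_000, sample_size=30_000, seed=0):
    """Return (true n_distinct, Duj1 estimate from a fixed-size sample)."""
    rng = random.Random(seed)
    # Sample row positions without replacement, then look up their values.
    sampled_rows = rng.sample(range(total_rows), sample_size)
    sample = [row_value(i) for i in sampled_rows]
    true_distinct = len({row_value(i) for i in range(total_rows)})
    return true_distinct, duj1_estimate(sample, total_rows)


if __name__ == "__main__":
    true_d, est = demo()
    print(f"true n_distinct = {true_d}, estimated = {est:.0f}")
```

In this construction the estimate comes out far below the true count,
because a 3% sample sees almost every tail value at most once and cannot
tell how long the tail really is. A uniformly distributed column would
fare much better, so this shows one failure mode rather than typical
behavior.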

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
