Re: proposal : cross-column stats
From | Robert Haas |
---|---|
Subject | Re: proposal : cross-column stats |
Date | |
Msg-id | AANLkTin5h1hLoO_yjxjFiievipYMfL_aVFSRAkYV77_Q@mail.gmail.com |
In reply to | Re: proposal : cross-column stats (Tomas Vondra <tv@fuzzy.cz>) |
Responses | Re: proposal : cross-column stats |
List | pgsql-hackers |
On Fri, Dec 17, 2010 at 12:58 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> In the end, all they need to compute an estimate is number of distinct
> values for each of the columns (we already have that in pg_stats) and a
> number of distinct values for the group of columns in a query. They
> really don't need any multidimensional histogram or something like that.

I haven't read the paper yet (sorry) but just off the top of my head, one possible problem here is that our n_distinct estimates aren't always very accurate, especially for large tables. As we've discussed before, making them accurate requires sampling a significant percentage of the table, whereas all of our other statistics can be computed reasonably accurately by sampling a fixed amount of an arbitrarily large table. So it's possible that relying more heavily on n_distinct could turn out worse overall even if the algorithm is better. Not sure if that's an issue here, just throwing it out there...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
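
To make the n_distinct concern above concrete, here is a minimal sketch for experimenting with how a fixed-size sample behaves as the table grows. It uses the Haas-Stokes style estimator n*d / (n - f1 + f1*n/N); treating that as representative of what ANALYZE computes is an assumption here, and the function name `estimate_n_distinct`, the skewed synthetic table, and the 30,000-row sample size are all made up for illustration.

```python
import random
from collections import Counter


def estimate_n_distinct(sample, total_rows):
    """Estimate the number of distinct values in a table from a row sample.

    Uses n*d / (n - f1 + f1*n/N), where n is the sample size, d the number
    of distinct values in the sample, f1 the number of values seen exactly
    once in the sample, and N the total row count. (Assumed estimator, used
    only for illustration.)
    """
    n = len(sample)
    counts = Counter(sample)
    d = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)
    return n * d / (n - f1 + f1 * n / total_rows)


if __name__ == "__main__":
    rng = random.Random(42)
    sample_size = 30_000  # fixed sample size, independent of table size

    for total_rows in (100_000, 1_000_000):
        # Hypothetical skewed column: small values are far more common.
        table = [int(total_rows * rng.random() ** 3) for _ in range(total_rows)]
        sample = rng.sample(table, sample_size)

        true_distinct = len(set(table))
        estimated = estimate_n_distinct(sample, total_rows)
        print(f"rows={total_rows} true={true_distinct} estimated={estimated:.0f}")
```

Running it with different distributions and table sizes shows how the estimate compares with the true count when the sample stays fixed; how large the error gets depends heavily on the data, so this is only a way to poke at the question, not evidence either way.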