Re: benchmarking the query planner
From | Greg Stark |
---|---|
Subject | Re: benchmarking the query planner |
Date | |
Msg-id | 4136ffa0812121108i7b75490cq93f599adfd6f564a@mail.gmail.com |
In reply to | Re: benchmarking the query planner (Simon Riggs <simon@2ndQuadrant.com>) |
List | pgsql-hackers |
On Fri, Dec 12, 2008 at 6:31 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> Why not keep the random algorithm we have now, but scan the block into a
> separate hash table for ndistinct estimation. That way we keep the
> correct random rows for other purposes.

It seems to me that what you have to do is look at a set of blocks and judge a) how many duplicates are in the typical block and b) how much overlap there is between blocks, then extrapolate to the rest of the table from those two values.

So for example, if you look at 1% of the blocks and find 27 distinct values across them, you extrapolate that there are somewhere between 100*27 distinct values table-wide (if those blocks have no intersections) and 27 distinct values total (if those blocks intersect 100% -- i.e., they all contain the same 27 distinct values).

I haven't had a chance to read that paper; it looks extremely dense. Is this the same idea?

--
greg
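To make the bounding arithmetic concrete, here is a minimal standalone C sketch; this is not code from the PostgreSQL tree, and the sampled block contents and constants are made up for illustration. A real estimator would interpolate between the two bounds using the measured inter-block overlap, which this sketch deliberately does not attempt.

```c
/*
 * Hypothetical sketch of the ndistinct bounding argument above.
 * We "sample" a fixed fraction of a table's blocks, count the distinct
 * values seen across the sample, and report the two extreme
 * extrapolations: full overlap (lower bound) and no overlap (upper).
 */
#include <stdio.h>
#include <stdbool.h>

#define NBLOCKS_SAMPLED  2
#define VALUES_PER_BLOCK 4
#define SAMPLE_FRACTION  0.01   /* we scanned 1% of the table's blocks */

/* Count distinct values in an array with a simple O(n^2) scan. */
static int
count_distinct(const int *vals, int n)
{
    int distinct = 0;

    for (int i = 0; i < n; i++)
    {
        bool seen = false;

        for (int j = 0; j < i; j++)
            if (vals[j] == vals[i])
                seen = true;
        if (!seen)
            distinct++;
    }
    return distinct;
}

int
main(void)
{
    /* Made-up sampled blocks; note the inter-block overlap on value 7. */
    int blocks[NBLOCKS_SAMPLED][VALUES_PER_BLOCK] = {
        {7, 3, 3, 9},
        {7, 5, 5, 1},
    };
    int all[NBLOCKS_SAMPLED * VALUES_PER_BLOCK];
    int sample_distinct;

    /* Flatten the sample to count its aggregate distinct values. */
    for (int b = 0; b < NBLOCKS_SAMPLED; b++)
        for (int v = 0; v < VALUES_PER_BLOCK; v++)
            all[b * VALUES_PER_BLOCK + v] = blocks[b][v];

    sample_distinct = count_distinct(all, NBLOCKS_SAMPLED * VALUES_PER_BLOCK);

    /*
     * Lower bound: the unseen 99% of blocks repeat exactly the values we
     * sampled (100% intersection), so the table holds only what we saw.
     */
    int lower = sample_distinct;

    /*
     * Upper bound: the unseen blocks share nothing with the sample or
     * each other, so scale the sample's count by 1/SAMPLE_FRACTION.
     */
    double upper = sample_distinct / SAMPLE_FRACTION;

    printf("sample distinct = %d, table-wide bounds: [%d, %.0f]\n",
           sample_distinct, lower, upper);
    return 0;
}
```

With the toy data above this prints bounds of [5, 500]: the sample contains five distinct values, and 1% sampling gives a 100x scale factor at the no-overlap extreme, matching the 27 vs. 100*27 example in the text.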