Statistics on key distribution (was: Re: order by and index path)
От | jwieck@debis.com (Jan Wieck) |
---|---|
Тема | Statistics on key distribution (was: Re: order by and index path) |
Дата | |
Msg-id | m0zTowS-000EBRC@orion.SAPserv.Hamburg.dsh.de обсуждение исходный текст |
Ответ на | Re: [HACKERS] Re: order by and index path ("Thomas G. Lockhart" <lockhart@alumni.caltech.edu>) |
Список | pgsql-hackers |
> > > We do not have this statistical information. So the whole > > thing is at this time academic. > > I recall that Commercial Ingres made the assumption that one row (or 1% > of rows? My memory of Ingres is fading :) would be returned from a > qualified query if no statistics were available to suggest otherwise. > > It did collect statistics on data distribution to try to help make those > optimizer choices. > > It may be reasonable to assume that if there is an index, then using it > with any qualified query would be a win. Since the alternative is to > decide to _not_ use an index, a decision for which we have no support > with existing statistics. It may be also reasonable to collect statistic information and use that to quantify the cost of an index scan. The vacuum cleaner scans all indices on a relation vacuum'd completely. And at that time it already knows the number of pages and tuples in the heap relation (has that in the vcrelstats). Based on this it could decide to take every n'th index tuple while scanning and drop them somewhere where other backends can find them. This would be the statistical information needed by the optimizer to estimate the real cost of an index scan. It is only of interest for big tables, where hopping from block to block will make an index scan a looser against a seqscan in a many row matching scan. So it's up to the optimizer do decide based on the # of pages if statistical information is really required for cost calculation. Having the final indexqual along with the statistical information it will be a little tricky to figure out how many rows it might return, but not impossible. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #======================================== jwieck@debis.com (Jan Wieck) #
В списке pgsql-hackers по дате отправления: