Re: new correlation metric
От | Tom Lane |
---|---|
Тема | Re: new correlation metric |
Дата | |
Msg-id | 11984.1225039446@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: new correlation metric (Martijn van Oosterhout <kleptog@svana.org>) |
Ответы |
Re: new correlation metric
|
Список | pgsql-hackers |
Martijn van Oosterhout <kleptog@svana.org> writes: > I think the code is in the right direction, but I think want you want > is some kind of estimate of "given I've looked for tuple X, how many > tuples in the next k pages are near this one". Unfortunatly I don't see > a way of calculating it other than a full simulation. I wonder whether we ought to rethink the problem entirely. In particular, the notion of associating correlation stats with particular columns doesn't seem amazingly useful when you get right down to it. We're wasting our time calculating the correlation of a column that has no index; and if the column is part of a multicolumn index it's not that easy to figure out what the index correlation is from the per-column numbers. Not to mention functional and partial indexes. So it occurs to me that maybe we should forget about per-column correlations altogether, and instead try directly to calculate *per index* correlations. You could do this now by doing an index-only scan and looking at the series of tuple block numbers that come back. However, there's no obvious way to do any sampling in that approach --- you can't read random subsets of the index without cooperation from the index AM. So the direction I'm headed in is to imagine that we should add an "analyze" entry point to the index AM API, with the definition that it will be called once per index per ANALYZE, and is in charge of putting useful stats into someplace or other. We might need to invent some other catalog besides pg_statistic if we want to represent per-index properties like correlation. A minimal solution would be to add a correlation column to pg_class or pg_index, but that doesn't scale well if you imagine that different index AMs might want different sorts of stats. regards, tom lane
В списке pgsql-hackers по дате отправления: