Re: proposal : cross-column stats
From | Yeb Havinga |
---|---|
Subject | Re: proposal : cross-column stats |
Date | |
Msg-id | 4D05EDCA.9070402@gmail.com |
In reply to | Re: proposal : cross-column stats (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: proposal : cross-column stats |
List | pgsql-hackers |
On 2010-12-13 03:28, Robert Haas wrote:
> Well, I'm not real familiar with contingency tables, but it seems like
> you could end up needing to store a huge amount of data to get any
> benefit out of it, in some cases. For example, in the United States,
> there are over 40,000 postal codes, and some even larger number of
> city names, and doesn't the number of entries go as O(m*n)? Now maybe
> this is useful enough anyway that we should Just Do It, but it'd be a
> lot cooler if we could find a way to give the planner a meaningful
> clue out of some more compact representation.

A sparse matrix that holds only 'implicative' (P(A|B) <> P(A*B)?) combinations? Also, some information might be deduced from others. For Heikki's city/region example, for each city it would be known that it is 100% in one region. In that case it suffices to store only that information, since 0% in all other regions can be deduced. I wouldn't be surprised if storing implicatures like this would reduce the size to O(n).

regards,
Yeb Havinga
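
To make the size argument above concrete, here is a minimal Python sketch of the sparse idea, assuming a toy list of (city, region) rows. The function names (`dense_cells`, `sparse_conditional`, `p_region_given_city`) and the sample data are hypothetical, chosen only for illustration; this is not how PostgreSQL stores or would store statistics. It keeps only the nonzero P(region | city) entries and treats every missing entry as 0%.

```python
from collections import Counter, defaultdict


def dense_cells(rows):
    """Number of cells in a full city x region contingency table: O(m*n)."""
    cities = {city for city, _ in rows}
    regions = {region for _, region in rows}
    return len(cities) * len(regions)


def sparse_conditional(rows):
    """Build {city: {region: P(region | city)}}, storing only nonzero entries."""
    pair_counts = Counter(rows)
    city_counts = Counter(city for city, _ in rows)
    table = defaultdict(dict)
    for (city, region), n in pair_counts.items():
        table[city][region] = n / city_counts[city]
    return dict(table)


def p_region_given_city(table, city, region):
    """Look up P(region | city); an absent entry is deduced to be 0."""
    return table.get(city, {}).get(region, 0.0)


# Toy data (an assumption for this sketch): every city lies in exactly one region.
rows = [
    ("Amsterdam", "Noord-Holland"),
    ("Amsterdam", "Noord-Holland"),
    ("Haarlem", "Noord-Holland"),
    ("Utrecht", "Utrecht"),
    ("Groningen", "Groningen"),
]

table = sparse_conditional(rows)
stored_entries = sum(len(regions) for regions in table.values())

print("dense cells:", dense_cells(rows))
print("sparse entries:", stored_entries)
print(p_region_given_city(table, "Amsterdam", "Noord-Holland"))
print(p_region_given_city(table, "Amsterdam", "Utrecht"))
```

When each city maps to a single region, the sparse table holds one entry per city, so it grows with the number of cities rather than with cities times regions. If the dependency is only approximate, a city may carry a few extra entries, and the saving shrinks accordingly.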