Re: [HACKERS] Make ANALYZE more selective about what is a "most common value"?
From | Alvaro Herrera |
---|---|
Subject | Re: [HACKERS] Make ANALYZE more selective about what is a "most common value"? |
Date | |
Msg-id | 20170611034235.v6xfsdrfzgilrsnh@alvherre.pgsql |
In response to | Re: [HACKERS] Make ANALYZE more selective about what is a "most common value"? (Gavin Flower <GavinFlower@archidevsys.co.nz>) |
List | pgsql-hackers |
Gavin Flower wrote:

> The standard deviation (sd) is proportional to the square root of
> the number in the sample in a Normal Distribution.
>
> In a Normal Distribution, about 2/3 the values will be within plus
> or minus one sd of the mean.
>
> There seems to be an implicit assumption that the distribution of
> values follows the Normal Distribution - has this been verified?

The whole problem here is precisely to determine what is the data
distribution -- one side of it is how to represent it for the planner
(which we do by storing a number of distinct values, a list of MCVs and
their respective frequencies, and a histogram representing values not in
the MCV list); the other side is how to figure out what data to put in
the MCV list and histogram (i.e. what to compute during ANALYZE).

If we knew the distribution was a normal, we wouldn't need any of these
things -- we'd just store the mean and standard deviation.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
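
To make the three-part representation described above concrete, here is a minimal Python sketch that summarizes a sample as a distinct-value count, an MCV list with frequencies, and a histogram over the values not in the MCV list. This is only an illustration: the function name `summarize_sample`, its parameters, and the "keep values seen more than once" cutoff are assumptions made for this example, not the rule ANALYZE actually uses (choosing that rule is what this thread is about).

```python
from collections import Counter


def summarize_sample(sample, max_mcvs=3, num_buckets=4):
    """Toy summary of a sample: (n_distinct, MCV list, histogram bounds).

    Hypothetical helper for illustration; not PostgreSQL's algorithm.
    """
    counts = Counter(sample)
    total = len(sample)
    n_distinct = len(counts)

    # Rank values by descending count; break ties by value for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    # Assumed cutoff: of the top max_mcvs values, keep those seen more than once.
    mcvs = [(value, count / total) for value, count in ranked[:max_mcvs] if count > 1]
    mcv_values = {value for value, _ in mcvs}

    # Equi-depth histogram bounds over the values NOT in the MCV list.
    rest = sorted(v for v in sample if v not in mcv_values)
    bounds = []
    if len(rest) >= 2:
        last = len(rest) - 1
        bounds = [rest[(i * last) // num_buckets] for i in range(num_buckets + 1)]

    return n_distinct, mcvs, bounds


sample = [1, 1, 1, 1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10]
n_distinct, mcvs, bounds = summarize_sample(sample)
print(n_distinct, mcvs, bounds)
```

Nothing in this summary assumes a particular distribution shape, which is the point of the reply: if the data were known to be normal, the mean and standard deviation alone would replace all three structures.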