Re: default_statistics_target WAS: max_wal_senders must die
От | Nathan Boley |
---|---|
Тема | Re: default_statistics_target WAS: max_wal_senders must die |
Дата | |
Msg-id | AANLkTikVAH5dM-CD0YJacCXuEg7Tz9eCUs0V0-eN9DvO@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: default_statistics_target WAS: max_wal_senders must die (Greg Stark <gsstark@mit.edu>) |
Ответы |
Re: default_statistics_target WAS: max_wal_senders must die
|
Список | pgsql-hackers |
> Robert explained why having more MCVs might be useful because we use > the frequency of the least common MCV as an upper bound on the > frequency of any value in the MCV. Where is that being used? The only non-MCV frequency estimate that I recall seeing is ( nrows - n_ndistinct_rows )/ndistinct. Obviously changing the number of mcv's affects this by lowering n_ndistinct_rows, but it's always pretty coarse estimate. > Binding the length of the MCV list to the size of the histogram is > arbitrary but so would any other value Wouldn't the best approach be to stop adding MCV's/histogram buckets when adding new ones doesn't decrease your prediction error 'substantially'? One very hacky threshold heuristic is to stop adding MCV's when a simple equality select ( SELECT col FROM table WHERE col == VALUE ) would switch the plan from an index to a sequential scan ( or vice versa, although with the current code this would never happen ). ie, if the non_mcv frequency estimate is 0.1% ( producing an index scan ), but adding the MCV gives us an estimate of 5% ( pbly producing a seq scan ) then add that value as an mcv. More sophisticated variations might also consider plan changes to very suboptimal joins; even more sophisticated would be to stop when the MAX( curr - optimal plan / optimal plan ) was below some threshold, say 20%, over a bunch of recently executed queries. A similar approach would work for histogram bins, except the queries of interest are inequality rather than equality selections. -Nathan
В списке pgsql-hackers по дате отправления: