Re: ANALYZE sampling is too good
From | Tom Lane |
---|---|
Subject | Re: ANALYZE sampling is too good |
Date | |
Msg-id | 24713.1386872980@sss.pgh.pa.us |
In reply to | Re: ANALYZE sampling is too good (Jeff Janes <jeff.janes@gmail.com>) |
Responses |
Re: ANALYZE sampling is too good
Re: ANALYZE sampling is too good |
List | pgsql-hackers |
Jeff Janes <jeff.janes@gmail.com> writes:
> It would be relatively easy to fix this if we trusted the number of visible
> rows in each block to be fairly constant. But without that assumption, I
> don't see a way to fix the sample selection process without reading the
> entire table.

Yeah, varying tuple density is the weak spot in every algorithm we've
looked at. The current code is better than what was there before, but as
you say, not perfect. You might be entertained to look at the threads
referenced by the patch that created the current sampling method:
http://www.postgresql.org/message-id/1tkva0h547jhomsasujt2qs7gcgg0gtvrp@email.aon.at

particularly
http://www.postgresql.org/message-id/flat/ri5u70du80gnnt326k2hhuei5nlnimonbs@email.aon.at#ri5u70du80gnnt326k2hhuei5nlnimonbs@email.aon.at

However ... where this thread started was not about trying to reduce
the remaining statistical imperfections in our existing sampling method.
It was about whether we could reduce the number of pages read for an
acceptable cost in increased statistical imperfection.

regards, tom lane
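
To make the tuple-density point concrete, here is a minimal sketch in Python. It is not PostgreSQL's ANALYZE code; it models a deliberately simplified, hypothetical scheme (pick a block uniformly at random, then pick one row uniformly within that block) and computes the exact chance that any given row is chosen. The function name `per_row_probability` and the example block sizes are assumptions made for illustration only.

```python
from fractions import Fraction


def per_row_probability(rows_per_block):
    """For a hypothetical sampler that picks a block uniformly and then a
    row uniformly within it, return, for each block, the probability that
    one particular row of that block is the row selected.

    Assumes every block holds at least one visible row.
    """
    if not rows_per_block or any(n <= 0 for n in rows_per_block):
        raise ValueError("every block must contain at least one row")
    nblocks = len(rows_per_block)
    # P(row) = P(block chosen) * P(row chosen | block) = (1/nblocks) * (1/n)
    return [Fraction(1, nblocks * n) for n in rows_per_block]


# Two dense blocks and one sparse block (illustrative numbers only).
density = [100, 100, 10]
probs = per_row_probability(density)

# An unbiased sampler would give every row the same chance: 1 / total rows.
uniform = Fraction(1, sum(density))
```

With these numbers each row in the sparse block has a 1/30 chance of being picked, against 1/300 for a row in a dense block and 1/210 under truly uniform sampling. If every block held the same number of rows, the per-row probabilities would all equal `uniform`, which is the constant-density assumption the quoted text says cannot be relied on.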