Re: benchmarking the query planner
From | Gregory Stark |
---|---|
Subject | Re: benchmarking the query planner |
Date | |
Msg-id | 87hc59s4qx.fsf@oxford.xeocode.com |
In response to | Re: benchmarking the query planner (Simon Riggs <simon@2ndQuadrant.com>) |
Responses |
Re: benchmarking the query planner
Re: benchmarking the query planner |
List | pgsql-hackers |
Simon Riggs <simon@2ndQuadrant.com> writes:

> The amount of I/O could stay the same, just sample all rows on block.
> Lifting the sample size will help large tables. Will it be perfect? No.
> But I'll take "better" over "not working at all".

That will just raise the table size at which the problems start. It'll still be a constant-sized sample.

It will also introduce strange biases. For instance in a clustered table it'll think there are a lot more duplicates than there really are because it'll see lots of similar values.

Incidentally we *do* do block sampling. We pick random blocks and then pick random records within those blocks. This was new in, uh, 7.4? 8.0? Sometime around then. It dramatically reduced the i/o requirements but there were long discussions of how to do it without introducing biases.

> If we are going to quote literature we should believe all the
> literature. We can't just listen to some people that did a few tests
> with sample size, but then ignore the guy that designed the MS optimizer
> and many others.

I'm not sure what you're talking about regarding "some people that did a few tests".

I looked around for the paper I keep referencing and can't find it on my laptop. I'll look for it online. But it included a section which was a survey of past results from other papers and the best results required stupidly large sample sizes to get anything worthwhile.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning
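
A minimal Python sketch of the clustered-table bias described in the message above. It is not PostgreSQL's ANALYZE code: the table layout, the function names (`make_clustered_table`, `sample_whole_blocks`, `sample_rows_within_blocks`) and all sizes are assumptions made up for illustration. It compares taking every row from a few random blocks with picking random blocks and then one random row from each, at the same sample size.

```python
import random


def make_clustered_table(n_blocks, rows_per_block, dup_run):
    """Build a hypothetical clustered table as a list of blocks.

    Values are stored in sorted order and each value repeats `dup_run`
    times, so the copies of a value sit on neighbouring rows.
    """
    values = [i // dup_run for i in range(n_blocks * rows_per_block)]
    return [
        values[b * rows_per_block:(b + 1) * rows_per_block]
        for b in range(n_blocks)
    ]


def sample_whole_blocks(table, n_rows, rng):
    """Take every row from just enough random blocks to get n_rows rows."""
    rows_per_block = len(table[0])
    sample = []
    for block in rng.sample(table, n_rows // rows_per_block):
        sample.extend(block)
    return sample


def sample_rows_within_blocks(table, n_rows, rng):
    """Pick n_rows random blocks, then one random row inside each block."""
    return [rng.choice(block) for block in rng.sample(table, n_rows)]


rng = random.Random(0)
# Assumed sizes: 1000 blocks of 100 rows, every value stored 10 times.
table = make_clustered_table(n_blocks=1000, rows_per_block=100, dup_run=10)

whole = sample_whole_blocks(table, 300, rng)
spread = sample_rows_within_blocks(table, 300, rng)

print("whole blocks:      ", len(whole), "rows,", len(set(whole)), "distinct")
print("rows within blocks:", len(spread), "rows,", len(set(spread)), "distinct")
```

With this made-up layout the whole-block sample of 300 rows contains only 30 distinct values, each seen 10 times, while the row-within-block sample contains 300 distinct values. Both samples have the same size, but the first looks far more duplicate-heavy because all of its rows come from three blocks of neighbouring values.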