Re: Force a merge join?
From: Tom Lane
Subject: Re: Force a merge join?
Date:
Msg-id: 25096.1021766907@sss.pgh.pa.us
In reply to: Re: Force a merge join? (Doug Fields <dfields-pg-general@pexicom.com>)
List: pgsql-general
Doug Fields <dfields-pg-general@pexicom.com> writes:
> So that correlates pretty well with its guesses as to the top 10:
> {38,192,369,330,332,501,229,493,319,424}
> But not so well as to their relative distributions:
> {0.389667,0.123667,0.0156667,0.013,0.00933333,0.00933333,0.009,0.00866667,0.00833333,0.00833333}

Very curious. I'd have expected it to recognize 192 as the most common value, given that actual distribution. Is the analyze result repeatable?

> Or, if I could tell it to do a more detailed sampling during ANALYZE.
> I could also tell it to keep more than the top 10 in the statistics
> table (SET STATISTICS), but I'm not sure what it would buy me, other
> than forcing a larger sample

The sample size scales linearly with the SET STATISTICS target (more specifically, with the largest target among the columns being analyzed). I was just about to suggest that you try setting a larger target and see if the stats get better.

> How much would I slow the ANALYZE statement, and more importantly, the
> query optimizer, if I told it to keep statistics on the top 200 instead
> of the default 10 values?

200 seems like overkill... 20 or 30 might well be enough.

FWIW, I went back and loaded the test case into 7.2.1, and I still get the same estimates I showed from current sources. So there's something very fishy about your results. Don't know where to look for the cause of the discrepancy at the moment.

			regards, tom lane

PS: sorry about the blank post before...
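(For anyone following along, a minimal sketch of the experiment suggested above. The table and column names lists/list_id are hypothetical stand-ins; substitute the actual schema from this thread:)

    -- Raise the per-column statistics target from the default 10 to 30;
    -- names here are placeholders.
    ALTER TABLE lists ALTER COLUMN list_id SET STATISTICS 30;

    -- Re-sample; the sample size scales with the largest target among
    -- the columns being analyzed, so this reads more rows than before.
    ANALYZE lists;

    -- Inspect the collected most-common-value stats to see whether the
    -- estimates improved, and whether they come out the same across
    -- repeated ANALYZE runs.
    SELECT most_common_vals, most_common_freqs
      FROM pg_stats
     WHERE tablename = 'lists' AND attname = 'list_id';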