Re: Optimizer(?) off by factor of 3 ... ?
От | Marc G. Fournier |
---|---|
Тема | Re: Optimizer(?) off by factor of 3 ... ? |
Дата | |
Msg-id | 20020213091531.F19107-100000@mail1.hub.org обсуждение исходный текст |
Ответ на | Re: Optimizer(?) off by factor of 3 ... ? (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-hackers |
Okay, I've 'saved' the dataset/schema for this if you want me to test/try anything further with it, as I'm going to try and redo teh schema to get around the issues for now ... On Mon, 11 Feb 2002, Tom Lane wrote: > "Marc G. Fournier" <scrappy@hub.org> writes: > > [ bogus optimizer choices in 7.2 ] > > Well, I guess the good news is we seem to be past the old bugaboo of bad > statistics: the estimated row counts are all in the right ballpark. Now > we get to have fun with the cost models :-). > > It looks to me like there are a couple of problems here. One is that > the default value of effective_cache_size is way too small --- it's set > at 1000, which is probably silly when you have NBuffers set to 32768. > (In hindsight maybe we should have expressed it as a multiple of > NBuffers rather than an absolute size.) You could tweak that with a > postgresql.conf change, but I'm not sure that that alone will help much. > > The more difficult issue is that nestloops with inner indexscan are > being seriously misestimated. We're computing the cost as though each > iteration of the inner scan were completely independent and being done > from a standing start --- which is wrong, because in practice scans > after the first will tend to find buffer cache hits for pages already > read in by prior scans. You can bet, for example, that the btree > metapage and root page aren't going to need to be re-read on each > iteration. > > I am thinking that the right way to do this is to cost the entire inner > indexscan (all iterations put together) as if it were a single > indexscan, at least for the purposes of applying the Mackert & Lohman > formula embedded in cost_index. That would give us a more realistic > result for the total cost of the main-table accesses driven by the > index. Not sure how to adjust the cost estimate for reading the index, > but clearly we need to make some adjustment for repeated hits on the > upper index pages. > > This is probably a bigger change than we can hope to make in 7.2.* ... > > BTW, what do you get if you EXPLAIN ANALYZE that orient/clubs join > with seqscan enabled and hashjoin disabled? If it's a mergejoin, > how about if you also disable mergejoin? It seems to me that a seqscan > on clubs would be a much better way to do the nestloop join than an > indexscan --- but it's being forced into an indexscan because you > disabled seqscan. > > regards, tom lane >
В списке pgsql-hackers по дате отправления: