Re: optimizer bent on full table scan
От | Greg Stark |
---|---|
Тема | Re: optimizer bent on full table scan |
Дата | |
Msg-id | 87smugxlet.fsf@stark.dyndns.tv обсуждение исходный текст |
Ответ на | Re: optimizer bent on full table scan (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-general |
Tom Lane <tgl@sss.pgh.pa.us> writes: > Greg Stark <gsstark@mit.edu> writes: > > I have a strange problem with the optimizer from CVS checked out as of about a > > week ago. It seems to be insisting on using a full table scan for a table > > lookup where the full table scan is about 20x slower than an index lookup. > > > I think somehow it's being confused by the fact that some values of the index > > are populated and others haven't been so the distribution is odd. > > Well, it's doing a heck of a good job of estimating the number of > matching rows --- can't complain about 8757 vs 8721. So there's some > other failure of modeling here. The only idea that comes to mind is > that maybe the rows matching foobar_id = 900 are tightly clustered in > the table, so that the planner's assumption of random fetches is overly > pessimistic. But the small correlation value says that there's not much > overall structure in the table's ordering. Can you shed any light on > that? Hm, that's hard to say. The table was originally populated by a job that loops through a sequential scan of the parent table around and inserts all the records for a given foobar_id in a single insert. So presumably those records would end up together though the foobar_id's might not be in sequential order which might skew the correlation. I guess correlation works well for range scans but isn't really a good substitute for measuring the "clustering" that's relevant for scans on =. But even so I'm surprised it's even close. The selectivity is about 2% and I've even lowered random_page_cost from the default, so it seems like even with very scattered records it would still be worthwhile using an index. slo=> select count(*) from foobartab; count -------- 406849 (1 row) Time: 31650.87 ms slo=> select count(*) from foobartab where foobar_id = 900; count ------- 8721 (1 row) Time: 5213.82 ms -- greg
В списке pgsql-general по дате отправления: