Re: query very slow when enable_seqscan=on
From | Tomasz Ostrowski |
---|---|
Subject | Re: query very slow when enable_seqscan=on |
Date | |
Msg-id | 20060704083733.GA20102@batory.org.pl |
In reply to | Re: query very slow when enable_seqscan=on (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: query very slow when enable_seqscan=on |
List | pgsql-bugs |
On Mon, 03 Jul 2006, Tom Lane wrote:

> >   ->  Seq Scan on organization  (cost=0.00..480.95 rows=1 width=4) (actual time=0.071..69.702 rows=1892 loops=1)
> >         Filter: ((organization_location)::text ~* 'warszawa'::text)
>
> How many rows are there in the organization table?

About 9000. And about 6500 persons. "Warszawa" is the biggest city in
Poland and its capital - many organizations are located there.

> This is probably the fault of the pattern-selectivity heuristic:
> it's far too optimistic about long match strings eliminating a lot
> of rows.  I think there's been some discussion of modifying that
> logic but no one's really stepped up with a better idea.

I think that's because there is no good solution to this - no
statistical information is going to predict how much data will match a
regular expression.

Maybe in this situation the algorithm should be pessimistic - assume
that it will return all rows, or all non-null rows, or all rows no
shorter than the matched string (if it's a plain string and not, for
example, a regex like [abcdefghijklmnopqrstuvwxyz], which is long but
will match basically everything). In my opinion it is better to
overestimate most of the time than to risk underestimating by a factor
of 1000 or more.

For now I'm turning off seqscans. This is the second time I've gotten
terrible performance with seqscans turned on because of bad
estimation. And my database will probably fit in cache.

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
                                                      Winnie the Pooh
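P.S. The "pessimistic" estimate proposed above could be sketched roughly like this. This is not PostgreSQL's actual selectivity code - the function name, the metacharacter check, and the (length, fraction) histogram shape are all invented for illustration:

```python
import re

def pessimistic_row_estimate(pattern, total_rows, null_frac, length_hist):
    """Pessimistic sketch: assume the pattern matches every non-NULL row,
    except that a plain literal can only match values at least as long as
    itself.  length_hist is a list of (value_length, fraction_of_rows)."""
    non_null = total_rows * (1.0 - null_frac)
    # Anything containing regex metacharacters (e.g. a long character class
    # like [abcdefghijklmnopqrstuvwxyz]) gives no usable length bound, so
    # fall back to "all non-NULL rows".
    if re.search(r"[.^$*+?{}\[\]\\|()]", pattern):
        return non_null
    # Plain literal: only values with at least len(pattern) characters match.
    return non_null * sum(frac for length, frac in length_hist
                          if length >= len(pattern))
```

For 'warszawa' against 9000 rows where half the values are 10 characters long, this gives 4500 - still an overestimate of the real 1892, but on the safe side compared to the planner's estimate of 1.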