Re: appendum: Re: *really* simple select doesn't use indices ...
От | The Hermit Hacker |
---|---|
Тема | Re: appendum: Re: *really* simple select doesn't use indices ... |
Дата | |
Msg-id | Pine.BSF.4.33.0105291110360.82504-100000@mobile.hub.org обсуждение исходный текст |
Ответ на | Re: appendum: Re: *really* simple select doesn't use indices ... (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-hackers |
On Tue, 29 May 2001, Tom Lane wrote: > Gavin Sherry <swm@linuxworld.com.au> writes: > > The column 'zip' is of type text. As such, indices will not be used except > > in the case when the where clause is WHERE zip ~ '^<text>' for btree > > indices. > > Uh ... nonsense. Oh good, I was worried there for a sec ... :) > > On Tue, 29 May 2001, Marc G. Fournier wrote: > >> globalmatch=# vacuum verbose analyze locations; > >> NOTICE: --Relation locations-- > >> NOTICE: Pages 1395: Changed 0, reaped 0, Empty 0, New 0; Tup 123571: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen76, MaxLen 124; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.11s/0.00u sec. > > >> globalmatch=# explain SELECT location from locations WHERE zip = '80012'; > >> NOTICE: QUERY PLAN: > >> > >> Seq Scan on locations (cost=0.00..2939.64 rows=4217 width=16) > > Okay, so it thinks that "zip = '80012'" will match 4217 out of 123571 > rows, which is more than enough to drive it to a sequential scan > (with an average of more than three matched rows on every page of the > relation, there'd be no I/O savings at all from consulting the index). > > Since the real number of matches is only 1, this estimate is obviously > way off. In 7.1 the estimate is being driven by the frequency of the > most common value in the column --- what is the most common value? If > you're lucky, the most common value is a dummy (empty string, maybe) > that you could replace by NULL with a few simple changes in > application logic. 7.1 is smart enough to distinguish NULL from real > data values in its estimates. If you're not lucky, there really are a > few values that are far more common than average, in which case you're > stuck unless you want to run development sources. Current sources > should do a lot better on that kind of data distribution. Hit it right on the mark: zip | cnt -------+------- | 8140300210 | 100211 | 1 Will look at the code and see what I can do abuot that NULL issue ... thanks :)
В списке pgsql-hackers по дате отправления: