Re: appendum: Re: *really* simple select doesn't use indices ...

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: appendum: Re: *really* simple select doesn't use indices ...
Дата
Msg-id 18366.991144487@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: appendum: Re: *really* simple select doesn't use indices ...  (Gavin Sherry <swm@linuxworld.com.au>)
Ответы Re: appendum: Re: *really* simple select doesn't use indices ...  (The Hermit Hacker <scrappy@hub.org>)
Список pgsql-hackers
Gavin Sherry <swm@linuxworld.com.au> writes:
> The column 'zip' is of type text. As such, indices will not be used except
> in the case when the where clause is WHERE zip ~ '^<text>' for btree
> indices.

Uh ... nonsense.

> On Tue, 29 May 2001, Marc G. Fournier wrote:
>> globalmatch=# vacuum verbose analyze locations;
>> NOTICE:  --Relation locations--
>> NOTICE:  Pages 1395: Changed 0, reaped 0, Empty 0, New 0; Tup 123571: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen
76,MaxLen 124; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.11s/0.00u sec.
 

>> globalmatch=# explain SELECT location from locations WHERE zip = '80012';
>> NOTICE:  QUERY PLAN:
>> 
>> Seq Scan on locations  (cost=0.00..2939.64 rows=4217 width=16)

Okay, so it thinks that "zip = '80012'" will match 4217 out of 123571
rows, which is more than enough to drive it to a sequential scan
(with an average of more than three matched rows on every page of the
relation, there'd be no I/O savings at all from consulting the index).

Since the real number of matches is only 1, this estimate is obviously
way off.  In 7.1 the estimate is being driven by the frequency of the
most common value in the column --- what is the most common value?
If you're lucky, the most common value is a dummy (empty string, maybe)
that you could replace by NULL with a few simple changes in application
logic.  7.1 is smart enough to distinguish NULL from real data values
in its estimates.  If you're not lucky, there really are a few values
that are far more common than average, in which case you're stuck unless
you want to run development sources.  Current sources should do a lot
better on that kind of data distribution.
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Hannu Krosing
Дата:
Сообщение: problems setting shared memory on linux
Следующее
От: The Hermit Hacker
Дата:
Сообщение: Re: appendum: Re: *really* simple select doesn't use indices ...