Re: Weird indices
From: Martijn van Oosterhout
Subject: Re: Weird indices
Date:
Msg-id: 20010221145815.A31213@chiru.svana.org
In reply to: Re: Weird indices (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
Responses: Re: Weird indices, Re: Weird indices
List: pgsql-general
On Tue, Feb 20, 2001 at 05:02:22PM -0800, Stephan Szabo wrote:
> IIRC, there's something which is effectively:
>   estimated rows = <most common value's frequency> * <fraction>
> I think fraction defaults to (is always?) 1/10 for the standard
> index type. That's where the 50 comes from. And the frequency is
> probably from the last vacuum analyze.

Is there a way to change this fraction? We have a table with over 1
million rows, and the statistics Postgres gathers for it are not
particularly useful. No single (non-null) value occurs significantly
more often than the others, but the distribution looks a lot like a
1/x curve, I guess. The most common value occurs 5249 times, yet the
average is only 95, so if I understand that formula right the planner
estimates around 5249/10, roughly 525 matching rows for any value,
and Postgres chooses a sequential scan almost every time. We now
actually set enable_seqscan=off in many areas of our code to get
queries down to a useful speed (a rough sketch of how we scope that
per query is at the end of this mail). This table also happens to
have an accidental clustering on that same column.

What is the reasoning behind estimating like that? Why not simply the
average, or the average plus one standard deviation?

Another idea: is there any use in a "cohesiveness" measure for an
index? That is, when you look up X through the index, how many other
matching tuples will, on average, be within the next 8k (or whatever
size)? Since those are likely to be in the cache already, the cost of
retrieving them would be much lower. An index on a clustered column
would then get a much lower estimated cost than an index on other
columns, which would make clustering more useful. (A rough way to
measure this is sketched just below.)
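Here is a rough, untested sketch of how that cohesiveness could be
measured by hand. The table and column names are invented purely for
illustration, and the ctid-to-point trick for extracting the heap
page number may need a newer PostgreSQL than the one we run:

    -- Matching rows per distinct heap page, per value: a higher
    -- ratio means an index scan touches fewer pages, so it should
    -- be cheaper than a purely per-tuple cost model suggests.
    SELECT account_id,
           count(*) AS matching_rows,
           count(DISTINCT (ctid::text::point)[0]) AS heap_pages,
           count(*)::float
               / count(DISTINCT (ctid::text::point)[0]) AS rows_per_page
    FROM log_entries
    GROUP BY account_id
    ORDER BY rows_per_page DESC;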
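And since I mentioned the enable_seqscan workaround above, this is
roughly what we do, again with invented table and column names. Note
that a plain SET lasts for the rest of the session, so we switch the
setting back afterwards:

    -- Steer the planner away from sequential scans for one query.
    SET enable_seqscan = off;
    SELECT count(*)
    FROM log_entries          -- hypothetical table, over 1 million rows
    WHERE account_id = 42;    -- indexed column with the skewed distribution
    -- Restore the default so later queries are planned normally.
    SET enable_seqscan = on;

I think I'll stop rambling now...

Martijn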