Re: Large table search question

Поиск

Список

Период

Сортировка

От	Greg Stark
Тема	Re: Large table search question
Дата	1 июня 2004 г. 13:26:01
Msg-id	874qpv8czt.fsf@stark.xeocode.com обсуждение исходный текст
Ответ на	Re: Large table search question (Richard Huxton <dev@archonet.com>)
Ответы	Re: Large table search question
Список	pgsql-general

Дерево обсуждения

Richard Huxton <dev@archonet.com> writes:

> If you execute a lot of queries for last_name="Smith" AND first_name="John"
> then it might well help, there are a lot of "Smith"s to choose from.

I think this is the crux of the argument. Even for a common last name like
Smith, you're going to be talking about what, a few thousand records? Probably
selective enough that the extra column isn't really necessary and you pay the
cost for that extra column on every single update and even on other lookups.

On the other hand this logic applies best to DSS style databases where you're
looking for the fastest average throughput. For OLTP databases it may not
hold: if 'Smith' breaks your performance guarantee then the application could
break. For systems like that it may be worth paying a 1% penalty everywhere
that's within your time budget to avoid paying a 100% penalty on the rare
query that would cause failures, even if on average that means a performance
loss.

In practice I find two column indexes are not uncommon, especially on
many-to-many relationship tables. Three column indexes are rare but they do
happen. Offhand I don't ever recall defining any indexes with four or more
columns.

> There's really no alternative to testing. The statistics tables are very useful
> here. Unless you have good reason not to, always turn the statistics gathering
> on, and take snapshot regularly to keep an eye on where PG is exerting the most
> effort.

IMHO many people rely too heavily on empirical data rather than having a good
sense of what they want to be happening. Statistics can obscure situations
like what I described above.

I do have statistics on though, and have the same thinking about always
leaving it on, but I'm unclear how to make use of these data. What tools
should I be looking at to use them?

--
greg

В списке pgsql-general по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Large table search question