Re: Slow query with a lot of data
From: Matthew Wakeling
Subject: Re: Slow query with a lot of data
Date:
Msg-id: alpine.DEB.1.10.0808181657350.4454@aragorn.flymine.org
In response to: Re: Slow query with a lot of data (Moritz Onken <onken@houseofdesign.de>)
Responses: Re: Slow query with a lot of data
List: pgsql-performance
On Mon, 18 Aug 2008, Moritz Onken wrote:
> Running the query for more than one user is indeed not much slower. That's
> what I need. I'm clustering the results table on domain right now. But why
> is this better than clustering it on "user"?

The reason is the way that the merge join algorithm works. What it does is
take two tables and sort them both by the join fields. Then it can stream
through both tables, producing results as it goes. It's the best join
algorithm, but it does require both tables to be sorted by the same thing,
which is domain in this case. The aggregation on user happens after the join
has been done, and the hash aggregate can accept the users in random order.

If you look at your last EXPLAIN, you can see that it has to sort the result
table on domain, although it can read domain_categories in domain order due
to the clustered index:

  HashAggregate
    ->  Merge Join
          Merge Cond: (b.domain = a.domain)
          ->  Index Scan using domain_categories_domain on domain_categories b
          ->  Sort
                Sort Key: a.domain
                Sort Method: quicksort  Memory: 27kB
                ->  Index Scan using result_user_idx on result a
                      Index Cond: ("user" = 1337)

Without the user restriction and after re-clustering, this should become:

  HashAggregate
    ->  Merge Join
          Merge Cond: (b.domain = a.domain)
          ->  Index Scan using domain_categories_domain on domain_categories b
          ->  Index Scan using result_domain on result a

Matthew

-- 
Vacuums are nothings. We only mention them to let them know we know
they're there.
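To make the streaming behaviour concrete, here is a minimal Python sketch of a merge join (not PostgreSQL's actual implementation): both inputs must already be sorted on the join key, and the join then advances through each input exactly once, which is why the planner wants both sides in domain order. The table rows and column names below are made-up toy data echoing the thread's schema.

```python
def merge_join(left, right, key_left, key_right):
    """Join two lists of dicts, each pre-sorted on its join key field.

    Streams through both inputs once; handles duplicate keys by
    collecting each run of equal keys on the right side.
    """
    results = []
    i, j = 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key_left], right[j][key_right]
        if lk < rk:
            i += 1            # left side is behind: advance it
        elif lk > rk:
            j += 1            # right side is behind: advance it
        else:
            # Find the run of equal keys on the right, then pair it with
            # every matching left row -- no re-sorting, pure streaming.
            j_end = j
            while j_end < len(right) and right[j_end][key_right] == rk:
                j_end += 1
            while i < len(left) and left[i][key_left] == lk:
                for r in right[j:j_end]:
                    results.append({**left[i], **r})
                i += 1
            j = j_end
    return results

# Toy data (hypothetical rows): "result" and "domain_categories",
# both sorted on domain, mirroring the clustered-index ordering.
result_rows = [{"domain": 1, "user": 7},
               {"domain": 2, "user": 7},
               {"domain": 2, "user": 8}]
category_rows = [{"domain": 1, "category": 10},
                 {"domain": 2, "category": 20},
                 {"domain": 2, "category": 21}]
joined = merge_join(result_rows, category_rows, "domain", "domain")
```

If either input were sorted on a different column (say, user), the single-pass invariant would break, which is the reason clustering on domain beats clustering on "user" for this query.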