Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

From	Lawrence, Ramon
Subject	Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
Date	19 February 2009, 12:37:14
Msg-id	6EEA43D22289484890D119821101B1DF28B35F@exchange20.mercury.ad.ubc.ca
In reply to	Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets ("Bryce Cutt" <pandasuit@gmail.com>)
Responses	Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
List	pgsql-hackers

________________________________

From: pgsql-hackers-owner@postgresql.org on behalf of Robert Haas
I think what we need here is some very simple testing to demonstrate
that this patch demonstrates a speed-up even when the inner side of
the join is a joinrel rather than a baserel.  Can you suggest a single
query against the skewed TPCH dataset that will result in two or more
multi-batch hash joins?  If so, it should be a simple matter to run
that query with and without the patch and verify that the former is
faster than the latter.

This query will have the outer relation be a joinrel rather than a baserel:

select count(*) from supplier, part, lineitem where l_partkey = p_partkey and s_suppkey = l_suppkey;

The approach collects statistics on the outer relation (not the inner relation), so the code had to be able to
determine a stats tuple on a joinrel in addition to a baserel.
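
To illustrate why statistics on the outer relation matter here, the following is a minimal Python sketch of a skew-aware multi-batch hash join. It is not the patch's code: the function name `skew_aware_hash_join`, its parameters, and the use of a frequency count as a stand-in for planner statistics are all assumptions made for illustration. The idea it shows is that inner tuples matching the most common outer keys stay in memory, so the bulk of the outer tuples can be joined without being spilled to a later batch.

```python
from collections import Counter, defaultdict


def skew_aware_hash_join(outer, inner, num_batches=4, num_skew_keys=1):
    """Join two lists of (key, payload) tuples on key.

    Returns (results, outer_spilled). outer_spilled counts the outer tuples
    that would have been written out and re-read for a later batch.
    All names here are hypothetical, chosen only for this sketch.
    """
    # Stand-in for planner statistics: the most common join keys on the
    # OUTER side. A real system would read precomputed statistics rather
    # than scan the data.
    counts = Counter(key for key, _ in outer)
    mcv_keys = {key for key, _ in counts.most_common(num_skew_keys)}

    skew_table = defaultdict(list)  # kept in memory for the whole join
    batches = [defaultdict(list) for _ in range(num_batches)]

    # Build phase: inner tuples that match a common outer key go to the
    # skew table; everything else is partitioned into batches by hash.
    for key, payload in inner:
        if key in mcv_keys:
            skew_table[key].append(payload)
        else:
            batches[hash(key) % num_batches][key].append(payload)

    results = []
    outer_spilled = 0

    # Probe phase: only batch 0 is resident during the first pass, so an
    # outer tuple hashing to any other batch would be deferred (spilled).
    for key, payload in outer:
        if key in mcv_keys:
            matches = skew_table.get(key, [])
        else:
            batch_no = hash(key) % num_batches
            if batch_no != 0:
                outer_spilled += 1
            matches = batches[batch_no].get(key, [])
        results.extend((key, payload, match) for match in matches)

    return results, outer_spilled
```

With 90 outer tuples sharing key 1 plus one outer tuple for each key from 2 to 11, and one inner tuple per key, the sketch spills 8 outer tuples when one skew key is tracked and 98 when none is, while producing the same join result in both cases.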

Joshua sent us some preliminary data with this query and others and indicated that we could post it.  He wanted time to
clean it up and re-run some experiments, but the data is generally good and the algorithm performs as expected.  I have
attached this data to the post.  Note that the last set of data (although labelled as Z7) is actually an almost zero
skew database and represents the worst case for the algorithm (for most queries the optimization is not even used).

--
Ramon Lawrence

Attachments

JoshuaTolleyData.xls
