Re: small table left outer join big table

Поиск

Список

Период

Сортировка

От	Gurjeet Singh
Тема	Re: small table left outer join big table
Дата	28 декабря 2010 г. 11:10:19
Msg-id	AANLkTi=_==09etYitq0Aiysa2BPScWZcq4WSq=mwm-3t@mail.gmail.com обсуждение исходный текст
Ответ на	small table left outer join big table (Jie Li <jay23jack@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

On Tue, Dec 28, 2010 at 5:13 AM, Jie Li <jay23jack@gmail.com> wrote:

Hi,

Please see the following plan:

postgres=# explain select * from small_table left outer join big_table using (id);
                                 QUERY PLAN
----------------------------------------------------------------------------
Hash Left Join (cost=126408.00..142436.98 rows=371 width=12)
   Hash Cond: (small_table.id = big_table.id)
   -> Seq Scan on small_table (cost=0.00..1.09 rows=9 width=8)
   -> Hash (cost=59142.00..59142.00 rows=4100000 width=8)
         -> Seq Scan on big_table (cost=0.00..59142.00 rows=4100000 width=8)
(5 rows)

Here I have a puzzle, why not choose the small table to build hash table? It can avoid multiple batches thus save significant I/O cost, isn't it?

We can perform this query in two phases:
1) inner join, using the small table to build hash table.
2) check whether each tuple in the hash table has matches before, which can be done with another flag bit

The only compromise is the output order, due to the two separate phases. Not sure whether the SQL standard requires it.

SQL standard does not require the result to be in any particular order unless an ORDER BY is used.

Regards,

--
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurjeet@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: small table left outer join big table