Re: Hash Anti Join performance degradation
От | Kevin Grittner |
---|---|
Тема | Re: Hash Anti Join performance degradation |
Дата | |
Msg-id | 4DDE21FE020000250003DD49@gw.wicourts.gov обсуждение исходный текст |
Ответ на | Re: Hash Anti Join performance degradation (Cédric Villemain <cedric.villemain.debian@gmail.com>) |
Ответы |
Re: Hash Anti Join performance degradation
|
Список | pgsql-performance |
Cédric Villemain<cedric.villemain.debian@gmail.com> wrote: > 2011/5/26 panam <panam@gmx.net>: >> "max_connections";"100" >> "work_mem";"1GB" Each connection can allocate work_mem, potentially several times. On a machines without hundreds of GB of RAM, that pair of settings could cause severe swapping. >> "Patholgical" query: >> >> select >> b.id, >> (SELECT >> m1.id >> FROM >> message m1 >> LEFT JOIN >> message m2 >> ON ( >> m1.box_id = m2.box_id >> AND m1.id < m2.id >> ) >> WHERE >> m2.id IS NULL >> AND m1.box_id = b.id) >> from >> box b > without explaining further why the antijoin has bad performance > without cluster, I wonder why you don't use this query : > > SELECT b.id, > max(m.id) > FROM box b, message m > WHERE m.box_id = b.id > GROUP BY b.id; > > looks similar and fastest. I think you would need a left join to actually get identical results: SELECT b.id, max(m.id) FROM box b LEFT JOIN message m ON m.box_id = b.id GROUP BY b.id; But yeah, I would expect this approach to be much faster. Rather easier to understand and harder to get wrong, too. -Kevin
В списке pgsql-performance по дате отправления: