Re: A Better Way? (Multi-Left Join Lookup)
От | David Johnston |
---|---|
Тема | Re: A Better Way? (Multi-Left Join Lookup) |
Дата | |
Msg-id | 00f201cd66bf$d3273080$79759180$@yahoo.com обсуждение исходный текст |
Ответ на | Re: A Better Way? (Multi-Left Join Lookup) (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: A Better Way? (Multi-Left Join Lookup)
|
Список | pgsql-general |
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Friday, July 20, 2012 4:47 PM > To: David Johnston > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] A Better Way? (Multi-Left Join Lookup) > > "David Johnston" <polobo@yahoo.com> writes: > > WITH > > full_set AS ( ) -- 8,500 records > > , sub_1 AS () -- also about 8,500 > > , sub_2 AS () -- maybe 5,000 > > , sub_3 AS () - - maybe 3,000 > > SELECT full_set.* > > , COALESCE(sub_1.field, FALSE) > > , COALESCE(sub_2.field, FALSE) > > , COALESCE(sub_2.field, FALSE) > > FROM full_set > > LEFT JOIN sub_1 > > LEFT JOIN sub_2 > > LEFT JOIN sub_3 > > > The performance of this query is exponential due to the fact that the > > sub-queries/CTEs are not indexed and so each subset has to be scanned > > completely for each record in the full set. > > Surely not. Neither merge nor hash joins require an index. What plan is > getting selected? Are you sure there's at most one match in each "sub" set > for each row in the "full" set? If you were getting a large number of matches > in some cases, the size of the result could balloon to something unfortunate > ... but we have not got enough information to know. > > regards, tom lane The final result, in this case would have 8,500 records AND sub_1.field would be TRUE for basically all of them and FALSE for the minimal remainder sub_2.field would be TRUE for 5,000 of them and FALSE for 3,500 of them sub_3.field would be TRUE for 3,000 of them and FALSE for 5,500 of them There is never, in reality, two records in a sub-table for a single record in the master table. It is possible a record exists in a sub-table but not in the main table but I do not care about those (thus the LEFT instead of a FULL OUTER JOIN). I have attached a scrubbed query and explain/analyze. Let me know if something more is needed. I have included two versions of the query, one using CTE and the other using mostly sub-selects. I had run ANALYZE on the pertinent tables but the CTE queries all perform quite quickly when run by themselves. In looking at the source tables for the data I did notice that I have not properly defined the relevant INDEXes as being UNIQUE. This applies to two of the sub-tables. The third sub-table requires the use of "DISTINCT". The joining columns with each set of data are unique when fed into the LEFT JOIN. The master CTE/Query is generated via a function call and it also generates unique keys for the LEFT JOIN. Thank you for your help! David J.
Вложения
В списке pgsql-general по дате отправления: