Re: Comparing two slices within one table efficiently
От | chester c young |
---|---|
Тема | Re: Comparing two slices within one table efficiently |
Дата | |
Msg-id | 543743.27966.qm@web54307.mail.re2.yahoo.com обсуждение исходный текст |
Ответ на | Comparing two slices within one table efficiently ("Ken Simpson" <ksimpson@mailchannels.com>) |
Список | pgsql-sql |
> I have a table with the following simplified form: > > create table t ( > run_id integer, > domain_id integer, > mta_id integer, > attribute1 integer, > attribute2 integer, > unique(run_id, domain_id, mta_id) > ); > > The table has about 1 million rows with run_id=1, another 1 million > rows with run_id=2, and so on. > > I need to efficiently query the differences between "runs" - i.e. For > each (domain_id, mta_id) tuple in run 1, is there a coresponding > tuple in run 2 where either attribute1 or attribute2 have changed? > > The only way I have been able to think of doing this so far is an > o(n^2) search, which even with indexes takes a long time. e.g. > > select * from t t1 where exists (select 1 from t t2 where > t2.mta_id=t1.mta_id and t2.domain_id=t1.domain_id and (t2.attribute1 > != t1.attribute1 or t2.attribute2 != t1.attribute2) > > This query takes millenia... > first, add a change flag change_tf that is set through a trigger whether this record different from record in the previous run. second, create an index on domain and mta where change_tf, so you're only indexing changed records. this would allow you to find your changes very efficiently at the relatively small cost of adding one lookup and one extra index per insert. ____________________________________________________________________________________ Pinpoint customers who are looking for what you sell. http://searchmarketing.yahoo.com/
В списке pgsql-sql по дате отправления: