Обсуждение: [Fwd: binary tree query]
I have a biochemist telling me that this query below is a typical one for crawling through a taxonomic tree and that this is how I should represent some peptide information we have. Is there anyone on this list familiar with such data? I am weak in the science department but this query looks like it might not be the most efficient approach.
I have not been able to run an explain analyze yet as the database structure and data are not in place yet. We are just in the planning stages right now.
Any comments, suggestions, concerns, etc. would be much appreciated. Would an experienced DBA recommend a different approach? Can anyone offer some insight into the usefulness of INNER joins and the use of BETWEEN? I am concernec about performance as well since I expect this table to get large.
Thanks,
Jodi
I have not been able to run an explain analyze yet as the database structure and data are not in place yet. We are just in the planning stages right now.
Any comments, suggestions, concerns, etc. would be much appreciated. Would an experienced DBA recommend a different approach? Can anyone offer some insight into the usefulness of INNER joins and the use of BETWEEN? I am concernec about performance as well since I expect this table to get large.
SELECT count(*) FROM taxon_name INNER JOIN taxon AS tax_b USING(taxon_id) INNER JOIN taxon AS tax_v ON (tax_v.left_id BETWEEN tax_b.left_id AND tax_b.right_id ) INNER JOIN annot ON (tax_v.taxon_id = annot.taxon_id) INNER JOIN protein ON (protein.prot_id= annot.prot_id)WHERE annot.pref = 1 AND taxon_name.taxon_id=207245 ;
Thanks,
Jodi
--
_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter@virginia.edu
Hi Jodi, I believe the technique they are using is representing a tree as nested sets. It requires that the database is built with properly nested left_id's and right_id's, and this technique is often regarded as a efficient means for retrieving hierachical information. Here is one reference about it: http://www.geocrawler.com/archives/3/6/2001/10/0/6961775/ As for the inner join clauses they are in effect analogous to where's but the optimizer handles them differently. That is where (pardon the pun) my understanding dwindles. Perhaps someone else has better insight about inner join's vs where's. Hope this sheds some light and not too many shadows, yuji ---- On Mon, 26 Jan 2004, Jodi Kanter wrote: > I have a biochemist telling me that this query below is a typical one > for crawling through a taxonomic tree and that this is how I should > represent some peptide information we have. Is there anyone on this list > familiar with such data? I am weak in the science department but this > query looks like it might not be the most efficient approach. > I have not been able to run an explain analyze yet as the database > structure and data are not in place yet. We are just in the planning > stages right now. > > Any comments, suggestions, concerns, etc. would be much appreciated. > Would an experienced DBA recommend a different approach? Can anyone > offer some insight into the usefulness of INNER joins and the use of > BETWEEN? I am concernec about performance as well since I expect this > table to get large. > > SELECT count(*) > FROM taxon_name > INNER JOIN taxon AS tax_b USING(taxon_id) > INNER JOIN taxon AS tax_v ON (tax_v.left_id BETWEEN > tax_b.left_id AND tax_b.right_id ) > INNER JOIN annot ON (tax_v.taxon_id = annot.taxon_id) > INNER JOIN protein ON (protein.prot_id= annot.prot_id) > WHERE annot.pref = 1 > AND taxon_name.taxon_id=207245 > ; > > > > Thanks, > Jodi > > -- > > /_______________________________ > //Jodi L Kanter > BioInformatics Database Administrator > University of Virginia > (434) 924-2846 > jkanter@virginia.edu <mailto:jkanter@virginia.edu>/ > > > > / / > > / / > > Yuji Shinozaki Computer Systems Senior Engineer ys2n@virginia.edu Advanced Technologies Group (434)924-7171 Information Technology & Communication http://www.people.virginia.edu/~ys2n University of Virginia
Thanks for the article. It did help some but I am still not sure if representing a tree is best served using a self referential table. Wouldn't it be better to separate it into several tables?
Can anyone else comment? Anyone have experience storing data that is hierarchical in nature (e.g. a tree formation).
Jodi
Yuji Shinozaki wrote:
Can anyone else comment? Anyone have experience storing data that is hierarchical in nature (e.g. a tree formation).
Jodi
Yuji Shinozaki wrote:
Hi Jodi, I believe the technique they are using is representing a tree as nested sets. It requires that the database is built with properly nested left_id's and right_id's, and this technique is often regarded as a efficient means for retrieving hierachical information. Here is one reference about it: http://www.geocrawler.com/archives/3/6/2001/10/0/6961775/ As for the inner join clauses they are in effect analogous to where's but the optimizer handles them differently. That is where (pardon the pun) my understanding dwindles. Perhaps someone else has better insight about inner join's vs where's. Hope this sheds some light and not too many shadows, yuji ---- On Mon, 26 Jan 2004, Jodi Kanter wrote:I have a biochemist telling me that this query below is a typical one for crawling through a taxonomic tree and that this is how I should represent some peptide information we have. Is there anyone on this list familiar with such data? I am weak in the science department but this query looks like it might not be the most efficient approach. I have not been able to run an explain analyze yet as the database structure and data are not in place yet. We are just in the planning stages right now. Any comments, suggestions, concerns, etc. would be much appreciated. Would an experienced DBA recommend a different approach? Can anyone offer some insight into the usefulness of INNER joins and the use of BETWEEN? I am concernec about performance as well since I expect this table to get large. SELECT count(*) FROM taxon_name INNER JOIN taxon AS tax_b USING(taxon_id) INNER JOIN taxon AS tax_v ON (tax_v.left_id BETWEEN tax_b.left_id AND tax_b.right_id ) INNER JOIN annot ON (tax_v.taxon_id = annot.taxon_id) INNER JOIN protein ON (protein.prot_id= annot.prot_id)WHERE annot.pref = 1 AND taxon_name.taxon_id=207245 ; Thanks, Jodi -- /_______________________________ //Jodi L Kanter BioInformatics Database Administrator University of Virginia (434) 924-2846 jkanter@virginia.edu <mailto:jkanter@virginia.edu>/ / / / /Yuji Shinozaki Computer Systems Senior Engineer ys2n@virginia.edu Advanced Technologies Group (434)924-7171 Information Technology & Communication http://www.people.virginia.edu/~ys2n University of Virginia ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster
--
_______________________________
Jodi L Kanter
BioInformatics Database Administrator
University of Virginia
(434) 924-2846
jkanter@virginia.edu