Re: genomic locus
From | Teodor Sigaev |
---|---|
Subject | Re: genomic locus |
Date | |
Msg-id | abf4c37d-a307-9a70-1cf4-f00ff32dd15d@sigaev.ru |
In reply to | Re: genomic locus (Gene Selkov <selkovjr@gmail.com>) |
List | pgsql-hackers |
Hi!

> Problem 1. What is a union of ‘1:6000-7000’ and ‘X:10000-20000’? Intuitively, it
> should be NULL, however, I am not sure the method allows for that; it was
> developed for objects living in the same metric space. I have mechanistically
> reproduced the indexing methods of seg, but the resulting index is broken. All
> queries against an indexed table return a null result.

You can use a fixed-size signature for the string part (contig?) of the locus; it will help to find that string part by equality or intersection, if you need it. The signature of an inner key should be the bitwise OR of the signatures of all its child keys. Looking at the query example, it seems you need to build a signature tree and, if the signature is the same for two keys, take the start-end range into account.

> Problem 2. While the intersection (overlap, &c.) of any two loci produces
> obvious results, non-intersection does not. When I query for all loci not
> overlapping ‘1:6000-7000’, I expect to find all non-overlapping loci on contig
> 1. I don’t want the query to return anything from other contigs, because it is
> obvious that features on different contigs do not overlap. I may be able to fix
> that by making separate functions for non-overlaps and adding a constraint to
> them, but that seems like a kludge.

It seems you have two different non-overlap operations...

> Problem 3 (alternative to 1). I realize that any clustering can help build an
> efficient index, no matter how bizarre. So I could, for example, ignore the
> contigs altogether and build a single index tree, using only position
> co-ordinates and pretending that all positions are on the same contig; the
> question then is whether and how such lossy index will affect the ordering of
> query results. Can I use a separate function for ordering? I have yet to make an
> experiment. Not that this would be equivalent to indexing the attributes of a
> composite type separately (if I understood it correctly).

Ordering and GiST aren't very close things :) You can try having a look at the KNN-search feature of GiST.

https://www.pgcon.org/2010/schedule/events/227.en.html

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
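
The signature idea from the answer to Problem 1 could be sketched roughly as follows. This is only a Python illustration of the logic, not PostgreSQL GiST support code: the names `SIG_BITS`, `Key`, `contig_signature`, `leaf_key`, `union` and `may_overlap`, the 64-bit signature width, and the use of CRC32 as the hash are all assumptions made for the example. It shows that the union of keys from different contigs does not need to be NULL: the inner key ORs the child signatures and widens the range to cover all children.

```python
import zlib
from dataclasses import dataclass

SIG_BITS = 64  # assumed signature width; any fixed size would do


def contig_signature(contig: str) -> int:
    """Map a contig name to one bit of a fixed-size signature (assumed hash: CRC32)."""
    return 1 << (zlib.crc32(contig.encode("utf-8")) % SIG_BITS)


@dataclass(frozen=True)
class Key:
    sig: int    # bitwise OR of the contig signatures below this key
    start: int  # smallest start position below this key
    end: int    # largest end position below this key


def leaf_key(contig: str, start: int, end: int) -> Key:
    """Key for a single locus such as '1:6000-7000'."""
    return Key(contig_signature(contig), start, end)


def union(keys: list[Key]) -> Key:
    """Inner key: OR the child signatures and take the covering range."""
    sig = 0
    for k in keys:
        sig |= k.sig
    return Key(sig, min(k.start for k in keys), max(k.end for k in keys))


def may_overlap(key: Key, query: Key) -> bool:
    """Lossy check: True if the subtree under `key` might hold a locus overlapping `query`.

    It can return false positives (hash collisions, ranges widened by the union),
    so matching leaves still have to be rechecked against the real contig name.
    """
    shares_contig_bit = (key.sig & query.sig) != 0
    ranges_intersect = key.start <= query.end and query.start <= key.end
    return shares_contig_bit and ranges_intersect


a = leaf_key("1", 6000, 7000)
b = leaf_key("X", 10000, 20000)
inner = union([a, b])  # a valid inner key, not NULL
```

Because the inner key is lossy, a search descends into any subtree for which `may_overlap` is true and rejects false positives at the leaf level by comparing the actual contig string.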