Re: genomic locus
From | Teodor Sigaev |
---|---|
Subject | Re: genomic locus |
Date | |
Msg-id | 8995e58b-80a9-7f8a-f552-a12d77550a74@sigaev.ru |
In reply to | Re: genomic locus (Gene Selkov <selkovjr@gmail.com>) |
List | pgsql-hackers |
> I think I can wrangle this type into GiST just by tweaking consistent(),
> union(), and picksplit(), if I manage to express my needs in C without breaking
> too many things. My first attempt segfaulted.

Actually, consistent() can determine the actual query data type from the strategy number. See the examples in ltree and intarray.

> If all goes to plan, I will end up with an index tree partitioned by contig at
> the top level and geometrically down from there. That will be as close as I can
> get to an array of config-specific indices, without having to store data in
> separate tables.
>
> What do you think of that?

I have some doubt that you can distinguish the root page, but it is possible to distinguish leaf pages; intarray and tsearch do that.

Reading your plan, I found an idea for GIN: the key for GIN is a pair of (contig, one genome position). So any search for an intersect operation will actually be a range search from (contig, start) to (contig, end).

> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> I have a low-level technical question. Because I can’t anticipate the maximum
> length of contig names (and do not want to waste space), I have made the new
> locus type a varlena, like this:
>
> #include "utils/varlena.h"
>
> typedef struct LOCUS
> {
>   int32 l_len_; /* varlena header (do not touch directly!) */
>   int32 start;
>   int32 end;
>   char  contig[FLEXIBLE_ARRAY_MEMBER];
> } LOCUS;
>
> #define LOCUS_SIZE(str) (offsetof(LOCUS, contig) + sizeof(str))

sizeof? Or strlen?

> That flexible array member messes with me every time I need to copy it while
> deriving a new locus object from an existing one (or from a pair). What I ended
> up doing is this:
>
>   LOCUS *l = PG_GETARG_LOCUS_P(0);
>   LOCUS *new_locus;
>   char  *contig;
>   int    size;
>   new_locus = (LOCUS *) palloc0(sizeof(*new_locus));
>   contig = pstrdup(l->contig); // need this to determine the length of contig name at runtime

l->contig must be null-terminated for pstrdup(), but if it is, you don't need to pstrdup() it - you could use l->contig directly below. BTW, LOCUS_SIZE should add 1 byte for the '\0' character in this case.

>   size = LOCUS_SIZE(contig);
>   SET_VARSIZE(new_locus, size);
>   strcpy(new_locus->contig, contig);
>
> Is there a more direct way to clone a varlena structure (possibly assigning an
> differently-sized contig to it)? One that is also memory-safe?

Store the length of contig in the LOCUS struct.

--
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                      WWW: http://www.sigaev.ru/
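
A minimal sketch of the last suggestion above (keep the contig length in the struct and size the allocation from it), written as standalone C so it compiles without PostgreSQL headers. Everything here is an assumption made for illustration: the names Locus, contig_len, locus_make, locus_clone and locus_with_range are hypothetical, malloc() stands in for palloc0(), and the plain vl_len field stands in for the varlena header that a real extension would set with SET_VARSIZE() and read with VARSIZE(). Note that the macro takes a length rather than a string, which avoids applying sizeof to a char pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Stand-in for the locus type (hypothetical layout). In a real extension
 * the first field is the varlena header; here it is a plain int32_t.
 */
typedef struct Locus
{
    int32_t vl_len;      /* total size in bytes (varlena header stand-in) */
    int32_t start;
    int32_t end;
    int32_t contig_len;  /* length of contig, excluding the trailing '\0' */
    char    contig[];    /* FLEXIBLE_ARRAY_MEMBER in PostgreSQL */
} Locus;

/* Bytes needed for a locus whose contig name has len characters, plus '\0'. */
#define LOCUS_SIZE(len) (offsetof(Locus, contig) + (size_t) (len) + 1)

/* Build a locus from explicit parts; returns NULL if allocation fails. */
Locus *
locus_make(const char *contig, size_t contig_len, int32_t start, int32_t end)
{
    size_t size = LOCUS_SIZE(contig_len);
    Locus *l = malloc(size);

    if (l == NULL)
        return NULL;
    l->vl_len = (int32_t) size;
    l->start = start;
    l->end = end;
    l->contig_len = (int32_t) contig_len;
    memcpy(l->contig, contig, contig_len);
    l->contig[contig_len] = '\0';
    return l;
}

/* Exact clone: the stored total size makes this a single memcpy. */
Locus *
locus_clone(const Locus *src)
{
    Locus *dst = malloc((size_t) src->vl_len);

    if (dst != NULL)
        memcpy(dst, src, (size_t) src->vl_len);
    return dst;
}

/* Derive a locus on the same contig with new coordinates. */
Locus *
locus_with_range(const Locus *src, int32_t start, int32_t end)
{
    assert(start <= end);
    return locus_make(src->contig, (size_t) src->contig_len, start, end);
}
```

With the length stored, neither pstrdup() nor strlen() is needed when deriving a new value, and assigning a differently sized contig is just another call to the constructor with the new name and its length. In PostgreSQL itself the total size is already available from the varlena header, so the contig length could arguably be derived from it instead of stored; the explicit field simply follows the suggestion in the reply.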