Re: Adding a suffix array index
From | Troels Arvin |
---|---|
Subject | Re: Adding a suffix array index |
Date | |
Msg-id | pan.2004.11.19.12.55.48.92702@arvin.dk |
In response to | Adding a suffix array index (Troels Arvin <troels@arvin.dk>) |
List | pgsql-hackers |
On Fri, 19 Nov 2004 14:38:20 +0200, Hannu Krosing wrote:

>> Part of my current code concerns packing DNA characters: As the alphabet
>> of DNA strings is very small (four characters), it seems like a
>> straightforward optimization to store each character in two bits.
>
> My advice would be to get it to work first, optimize later.

Valid point. However, I needed something rather basic to work on, to get to know C and to get to know PostgreSQL in a user-defined type context. But if packing proves to be a problem when implementing the interesting stuff, then thanks, and yes: packing should be an afterthought.

>> My first and most immediate goal is to support efficient answering of a
>> question like "which rows contain the sequence TTGACCACTTG in column foo?".
>
> If you store your sequences as strings, you may try to use trigrams (or
> modify them to 4,5,6 or 7-grams ;) to get some feel how that works.
>
> trigram module is in contrib/pg_trgm.

(/me Printing readme.) Thanks.

-- 
Greetings from Troels Arvin, Copenhagen, Denmark
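
For readers following the thread, the two-bit packing idea discussed above can be sketched roughly as below. This is only an illustration, not the code from the post: the function names (`dna_code`, `dna_packed_size`, `dna_pack`, `dna_unpack`), the A=0, C=1, G=2, T=3 encoding, and the choice to fill each byte from its low bits upward are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: map a base to a 2-bit code, or -1 if it is not A/C/G/T.
 * The A=0, C=1, G=2, T=3 assignment is an assumption for this sketch. */
static int dna_code(char base)
{
    switch (base) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

/* Bytes needed to hold n bases at four bases per byte. */
static size_t dna_packed_size(size_t n)
{
    return (n + 3) / 4;
}

/* Pack n bases from seq into out, which must hold dna_packed_size(n) bytes.
 * Base i occupies bits 2*(i%4) and 2*(i%4)+1 of byte i/4.
 * Returns 0 on success, -1 if seq contains a character other than A/C/G/T. */
static int dna_pack(const char *seq, size_t n, uint8_t *out)
{
    assert(seq != NULL && out != NULL);
    memset(out, 0, dna_packed_size(n));
    for (size_t i = 0; i < n; i++) {
        int code = dna_code(seq[i]);
        if (code < 0)
            return -1;
        out[i / 4] |= (uint8_t)(code << (2 * (i % 4)));
    }
    return 0;
}

/* Unpack n bases from in into seq, which must hold n + 1 bytes
 * (the extra byte is for the terminating NUL). */
static void dna_unpack(const uint8_t *in, size_t n, char *seq)
{
    static const char bases[4] = { 'A', 'C', 'G', 'T' };
    assert(in != NULL && seq != NULL);
    for (size_t i = 0; i < n; i++)
        seq[i] = bases[(in[i / 4] >> (2 * (i % 4))) & 3];
    seq[n] = '\0';
}
```

With this layout the 11-character example sequence TTGACCACTTG fits in 3 bytes instead of 11. Note that the base count has to be stored alongside the packed bytes, since unused trailing bits are indistinguishable from the code for A.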