Re: using Tsearch2 for chemical text
From | Oleg Bartunov |
---|---|
Subject | Re: using Tsearch2 for chemical text |
Date | |
Msg-id | Pine.LNX.4.64.0707260950280.18739@sn.sai.msu.ru |
In reply to | using Tsearch2 for chemical text (Rajarshi Guha <rguha@indiana.edu>) |
List | pgsql-general |
On Wed, 25 Jul 2007, Rajarshi Guha wrote:

> Hi, I have a table with about 9M entries. The table has 2 fields: id and name
> which are of serial and text types respectively. I have an ordinary index on
> the text field which allows me to do searches in reasonable time. Most of my
> searches are of the form
>
> select * from mytable where name ~ 'some text query'
>
> I know that the Tsearch2 module will let me have very efficient text
> searches. But if I understand correctly, it's based on a language specific
> dictionary.

Wrong! It comes with dictionaries for some written human languages, but you
can write your very own dictionaries. A dictionary is just a C program.

>
> My problem is that the name column contains names of chemicals. Now for many
> cases this may simply be a number (1674-56-2) and in other cases it may be an
> alphanumeric string (such as (-)O-acetylcarnitine or
> 1,2-cis-dihydroxybenzoate). In some cases it is a well-known word (say viagra
> or calcium chloride or pentathol).
>
> My question is: will Tsearch2 be able to handle this type of text? Or will it
> be hampered by the fact that the bulk of the rows do not correspond to
> ordinary English

Oh, sure. See, for example, our dict_regex dictionary, which we use for
astronomical search:
http://lynx.sao.ru/~karpov/software/postgres_dict_regex.html
This is a work in progress, but it works.

>
> -------------------------------------------------------------------
> Rajarshi Guha <rguha@indiana.edu>
> GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
> -------------------------------------------------------------------
> My Ethicator machine must have had a built-in moral
> compromise spectral phantasmatron! I'm a genius."
> -Calvin
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
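To make the regex-dictionary idea in the reply above more concrete, here is a minimal Python sketch of how a pattern-based step might keep a chemical identifier such as 1674-56-2 together as a single search term instead of splitting it at the hyphens. It is not the dict_regex implementation and does not use any Tsearch2 API. The names `CAS_RE`, `is_valid_cas` and `tokenize` are hypothetical, and the sketch assumes that the purely numeric names are CAS Registry Numbers (2 to 7 digits, 2 digits, then one check digit).

```python
import re

# Assumption: numeric chemical names are CAS Registry Numbers,
# written as 2-7 digits, 2 digits and a single check digit.
CAS_RE = re.compile(r"\b(\d{2,7})-(\d{2})-(\d)\b")


def is_valid_cas(token: str) -> bool:
    """Return True if token has the CAS format and a correct check digit."""
    m = CAS_RE.fullmatch(token)
    if not m:
        return False
    digits = m.group(1) + m.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(m.group(3))


def _words(text: str) -> list:
    """Fallback: split on anything that is not a letter or a digit."""
    return [w.lower() for w in re.findall(r"[A-Za-z0-9]+", text)]


def tokenize(name: str) -> list:
    """Hypothetical tokenizer: valid CAS numbers stay whole, the rest is split."""
    tokens = []
    pos = 0
    for m in CAS_RE.finditer(name):
        tokens.extend(_words(name[pos:m.start()]))
        if is_valid_cas(m.group(0)):
            tokens.append(m.group(0))
        else:
            tokens.extend(_words(m.group(0)))
        pos = m.end()
    tokens.extend(_words(name[pos:]))
    return tokens


if __name__ == "__main__":
    for sample in ("1674-56-2", "calcium chloride", "1,2-cis-dihydroxybenzoate"):
        print(sample, "->", tokenize(sample))
```

In a real setup this logic would live in a custom dictionary or parser on the PostgreSQL side; the sketch only shows the kind of rule such a component could apply to the names in the original question.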