Re: Full text search bug ('russian' regconfig)
From | egocenter |
---|---|
Subject | Re: Full text search bug ('russian' regconfig) |
Date | |
Msg-id | 9210322148.20200220112124@yandex.ru |
In response to | Re: Full text search bug ('russian' regconfig) (Artur Zakirov <zaartur@gmail.com>) |
List | pgsql-bugs |
Hello, Artur!

Thanks for the answer. OK, but it's strange that only one word is affected that way (as if two lexemes existed for one word)...

* I call to_tsvector twice to eliminate duplicate words. In the example below, ts_title = 'histori':2 'watcom':1,3, and it gives 2 entries for 'город - watcom' via ts_rank_cd.

I need to count UNIQUE word entries, but there seems to be no way to do that with the standard functionality. I see two options: a custom ts_rank function, OR to_tsvector / editing the tsvector to leave only the first position for 'watcom': ts_title = 'histori':2 'watcom':1.

If you have any ideas about this situation, I would highly appreciate it!

Thanks in advance)

---------

SELECT
    round((ts_rank_cd(ts_title, web_query_or)/0.1)::NUMERIC, 0) AS title_entries_count, -- 2, but should be 1
    *
FROM
    (SELECT
        to_tsvector('russian', 'watcom history | watcom') AS ts_title,
        websearch_to_tsquery('russian', REPLACE('город - watcom', '- ' , '')) AS web_query_and, -- the dash is removed to prevent its conversion into negation (!)
        REPLACE(websearch_to_tsquery(:reg_config, REPLACE('город - watcom', '- ' , ''))::TEXT, '&', '|')::tsquery AS web_query_or
    ) AS main;

--

> Hello
> On 2/19/2020 5:35 PM, egocenter wrote:
>> Text search doesn't work correct with the EQUAL string in text and query (russian dictionary config),
>> as you can see in example ts_vector receives different from ts_query lexemes for identical text:
>>
>> tsv = 'дан':1 'магазин':2 'нужн':3 'посеща':4 'точн':5
>> tsq = 'нужн' & 'точн' & 'дан' & 'посещаем' & 'магазин'
> It is because you call to_tsvector() two times. 'russian' is a Snowball
> dictionary and it uses stemming algorithms to cut words ending. Your
> query works if to_tsvector() isn't called twice on the same text:
> =# SELECT
>     web_query_and @@ ts_title,
>     web_query_and @@ 'зачем нужны точные данные о посещаемости магазинов',
>     *
> FROM
>     (SELECT
>         to_tsvector('russian', 'зачем нужны точные данные о посещаемости магазинов') AS ts_title,
>         websearch_to_tsquery('russian', 'зачем нужны точные данные о посещаемости магазинов?') AS web_query_and
>     ) AS main;
> It gives 'true' for the first column.
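
The second option mentioned in the message (editing the tsvector so that each lexeme keeps only its first position) could be sketched in plain SQL roughly as follows. This is only an illustration, not a tested solution: it assumes PostgreSQL 9.6 or later, where unnest(tsvector) is available, and the alias ts_title_unique is a made-up name. Weights are dropped, lexemes with no positions (for example after strip()) are skipped, and lexemes containing quotes or backslashes may need extra care.

```sql
-- Rebuild a tsvector so that every lexeme keeps only its first position.
-- unnest(tsvector) returns one row per lexeme: (lexeme, positions, weights).
-- The intent is to turn 'histori':2 'watcom':1,3 into 'histori':2 'watcom':1.
SELECT string_agg(
           quote_literal(lexeme) || ':' || positions[1],  -- e.g. 'watcom':1
           ' '
       )::tsvector AS ts_title_unique                      -- hypothetical alias
FROM unnest(to_tsvector('russian', 'watcom history | watcom'));
```

The resulting value still carries positional information, which ts_rank_cd requires, so it could be passed to ts_rank_cd in place of ts_title in the query above.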