Re: fts, compond words?
From | Marcus Engene |
---|---|
Subject | Re: fts, compond words? |
Date | |
Msg-id | 439D6F99.7070809@engene.se |
In response to | Re: fts, compond words? (Marcus Engene <mengpg@engene.se>) |
Responses | Re: fts, compond words? |
List | pgsql-general |
> That's a simple case, but what about languages such as Norwegian or German? They
> have compound words, and the ispell dictionary can split them into lexemes.
> But usually there is more than one variant of separation:
>
> forbruksvaremerkelov
> forbruk vare merke lov
> forbruk vare merkelov
> forbruk varemerke lov
> forbruk varemerkelov
> forbruksvare merke lov
> forbruksvare merkelov
> (Note: I don't know the translation, it's just an example. When we were working on
> compound word support we found a word which has 24 variants of
> separation!!)
>
> So the query 'a + forbruksvaremerkelov' will be awful:
>
> a + ( (forbruk & vare & merke & lov) | (forbruk & vare & merkelov) | ... )
>
> Of course, that is an example just off the top of my head, but a solution for phrase
> search should work reasonably with such corner cases.

(Sorry for replying in the wrong place in the thread; I was away on a trip and unsubscribed in the meantime.)

I'm a Swede, and Swedish is similar to Norwegian and German. Take this example:

    lång hårig kvinna
    långhårig kvinna

Words are put together to make a new word with a different meaning. The first example means "tall hairy woman" and the second means "woman with long hair". If I were on, e.g., a dating site, I'd want the distinction. ;-) If not, I would have to enter both strings:

    ("lång hårig" | långhårig) & kvinna

...which is perfectly acceptable. IMHO, I don't see any point in splitting these words.

Let's go back to the subject. What about a syntax like this:

    idxfti @@ to_tsquery('default', 'pizza & (Chicago | [New York])')

I.e., the exact-match string is always atomic. Wouldn't that be doable without any logical implications?

Best regards,
Marcus
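
To make the quoted concern concrete, here is a minimal Python sketch of how the number of separations, and with it the expanded query, can grow. It is not how the ispell dictionary in tsearch2 works: the vocabulary `VOCAB`, the helper names `split_compound` and `to_or_of_ands`, and the rule that a linking "s" may sit between parts are all assumptions made for illustration.

```python
from typing import List, Set

# Hypothetical vocabulary, taken from the parts listed in the quoted example.
VOCAB: Set[str] = {
    "forbruk", "vare", "merke", "lov",
    "merkelov", "varemerke", "varemerkelov", "forbruksvare",
}


def split_compound(word: str, vocab: Set[str], linker: str = "s") -> List[List[str]]:
    """Return every way to cut `word` into vocabulary entries.

    Assumption: a linking element (here "s") may appear between two parts
    and is dropped from the result.
    """
    if not word:
        return [[]]  # exactly one way to split the empty string: no parts
    results: List[List[str]] = []
    for end in range(1, len(word) + 1):
        head = word[:end]
        if head not in vocab:
            continue
        rest = word[end:]
        # Splits of the remainder taken as-is.
        tails = split_compound(rest, vocab, linker)
        # Splits of the remainder after skipping a linking element.
        if rest.startswith(linker) and len(rest) > len(linker):
            tails = tails + split_compound(rest[len(linker):], vocab, linker)
        for tail in tails:
            results.append([head] + tail)
    return results


def to_or_of_ands(variants: List[List[str]]) -> str:
    """Render the variants as an OR of AND groups, as in the quoted query."""
    return " | ".join("(" + " & ".join(parts) + ")" for parts in variants)


VARIANTS = split_compound("forbruksvaremerkelov", VOCAB)
for parts in VARIANTS:
    print(" ".join(parts))
print(to_or_of_ands(VARIANTS))
```

With this vocabulary the sketch yields the six separations listed in the quoted message. Every extra dictionary entry that matches a substring multiplies the alternatives, which is why an atomic exact-match token, as proposed above, sidesteps the expansion entirely.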