Re: tsearch parser inefficiency if text includes urls or emails - new version

From	Kevin Grittner
Subject	Re: tsearch parser inefficiency if text includes urls or emails - new version
Date	December 10, 2009, 13:01:24
Msg-id	4B20D4F1020000250002D2F1@gw.wicourts.gov
In reply to	Re: tsearch parser inefficiency if text includes urls or emails - new version (Andres Freund <andres@anarazel.de>)
Responses	Re: tsearch parser inefficiency if text includes urls or emails - new version
List	pgsql-hackers

Andres Freund <andres@anarazel.de> wrote:
> I think you see no real benefit, because your strings are rather
> short - the documents I scanned when noticing the issue were
> rather long.

The document I used in the test which showed the regression was
672,585 characters, containing 10,000 URLs.

> A rather extreme/contrived example:
> postgres=# SELECT 1 FROM to_tsvector(array_to_string(ARRAY(
>     SELECT 'andres@anarazel.de http://www.postgresql.org/'::text
>     FROM generate_series(1, 20000) g(i)), ' -  '));

The most extreme of your examples uses a 979,996 character string,
which is less than 50% larger than my test.  I am, however, able to
see the performance difference for this particular example, so I now
have something to work with.  I'm seeing some odd behavior in terms
of when a difference shows up and what kind of difference it is.
Once I can categorize it better, I'll follow up.
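
For reference, the 979,996 figure can be reproduced by measuring the generated string directly. The queries below are only a sanity-check sketch built from the quoted example, not part of the timing test itself; the column aliases doc_chars and size_ratio are made up for illustration.

```sql
-- Length of the string fed to to_tsvector() in the quoted example:
-- 20,000 copies of a 45-character fragment joined by the 4-character
-- separator ' -  ', i.e. 20000 * 45 + 19999 * 4 characters.
SELECT length(array_to_string(ARRAY(
    SELECT 'andres@anarazel.de http://www.postgresql.org/'::text
    FROM generate_series(1, 20000) g(i)), ' -  ')) AS doc_chars;
-- doc_chars = 979996

-- Size relative to the 672,585-character document mentioned above.
SELECT round(979996::numeric / 672585, 2) AS size_ratio;
-- size_ratio = 1.46
```

A ratio of about 1.46 is consistent with "less than 50% larger".
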
Thanks for the sample which shows the difference.
-Kevin
