Re: tsearch parser inefficiency if text includes urls or emails - new version
From | Kevin Grittner |
---|---|
Subject | Re: tsearch parser inefficiency if text includes urls or emails - new version |
Date | |
Msg-id | 4B20D4F1020000250002D2F1@gw.wicourts.gov |
In response to | Re: tsearch parser inefficiency if text includes urls or emails - new version (Andres Freund <andres@anarazel.de>) |
Responses | Re: tsearch parser inefficiency if text includes urls or emails - new version |
List | pgsql-hackers |
Andres Freund <andres@anarazel.de> wrote:

> I think you see no real benefit, because your strings are rather
> short - the documents I scanned when noticing the issue where
> rather long.

The document I used in the test which showed the regression was 672,585 characters, containing 10,000 URLs.

> A rather extreme/contrived example:
> postgres=# SELECT 1 FROM to_tsvector(array_to_string(ARRAY(SELECT
> 'andres@anarazel.de http://www.postgresql.org/'::text FROM
> generate_series(1, 20000) g(i)), ' - '));

The most extreme of your examples uses a 979,996 character string, which is less than 50% larger than my test. I am, however, able to see the performance difference for this particular example, so I now have something to work with.

I'm seeing some odd behavior in terms of when there is what sort of difference. Once I can categorize it better, I'll follow up.

Thanks for the sample which shows the difference.

-Kevin