Re: [GENERAL] Creation of tsearch2 index is very slow
От | Martijn van Oosterhout |
---|---|
Тема | Re: [GENERAL] Creation of tsearch2 index is very slow |
Дата | |
Msg-id | 20060120223327.GH31908@svana.org обсуждение исходный текст |
Ответ на | Re: [GENERAL] Creation of tsearch2 index is very slow (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-performance |
On Fri, Jan 20, 2006 at 04:50:17PM -0500, Tom Lane wrote: > (hemdistcache calls hemdistsign --- I think gprof is doing something > funny with tail-calls here, and showing hemdistsign as directly called > from gtsvector_picksplit when control really arrives through hemdistcache.) It may be the compiler. All these functions are declared static, which gives the compiler quite a bit of leeway to rearrange code. > The bulk of the problem is clearly in this loop, which performs O(N^2) > comparisons to find the two entries that are furthest apart in hemdist > terms: Ah. A while ago someone came onto the list asking about bit strings indexing[1]. If I'd known tsearch worked like this I would have pointed him to it. Anyway, before he went off to implement it he mentioned "Jarvis-Patrick clustering", whatever that means. Probably more relevent was this thread[2] on -hackers a while back with pseudo-code[3]. How well it works, I don't know, it worked for him evidently, he went away happy... [1] http://archives.postgresql.org/pgsql-general/2005-11/msg00473.php [2] http://archives.postgresql.org/pgsql-hackers/2005-11/msg01067.php [3] http://archives.postgresql.org/pgsql-hackers/2005-11/msg01069.php Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them.
Вложения
В списке pgsql-performance по дате отправления: