Re: [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker)
От | Tom Lane |
---|---|
Тема | Re: [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker) |
Дата | |
Msg-id | 11787.1220580297@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker) (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Ответы |
Re: [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker)
|
Список | pgsql-patches |
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes: > After reading the wikipedia article on Boyer-Moore search algorithm, it > looks to me like this patch actually implements the simpler > Boyer-Moore-Horspool algorithm that only uses one lookup table. That's > probably fine, as it ought to be faster on small needles and haystacks > because it requires less effort to build the lookup tables, even though > the worst-case performance is worse. It should still be faster than what > we have now. Hmm. B-M-H has worst case search speed O(M*N) (where M = length of pattern, N = length of search string); whereas full B-M is O(N). Maybe we should build the second table when M is large? Although realistically that is probably gilding the lily, since frankly there haven't been many real-world complaints about the speed of these functions anyway ... > The skip table really should be constructed only once in > text_position_start and stored in TextPositionState. That would make a > big difference to the performance of those functions that call > text_position_next repeatedly: replace_text, split_text and text_to_array. +1. The heuristic about big a skip table to make may need some adjustment as well, since it seems to be considering only a single search. regards, tom lane
В списке pgsql-patches по дате отправления: