text_position worst case runtime
От | Mark Dilger |
---|---|
Тема | text_position worst case runtime |
Дата | |
Msg-id | e4j2qg$q3q$1@news.hub.org обсуждение исходный текст |
Ответы |
Re: text_position worst case runtime
|
Список | pgsql-hackers |
The function static int32 text_position(text *t1, text *t2, int matchnum) defined in src/backend/utils/adt/varlena.c uses repeated calls to strncmp (or pg_wchar_strncmp) to find the location of the pattern in the text. The worst case runtime for such an approach is O(n*m) where n and m are the lengths of the pattern and text. The best case would be O(n), I guess, because it only takes n comparisons to find the pattern at the very start of the text. I'm not sure how to determine the average case runtime, because it depends what your data looks like on average. The fastest worst-case algorithmic solutions (boyer-moore and similar) have an O(n+m) worst-case runtime. In practice, the current iterative strncmp solution may be faster than the boyer-moore solution for average data, because the data may not be constructed in such a way as to trigger worst-case or nearly worst-case run times. The current solution also has the advantage of not requiring a palloc of O(n) space for the pattern preprocessing that boyer-moore requires. My question is, would the core development team entertain the idea of changing this function if I supplied the new code? Thanks, mark
В списке pgsql-hackers по дате отправления: