indexed regex select optimisation missing?
От | Stuart Woolford |
---|---|
Тема | indexed regex select optimisation missing? |
Дата | |
Msg-id | 99110410093200.00683@test.macmillan.co.nz обсуждение исходный текст |
Ответ на | Problem: pq_recvbuf: unexpected EOF of client connection ("Natalya S. Makushina" <mak@rtsoft.msk.ru>) |
Ответы |
Re: [GENERAL] indexed regex select optimisation missing?
|
Список | pgsql-general |
Hello people. Please bear with me, as I think I may have found either a bug or 'missing optimisation feature' in postgresql 6.5.2.. I'm trying to use postgresql 6.5.2 to implement (amongst other things) a searchable word index, ie: I have a table called 'inv_word_i' which contains the fields create table inv_word_i ( key char(10), word char(16), count int ) ; and index create index i3 on inv_word_i ( word ); and currently has 1,654,662 entries (1.6 million), now, I currently search this with: select key from inv_word_i where word='whatever' order by count desc ; and this is fast, however, if I use: select key from inv_word_i where word~'^whatever.*' order by count desc ; it is very slow. the explains for both are: query 1: Sort (cost=67.82 rows=1057 width=16) -> Index Scan using i3 on inv_word_i (cost=67.82 rows=1057 width=16) query 2: Sort (cost=35148.70 rows=353 width=16) -> Index Scan using i3 on inv_word_i (cost=35148.70 rows=353 width=16) now, the explain on a query for ~'.*whatever.*' gives a cost of 70000. now, the documentation says that the index will be used for a regex query that has the left side tied (by using the ^ start of line match), and it seems to be doing this, but it then scans the whole rest of the file, not using the obvious optimisation of stoppping as soon as the non-variable left part of the regex is no longer matched. I have verified this, as the above ^ based regex search takes a long time for a word starting with z, and a massive time for one starting this a, and a linear difference as I progress through the alphabet. Is there any reason why this optimisation is not used? it seems like a rather important one to myself, and would speed up queries of this type by a massive amount (on average, the same amount as using the search to locate the start of the search scan) I've downloaded the source, and will start looking into this, but having not touched the postgresql source before, I'm not holding my breath fort a quick resolution from myself, any ideas? -- ------------------------------------------------------------ Stuart Woolford, stuartw@newmail.net Unix Consultant. Software Developer. Supra Club of New Zealand. ------------------------------------------------------------
В списке pgsql-general по дате отправления: