[PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes
От | Marti Raudsepp |
---|---|
Тема | [PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes |
Дата | |
Msg-id | CABRT9RCN+KTb5eHHkHFmdsioJ6qjGJMubOuMfMHq_esTqqDPqA@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: [PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes
|
Список | pgsql-hackers |
Hi list, Since PostgreSQL 9.1, GIN has new cost estimation code. This code assumes that the only expression type it's going to see is OpExpr. However, ScalarArrayOpExpr has also been possible in earlier versions. Estimating col <op> ANY (<array>) queries segfaults in 9.1 if there's a GIN index on the column. Case in point: create table words (word text); create index on words using gin (to_tsvector('english', word)); explain analyze select * from words where to_tsvector('english', word) @@ any ('{foo}'); (It seems that RowCompareExpr and NullTest clauses are impossible for a GIN index -- at least my efforts to find such cases failed) Attached is an attempted fix for the issue. I split out the code for the extract call and now run that for each array element, adding together the average of (partialEntriesInQuals, exactEntriesInQuals, searchEntriesInQuals) for each array element. After processing all quals, I multiply the entries by the number of array_scans (which is the product of all array lengths) to get the total cost. This required a fair bit of refactoring, but I tried to follow the patterns for OpExpr pretty strictly -- discounting scans over NULL elements, returning 0 cost when none of the array elements can match, accounting for cache effects when there are multiple scans, etc. But it's also possible that I have no idea what I'm really doing. :) I also added regression tests for this to tsearch and pg_trgm. Regards, Marti
Вложения
В списке pgsql-hackers по дате отправления: