Re: [PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes
От | Marti Raudsepp |
---|---|
Тема | Re: [PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes |
Дата | |
Msg-id | CABRT9RBP92ep4vzr3MHMOoMFP4GDmkTQ3fTmFnq-82X2o12rfA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: [PATCH] Fix ScalarArrayOpExpr estimation for GIN indexes
|
Список | pgsql-hackers |
On Tue, Dec 20, 2011 at 07:08, Tom Lane <tgl@sss.pgh.pa.us> wrote: > it'd likely be better if this code ignored unrecognized qual expression > types rather than Assert'ing they're not there. The patch replaced that Assert with an elog(ERROR) > Hmm. I am reminded of how utterly unreadable "diff -u" format is for > anything longer than single-line changes :-( ... Sorry, the new patch is in context (-C) diff format proper. I also moved around code a bit and removed an unused variable that was left around from the refactoring. > but I think I don't > like this refactoring much. Will take a closer look tomorrow. I was afraid you'd say that, especially for a change that should be backpatched. But I couldn't think of alternative ways to do it that give non-bogus estimates. ---- While writing this patch, the largest dilemma was where to account for the multiple array scans. Given that this code is mostly a heuristic and I lack a deep understanding of GIN indexes, it's likely that I got this part wrong. Currently I'm doing this: partialEntriesInQuals *= array_scans; exactEntriesInQuals *= array_scans; searchEntriesInQuals *= array_scans; Which seems to be the right thing as far as random disk accesses are concerned (successive scans are more likely to hit the cache) and also works well with queries that don't touch most of the index. But this fails spectacularly when multiple full scans are performed e.g. LIKE ANY ('{%,%,%}'). Because index_pages_fetched() ends up removing all of the rescan costs. Another approach is multiplying the total cost from the number of scans. This overestimates random accesses from rescans, but fixes the above case: *indexTotalCost = (*indexStartupCost + dataPagesFetched * spc_random_page_cost) * array_scans; Regards, Marti
Вложения
В списке pgsql-hackers по дате отправления: