Re: [HACKERS] PATCH: multivariate histograms and MCV lists
От | Tomas Vondra |
---|---|
Тема | Re: [HACKERS] PATCH: multivariate histograms and MCV lists |
Дата | |
Msg-id | 49348328-1a02-7e45-e50f-e1d4d0243c5c@2ndquadrant.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] PATCH: multivariate histograms and MCV lists (Dean Rasheed <dean.a.rasheed@gmail.com>) |
Ответы |
Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Re: [HACKERS] PATCH: multivariate histograms and MCV lists |
Список | pgsql-hackers |
Hi Dean, Here is an updated patch (hopefully) fixing the bugs you've reported so far. In particular, it fixes this: 1) mostly harmless memset bug in UpdateStatisticsForTypeChange 2) passing the right list (stat_clauses) to mcv_clauselist_selectivity 3) corrections to a couple of outdated comments 4) handling of NOT clauses in MCV lists (and in histograms) The query you posted does not fail anymore, but there's a room for improvement. We should be able to handle queries like this: select * from foo where a=1 and not b=1; But we don't, because we only recognize F_EQSEL, F_SCALARLTSEL and F_SCALARGTSEL, but F_NEQSEL (which is what "not b=1" uses). Should be simple to fix, I believe. 5) handling of mcv_lowsel in statext_clauselist_selectivity I do believe the new behavior is correct - as I suspected, I broke this during the last rebase, where I also moved some stuff from the histogram part to the MCV part. I've also added the (sum of MCV frequencies), as you suggested. I think we could improve the estimate further by computing ndistinct estimate, and then using that to compute average frequency of non-MCV items. Essentially what var_eq_const does: if (otherdistinct > 1) selec /= otherdistinct; Not sure how to do that when there are not just equality clauses. BTW I think there's a bug in handling the fullmatch flag - it should not be passed to AND/OR subclauses the way it is, because then WHERE a=1 OR (a=2 AND b=2) will probably set it to 'true' because of (a=2 AND b=2). Which will short-circuit the statext_clauselist_selectivity, forcing it to ignore the non-MCV part. But that's something I need to look at more closely tomorrow. Another thing I probably need to do is to add more regression tests, protecting against bugs similar to those you found. Thanks for the feedback so far! regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
В списке pgsql-hackers по дате отправления: