Re: using extended statistics to improve join estimates
От | Tomas Vondra |
---|---|
Тема | Re: using extended statistics to improve join estimates |
Дата | |
Msg-id | 2ed75657-e084-9539-c6de-597e5675014c@enterprisedb.com обсуждение исходный текст |
Ответ на | using extended statistics to improve join estimates (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Ответы |
Re: using extended statistics to improve join estimates
|
Список | pgsql-hackers |
Hi, Here's a slightly improved / cleaned up version of the PoC patch, removing a bunch of XXX and FIXMEs, adding comments, etc. The approach is sound in principle, I think, although there's still a bunch of things to address: 1) statext_compare_mcvs only really deals with equijoins / inner joins at the moment, as it's based on eqjoinsel_inner. It's probably desirable to add support for additional join types (inequality and outer joins). 2) Some of the steps are performed multiple times - e.g. matching base restrictions to statistics, etc. Those probably can be cached somehow, to reduce the overhead. 3) The logic of picking the statistics to apply is somewhat simplistic, and maybe could be improved in some way. OTOH the number of candidate statistics is likely low, so this is not a big issue. 4) statext_compare_mcvs is based on eqjoinsel_inner and makes a bunch of assumptions similar to the original, but some of those assumptions may be wrong in multi-column case, particularly when working with a subset of columns. For example (ndistinct - size(MCV)) may not be the number of distinct combinations outside the MCV, when ignoring some columns. Same for nullfract, and so on. I'm not sure we can do much more than pick some reasonable approximation. 5) There are no regression tests at the moment. Clearly a gap. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Вложения
В списке pgsql-hackers по дате отправления: