Re: [HACKERS] extended statistics: n-distinct
From | Alvaro Herrera |
---|---|
Subject | Re: [HACKERS] extended statistics: n-distinct |
Date | |
Msg-id | 20170322210345.zoqj4tmdyoh23mxm@alvherre.pgsql |
In response to | Re: [HACKERS] extended statistics: n-distinct (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses | Re: extended statistics: n-distinct |
List | pgsql-hackers |
Kyotaro HORIGUCHI wrote:
> At Mon, 20 Mar 2017 16:02:20 -0300, Alvaro Herrera <alvherre@2ndquadrant.com> wrote in <20170320190220.ixlaueanxegqd5gr@alvherre.pgsql>
> > This is a new thread to present a version of the n-distinct patch that
> > IMO is close enough to commit. There are some work items still.
> > There's some discussion on the topic of cross-column statistics:
> > https://wiki.postgresql.org/wiki/Cross_Columns_Stats
> >
> > This problem is important enough that Kyotaro Horiguchi submitted
> > another patch that does the same thing:
> > https://www.postgresql.org/message-id/flat/20150828.173334.114731693.horiguchi.kyotaro%40lab.ntt.co.jp
> > This patch aims to provide the same functionality, keeping the design
> > general enough that other kinds of statistics can be added later (such
> > as functional dependencies, histograms and MCVs, all of which have been
> > previously submitted as patches by Tomas).
>
> I may be stupid but I don't get the picture here, specifically
> about the relation to Tomas's patch. Does this work as
> infrastructure for Tomas's mv patch? Or in some other
> relationship?

Well, this patch is Tomas' first patch, which I've reviewed and
reworked -- I changed some things that weren't properly finished,
cleaned up the code, made it all more robust, and made sure the sane
cases work sanely while the others are rejected promptly (rather than
throwing bogus error messages at a later time, or crashing).

I didn't review your own n-distinct patch. I don't think there's any
common code, but it would be very useful if you could try your test
scenarios and make sure they are handled sanely by this patch.

Regarding your question:

> Are you planning to realize correcting estimation of joins
> perplexed by strong correlations?

There is a later patch in Tomas' series which I would like to get to
before PG10 closes, but it's not this patch. It needs to be rebased on
top of this one.

Attached is v30, which includes some more cleanup. Detailed commits can
be seen here:
https://github.com/2ndQuadrant/postgres/commits/dev/mvstats-ndistinct

In particular, this includes code from Tomas to consider mixing
ndistinct estimates from multiple multivariate statistic objects, which
is better than the old approach of only using the estimate when a
perfect match was found. However, I lobotomized Tomas' selfuncs.c code
and I need to revert that part before pushing -- essentially I removed
examine_variable() processing, which seemed a bit on the expensive side
for what we were doing, but that was a silly mistake.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services