Re: PoC/WIP: Extended statistics on expressions
From | Tomas Vondra |
---|---|
Subject | Re: PoC/WIP: Extended statistics on expressions |
Date | |
Msg-id | 958870c8-65e0-31b1-4591-b0b10e807dd9@enterprisedb.com |
In reply to | Re: PoC/WIP: Extended statistics on expressions (Dean Rasheed <dean.a.rasheed@gmail.com>) |
Responses | Re: PoC/WIP: Extended statistics on expressions |
List | pgsql-hackers |
On 12/11/20 1:58 PM, Dean Rasheed wrote:
> On Tue, 8 Dec 2020 at 12:44, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
>>
>> Possibly. But I don't think it's worth the extra complexity. I don't
>> expect people to have a lot of overlapping stats, so the amount of
>> wasted space and CPU time is expected to be fairly limited.
>>
>> So I don't think it's worth spending too much time on this now. Let's
>> just do what you proposed, and revisit this later if needed.
>>
>
> Yes, I think that's a reasonable approach to take. As long as the
> documentation makes it clear that building MCV stats also causes
> standard expression stats to be built on any expressions included in
> the list, then the user will know and can avoid duplication most of
> the time. I don't think there's any need for code to try to prevent
> that -- just as we don't bother with code to prevent a user building
> multiple indexes on the same column.
>
> The only case where duplication won't be avoidable is where there are
> multiple MCV stats sharing the same expression, but that's probably
> quite unlikely in practice, and it seems acceptable to leave improving
> that as a possible future optimisation.
>

OK. Attached is an updated version, reworking it this way.

I tried tweaking the grammar to differentiate these two syntax variants,
but that led to shift/reduce conflicts with the existing ones. I tried
fixing that, but I ended up doing that in CreateStatistics().

The other thing is that we probably can't tie this to just MCV, because
functional dependencies need the per-expression stats too. So I simply
build expression stats whenever there's at least one expression.

I also decided to keep the "expressions" statistics kind - it's not
allowed to specify it in CREATE STATISTICS, but it's useful internally
as it allows deciding whether to build the stats in a single place.
Otherwise we'd need to do that every time we build the statistics, etc.

I added a brief explanation to the sgml docs, not sure if that's good
enough - maybe it needs more details.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
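
For readers following the thread, the rule described in the message (build
per-expression stats whenever the statistics object contains at least one
expression, and reject the internal "expressions" kind if a user names it
explicitly) could be sketched in C roughly as below. This is only an
illustration under assumptions: the flag names and resolve_stat_kinds() are
hypothetical and are not taken from the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit flags for illustration; not the identifiers used in the patch. */
enum
{
    KIND_NDISTINCT    = 1 << 0,
    KIND_DEPENDENCIES = 1 << 1,
    KIND_MCV          = 1 << 2,
    KIND_EXPRESSIONS  = 1 << 3   /* internal only, never user-specified */
};

/*
 * Decide once, in a single place, which kinds of statistics to build.
 *
 * requested       - kinds the user listed explicitly
 * num_expressions - number of expressions in the statistics definition
 * resolved        - output: kinds that will actually be built
 *
 * Returns false if the user tried to request the internal kind directly.
 */
bool
resolve_stat_kinds(unsigned requested, int num_expressions, unsigned *resolved)
{
    assert(resolved != NULL);

    /* The "expressions" kind may not be named by the user. */
    if (requested & KIND_EXPRESSIONS)
        return false;

    *resolved = requested;

    /*
     * Not tied to MCV alone: any definition with at least one expression
     * gets per-expression stats, since other kinds need them too.
     */
    if (num_expressions > 0)
        *resolved |= KIND_EXPRESSIONS;

    return true;
}
```

With this shape, code that later builds the statistics only has to test the
resolved flags, rather than re-deriving the decision each time.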