Using multiple extended statistics for estimates
От | Tomas Vondra |
---|---|
Тема | Using multiple extended statistics for estimates |
Дата | |
Msg-id | 20191028152048.jc6pqv5hb7j77ocp@development обсуждение исходный текст |
Ответы |
Re: Using multiple extended statistics for estimates
|
Список | pgsql-hackers |
Hi, PostgreSQL 10 introduced extended statistics, allowing us to consider correlation between columns to improve estimates, and PostgreSQL 12 added support for MCV statistics. But we still had the limitation that we only allowed using a single extended statistics per relation, i.e. given a table with two extended stats CREATE TABLE t (a int, b int, c int, d int); CREATE STATISTICS s1 (mcv) ON a, b FROM t; CREATE STATISTICS s2 (mcv) ON c, d FROM t; and a query SELECT * FROM t WHERE a = 1 AND b = 1 AND c = 1 AND d = 1; we only ever used one of the statistics (and we considered them in a not particularly well determined order). This patch addresses this by using as many extended stats as possible, by adding a loop to statext_mcv_clauselist_selectivity(). In each step we pick the "best" applicable statistics (in the sense of covering the most attributes) and factor it into the oveall estimate. All this happens where we'd originally consider applying a single MCV list, i.e. before even considering the functional dependencies, so roughly like this: while () { ... apply another MCV list ... } ... apply functional dependencies ... I've both in the loop, but I think that'd be wrong - the MCV list is expected to contain more information about individual values (compared to functional deps, which are column-level). regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
В списке pgsql-hackers по дате отправления: