RE: [HACKERS] Proposed cleanup of index-related planner estimation procedures
From | Hiroshi Inoue |
---|---|
Subject | RE: [HACKERS] Proposed cleanup of index-related planner estimation procedures |
Date | |
Msg-id | 000501bf58a3$88a213a0$2801007e@tpf.co.jp |
In response to | Proposed cleanup of index-related planner estimation procedures (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
> -----Original Message-----
> From: owner-pgsql-hackers@postgreSQL.org
> [mailto:owner-pgsql-hackers@postgreSQL.org]On Behalf Of Tom Lane
>
> I am thinking about redefining and simplifying the planner's interface
> to index-type-dependent estimation routines.
>
> Currently, each defined index type is supposed to supply two routines:
> an "amopselect" routine and an "amopnpages" routine. (The existing
> actual routines of this kind are btreesel, btreenpages, etc in
> src/backend/utils/adt/selfuncs.c.) These things are called by
> index_selectivity() in src/backend/optimizer/util/plancat.c. amopselect
> tries to determine the selectivity of an indexqual (the fraction of
> main-table tuples it will select) and amopnpages tries to determine
> the number of index pages that will be read to do it.
>
> Now, this collection of code is largely redundant with
> optimizer/path/clausesel.c, which also tries to estimate the selectivity
> of qualification conditions. Furthermore, the interface to these
> routines is fundamentally misdesigned, because there is no way to deal
> with interrelated qualification conditions --- for example, if we have
> a range query like "... WHERE x > 10 AND x < 20", the code estimates
> the selectivity as the product of the selectivities of the two terms
> independently, but the actual selectivity is very different from that.
> I am working on fixing clausesel.c to be smarter about correlated
> conditions, but it won't do much good to fix that code without fixing
> the index-related code.
>
> What I'm thinking about doing is replacing these two per-index-type
> routines with a single routine, which is called once per proposed
> indexscan rather than once per qual clause. It would receive the
> whole indexqual list as a parameter, instead of just one qual.
> A typical implementation would just call clausesel.c's general-purpose
> code to estimate the selectivity, and then do a little bit of extra
> work to derive the estimated number of index pages from that number.
>

Seems good to me.
I have also been suspicious about per-qual selectivity and have
another example.
For the following query

    select * from .. where col1=val1 and col2=val2;

the selectivity is currently computed as

    selectivity of (col1=val1) * selectivity of (col2=val2)

But that is not right in many cases, because the two columns
are often correlated.
Though it's almost impossible to hold disbursions for all combinations
of columns, it may be possible to hold multi-column disbursions for
multi-column indexes.

Regards.

Hiroshi Inoue
Inoue@tpf.co.jp
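The two cases raised in this thread (a range query on one column, and equality quals on two correlated columns) can be illustrated with a small numeric sketch. This is not PostgreSQL code: the function names, the toy rows, and the `s1 + s2 - 1` range formula are assumptions made for illustration. The range formula holds only when both conditions are on the same column, the lower bound is below the upper bound, and every row satisfies at least one of the two conditions.

```python
from typing import Callable, Iterable, Sequence, Tuple

Row = Tuple[str, str]  # hypothetical (col1, col2) pair


def independent_selectivity(selectivities: Iterable[float]) -> float:
    """Multiply per-clause selectivities, i.e. assume the clauses are independent."""
    product = 1.0
    for s in selectivities:
        product *= s
    return product


def range_selectivity(sel_above_low: float, sel_below_high: float) -> float:
    """Selectivity of 'x > low AND x < high' on a single column.

    Assumes low < high, so every row passes at least one of the two
    conditions; the rows passing both are then s1 + s2 - 1 of the table.
    The result is clamped at 0 in case the inputs are inconsistent.
    """
    return max(0.0, sel_above_low + sel_below_high - 1.0)


def measured_selectivity(rows: Sequence[Row],
                         predicate: Callable[[Row], bool]) -> float:
    """Fraction of rows for which the predicate is true."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if predicate(row)) / len(rows)


# Toy table in which col2 is fully determined by col1 (illustrative data only).
ROWS = [("Tokyo", "100"), ("Tokyo", "100"), ("Osaka", "530"), ("Osaka", "530")]

sel_col1 = measured_selectivity(ROWS, lambda r: r[0] == "Tokyo")
sel_col2 = measured_selectivity(ROWS, lambda r: r[1] == "100")
sel_both = measured_selectivity(ROWS, lambda r: r[0] == "Tokyo" and r[1] == "100")
sel_product = independent_selectivity([sel_col1, sel_col2])
```

For x spread uniformly over 0..100, "x > 10" selects about 0.9 of the rows and "x < 20" about 0.2. The product gives roughly 0.18, while `range_selectivity(0.9, 0.2)` gives roughly 0.10. In the toy table, the product of the two equality selectivities is 0.25, but the combined qual actually selects half of the rows, which is the kind of gap that multi-column statistics would address.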