Re: [HACKERS] <> join selectivity estimate question
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] <> join selectivity estimate question |
Date | |
Msg-id | 11738.1512322816@sss.pgh.pa.us |
In reply to | [HACKERS] <> join selectivity estimate question (Thomas Munro <thomas.munro@enterprisedb.com>) |
List | pgsql-hackers |
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> So, in that plan we saw anti-join estimate 1 row but really there were
> 13462. If you remove most of Q21 and keep just the anti-join between
> l1 and l3, then you try removing different quals, you can see the
> problem is not the <> qual:

> select count(*)
>   from lineitem l1
>  where not exists (
>     select *
>       from lineitem l3
>      where l3.l_orderkey = l1.l_orderkey
>        and l3.l_suppkey <> l1.l_suppkey
>        and l3.l_receiptdate > l3.l_commitdate
>  )
> => estimate=1 actual=8998304

ISTM this is basically another variant of ye olde column correlation problem. That is, we know there's always going to be an antijoin match for the l_orderkey equality condition, and that there's always going to be matches for the l_suppkey inequality, but what we don't know is that l_suppkey is correlated with l_orderkey so that the two conditions aren't satisfied at the same time. The same thing is happening on a smaller scale with the receiptdate/commitdate comparison.

I wonder whether the extended stats machinery could be brought to bear on this problem.

regards, tom lane
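To make the correlation argument concrete, here is a minimal Python sketch on toy data. It is not PostgreSQL's estimator: the function names, the data generator, and the simplified "multiply the clause selectivities" model are all assumptions made for illustration. The toy table is deliberately extreme (every row of an order shares one suppkey, which real TPC-H data does not guarantee), and the receiptdate/commitdate clause is left out.

```python
from collections import defaultdict


def build_rows(n_orders, rows_per_order, n_suppliers):
    """Toy lineitem rows as (orderkey, suppkey) pairs.

    Assumption for illustration: suppkey is a function of orderkey,
    so the two columns are perfectly correlated.
    """
    return [
        (o, o % n_suppliers)
        for o in range(n_orders)
        for _ in range(rows_per_order)
    ]


def actual_antijoin_count(rows):
    """Count rows with NO partner row having the same orderkey
    and a different suppkey (the anti-join from the query)."""
    supp_by_order = defaultdict(set)
    for orderkey, suppkey in rows:
        supp_by_order[orderkey].add(suppkey)
    # A row survives if its order contains no suppkey other than its own.
    return sum(1 for o, s in rows if supp_by_order[o] <= {s})


def independent_estimate(rows):
    """Simplified estimate that treats the two clauses as independent.

    Hypothetical model, not the planner's actual code:
      - the orderkey equality always finds a match (the row itself),
      - the <> clause has selectivity 1 - 1/ndistinct(suppkey),
      - the two are multiplied as if unrelated.
    """
    n = len(rows)
    n_distinct_supp = len({s for _, s in rows})
    eq_match = 1.0
    neq_sel = 1.0 - 1.0 / n_distinct_supp
    match_fraction = eq_match * neq_sel
    return n * (1.0 - match_fraction)


if __name__ == "__main__":
    rows = build_rows(n_orders=1000, rows_per_order=4, n_suppliers=100)
    print("actual  :", actual_antijoin_count(rows))
    print("estimate:", independent_estimate(rows))
```

With these parameters all 4000 rows survive the anti-join, while the independence-based figure is roughly 40. The gap has the same shape as the one in the report: each clause looks almost always satisfiable on its own, but on correlated data they are never satisfied together.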