Re: Erronous sort used in query plan
От | Tom Lane |
---|---|
Тема | Re: Erronous sort used in query plan |
Дата | |
Msg-id | 13828.1168194538@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Erronous sort used in query plan (Shane Ambler <pgsql@007Marketing.com>) |
Ответы |
Re: Erronous sort used in query plan
|
Список | pgsql-hackers |
Shane Ambler <pgsql@007Marketing.com> writes: > I am putting together searches on the catalog info and came up with a > select that was rather slow and I noticed that in the explain analyze > there is a sort step on one of the left joins which I don't think > belongs there. Well, it's certainly necessary in context because it's preparing the data for the merge join immediately above it. The correct question is why is the thing using a merge join here, when a hash join would be cheaper? I dug through this and found out that the hash join is estimated as cheaper, right up till the last step of cost_hashjoin: /* * Bias against putting larger relation on inside. We don't want an * absolute prohibition, though, since largerrelation might have better * bucketsize --- and we can't trust the size estimates unreservedly, * anyway. Instead,inflate the run cost by the square root of the size * ratio. (Why square root? No real good reason, but it seems * reasonable...) * * Note: before 7.4 we implemented this by inflating startup cost; but if * there's adisable_cost component in the input paths' startup cost, that * unfairly penalizes the hash. Probably it'd be betterto keep track of * disable penalty separately from cost. */ if (innerbytes > outerbytes && outerbytes > 0) run_cost *= sqrt(innerbytes / outerbytes); In this example, the data volume from the join of everything else is estimated as less than what needs to be fetched from pg_proc, and so this bias kicks in, and the cost estimate roughly doubles. Unfortunately, because it's a LEFT JOIN, we'll never consider hashjoin in the other direction and so the hash loses out to the mergejoin. It seems clear to me that we ought not impose a bias unless the join type is such that both directions of hashing are feasible. I wonder also if the bias is too large ... but there's not really evidence for or against that in this example. The point is that this code implicitly assumes both directions will be tried, and they won't. regards, tom lane
В списке pgsql-hackers по дате отправления: