Log Message:
-----------
This patch improves the planner.
------------------------------------
Tom Lane wrote:
> Dawid Kuroczko <qnex42@gmail.com> writes:
>> qnex=# EXPLAIN SELECT * FROM log NATURAL JOIN useragents LIMIT 1;
>> Limit (cost=15912.20..15912.31 rows=1 width=272)
>> -> Hash Join (cost=15912.20..5328368.96 rows=47044336 width=272)
>> If I set enable_hashjoin=false:
>> qnex=# EXPLAIN ANALYZE SELECT * FROM log NATURAL LEFT JOIN useragents LIMIT 1;
>> Limit (cost=0.00..3.07 rows=1 width=272) (actual time=74.214..74.216
>> rows=1 loops=1)
>> -> Nested Loop Left Join (cost=0.00..144295895.01 rows=47044336
>> width=272) (actual time=74.204..74.204 rows=1 loops=1)
> This is quite strange. The nestloop plan definitely should be preferred
> in the context of the LIMIT, considering that it has far lower estimated
> cost. And it is preferred in simple tests for me.
After a suitable period of contemplating my navel, I figured out
what is going on here: the total costs involved are large enough that
the still-fairly-high startup cost of the hash is disregarded by
compare_fuzzy_path_costs(), and so the nestloop is discarded as not
having any significant potential advantage in startup time.
I think that this refutes the original scheme of using the same fuzz
factor for both startup and total cost comparisons, and therefore
propose the attached patch.
------------------------------------
Modified Files:
--------------
bizgres/postgresql/src/backend/optimizer/util:
pathnode.c (r1.2 -> r1.3)
(http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/bizgres/bizgres/postgresql/src/backend/optimizer/util/pathnode.c.diff?r1=1.2&r2=1.3)