>> Functions have very unequal CPU costs, and we're talking here about
>> using CPUs more effectively, why are costs being given the see-no-evil
>> treatment? This is as true in core as it is in PostGIS, even if our
>> case is a couple orders of magnitude more extreme: a filter based on a
>> complex combination of regex queries will use an order of magnitude
>> more CPU than one that does a little math, why plan and execute them
>> like they are the same?
>
> Functions have user assignable costs.
We have done a relatively bad job of globally costing our functions thus far, because it mostly didn't make any difference. In my testing [1], I found that costing could push better plans for parallel sequential scans and parallel aggregates, though only at very extreme cost values (1000 for sequential scans and 10000 for aggregates).
Obviously, if costs can make a difference for 9.6 and parallelism, we'll rigorously ensure we have good, useful costs. I've already costed many functions in my parallel PostGIS test branch [2]. Perhaps costing has been avoided so far because of its relatively nebulous definition: about the only guidance in the docs is "If the cost is not specified, 1 unit is assumed for C-language and internal functions, and 100 units for functions in all other languages. Larger values cause the planner to try to avoid evaluating the function more often than necessary."
So what about C functions then? Should a string comparison be 5 and a multiplication 1? An image histogram 1000?
We don't have a clear methodology for setting these values.
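Mechanically, assigning a cost is trivial; it's only choosing the number that is hard. A minimal sketch, with purely illustrative values and PostGIS function names standing in for any cheap or expensive C function:

    -- Illustrative values only, not recommendations.
    -- procost is a multiplier on cpu_operator_cost, so the default of 1
    -- treats every C function like a cheap arithmetic operator.
    ALTER FUNCTION st_x(geometry) COST 1;                       -- trivial accessor
    ALTER FUNCTION st_intersects(geometry, geometry) COST 1000; -- heavy predicate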
It's a single parameter to allow you to achieve the plans that work optimally. Hopefully that is simple enough for everyone to use and yet flexible enough to make a difference.
If it's not what you need, show us, and it may make the case for change.
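For example, the effect is easy to demonstrate with a before/after plan comparison; the table name and cost value below are only placeholders:

    -- Placeholder table and cost: the point is just that raising the cost of
    -- the filter function can flip the planner to a parallel sequential scan.
    EXPLAIN SELECT count(*) FROM roads WHERE st_isvalid(geom);

    ALTER FUNCTION st_isvalid(geometry) COST 1000;

    EXPLAIN SELECT count(*) FROM roads WHERE st_isvalid(geom);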
--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services