Re: Should we update the random_page_cost default value?
| From | Jim Nasby |
|---|---|
| Subject | Re: Should we update the random_page_cost default value? |
| Date | |
| Msg-id | CAMFBP2omp3v6pcThHJnW7zjYak6Dqob-__QY-OLuJfqNPyY6fA@mail.gmail.com |
| In reply to | Re: Should we update the random_page_cost default value? (Bruce Momjian <bruce@momjian.us>) |
| List | pgsql-hackers |
I definitely agree with Tom's idea that the costing model needs a re-think, if for no other reason than that the large volume of discussion on this thread has (AFAICT) reached precious few conclusions. Along those lines, the two biggest problems I see with the current state of affairs are:
1) We blindly take the lowest-cost plan, regardless of how close #2 is. More importantly, we do nothing to model how accurate the cost estimate itself is. In the realm of OLTP, stability is FAR, *FAR* more important than absolute efficiency. Short of forcing specific query plans, the next best thing for stability would be to consider the estimated accuracy of the estimates: I'd much rather have a plan with a 2x higher cost estimate but 90% confidence than a plan that's half the cost but with only 10% confidence. (A rough sketch of what I mean follows after point 2.)
When it comes to OLAP, it might be useful to have the ability to start executing several different query plans and monitor how well reality matches the cost estimates (though nodes that have to consume all their input before producing any tuples, such as sorts, would be a bit of a challenge). If a query is going to run for 30 minutes, letting a few copies of it run for 30 seconds to evaluate things could well be a fruitful tradeoff.
2) We have essentially no ability to actually measure how well things are working. The thread has shown that even building controlled tests is difficult, given all the moving parts in the executor that we have little visibility into (just one example: the actual cost of deforming tuples). Getting useful data from users' systems is at least as difficult: even an EXPLAIN (ANALYZE, VERBOSE, BUFFERS) doesn't tell you anything about how the planner decided on a plan, or what the alternatives looked like (and it's also missing the data you'd need to know whether the cost coefficients were anywhere close to reasonable).
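To make point 1 concrete, here's a minimal sketch of what risk-adjusted plan selection could look like if every path carried a confidence value alongside its cost estimate. This is purely illustrative Python; no such confidence number exists in the planner today, and the adjustment formula is invented:

```python
from dataclasses import dataclass

@dataclass
class CandidatePlan:
    name: str
    est_cost: float    # planner's cost estimate, in cost units
    confidence: float  # hypothetical 0..1 estimate of how trustworthy est_cost is

def pick_plan(plans, risk_aversion=1.0):
    """Prefer stable plans: penalize cheap estimates we don't trust.

    With risk_aversion=0 this degenerates to today's behavior of
    blindly taking the lowest estimate.
    """
    def adjusted(p):
        return p.est_cost * (1.0 + risk_aversion * (1.0 - p.confidence) / p.confidence)
    return min(plans, key=adjusted)

plans = [
    CandidatePlan("index scan", est_cost=100.0, confidence=0.10),  # cheap but dubious
    CandidatePlan("seq scan",   est_cost=200.0, confidence=0.90),  # 2x cost, trusted
]
print(pick_plan(plans).name)  # -> "seq scan" for any meaningful risk_aversion
```

The exact penalty function matters far less than having *some* measure of estimate accuracy to feed into it.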
Speaking of cost coefficients... I'm kinda doubtful of any effort to change the default of random_page_cost that doesn't also look at the cpu_* GUCs. Even assuming a sequential page read takes 0.01ms (so one unit of seq_page_cost corresponds to 10us), the default cpu_tuple_cost of 0.01 works out to 100ns to process a tuple. Even if that's reasonable, the default cpu_operator_cost of 0.0025 works out to 25ns per operator, which seems pretty high even for text comparisons (and is obviously absurd for int or float).
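Spelling that arithmetic out (the 0.01ms page read is just my assumption above; the GUC values are the stock defaults):

```python
# If one unit of seq_page_cost (1.0) corresponds to a 0.01ms (10us)
# sequential 8KB page read, the CPU cost GUCs imply these real times:
seq_page_us = 10.0          # assumed: 0.01ms per sequentially read page

cpu_tuple_cost    = 0.01    # stock default
cpu_operator_cost = 0.0025  # stock default

print(f"cpu_tuple_cost    -> {cpu_tuple_cost * seq_page_us * 1000:.0f} ns/tuple")  # 100 ns
print(f"cpu_operator_cost -> {cpu_operator_cost * seq_page_us * 1000:.0f} ns/op")  # 25 ns
```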
Given the pain involved with trying to manage all these coefficients, I do wonder if there's any possible way to determine them empirically on any given system, perhaps using clock cycle counts. That could potentially help with #2 (and might also allow estimating what percentage of IO is coming from a cache outside of shared buffers, as opposed to real IO).
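As a strawman for what "empirically" might mean, something along these lines (Python standing in for what would really have to be C reading the TSC or clock_gettime(); the scratch-file approach and all the sizes here are made up, and as written this mostly measures the OS cache rather than real IO):

```python
import time

def ns_per_op(n=1_000_000):
    """Crudely measure one comparison-like 'operator' invocation."""
    xs = list(range(n))
    start = time.perf_counter_ns()
    acc = 0
    for x in xs:
        if x > acc:              # stand-in for one operator call
            acc = x
    return (time.perf_counter_ns() - start) / n

def ns_per_seq_page(path="/tmp/calib.dat", pages=10_000, page_size=8192):
    """Crudely measure sequential read time per 8KB page."""
    with open(path, "wb") as f:  # build a scratch file to scan
        f.write(b"\0" * page_size * pages)
    start = time.perf_counter_ns()
    with open(path, "rb") as f:
        while f.read(page_size):
            pass
    return (time.perf_counter_ns() - start) / pages

# Express the operator cost relative to seq_page_cost = 1.0:
print(f"suggested cpu_operator_cost ~= {ns_per_op() / ns_per_seq_page():.4f}")
```

Comparing timed runs against a deliberately cold cache (or against known device throughput) is also roughly how you'd back out what fraction of "IO" is really coming from a cache outside shared buffers.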
On Fri, Oct 31, 2025 at 10:41 AM Bruce Momjian <bruce@momjian.us> wrote:
In thinking more about how we can improve random_page_cost, I now realize
that we will never perfectly model the caching effects, and even if we
did, the possible time gap between planning and execution (e.g., prepared
queries) would render it unreliable anyway.
I think the best we could do is to model how _volatile_ the cache is,
meaning how often we are removing and adding items to the shared buffer
cache, and somehow mix that into the random_page_cost value. That value
might be fairly stable over a longer period of time.
In my experience on OLTP systems, that volatility is effectively quite low, because these systems generally start behaving quite poorly once the critical "working set" of data starts to exceed cache (shared_buffers + filesystem cache). That certainly may not be true at the level of shared_buffers alone, and of course over time the specific pages living in cache will slowly change.
One important caveat to that though: it only applies to the (usually rather small) set of queries that are run all the time by the application. Infrequently run queries can have radically different behavior, but their performance is generally not very critical (as long as they don't impact the rest of the system). 
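For what it's worth, the kind of "mixing" proposed above might look something like this. The eviction-rate input, the trust discount, and every constant here are invented purely for illustration:

```python
def effective_random_page_cost(base_rpc=4.0, cached_rpc=1.0,
                               hit_ratio=0.95, evictions_per_sec=0.0,
                               churn_scale=1000.0):
    """Blend random_page_cost between 'all cached' and 'all real IO'.

    High cache churn means the measured hit_ratio is less trustworthy,
    so we lean back toward the uncached cost.
    """
    churn = min(1.0, evictions_per_sec / churn_scale)  # 0 = stable, 1 = thrashing
    trust = (1.0 - churn) * hit_ratio                  # discount hits by volatility
    return trust * cached_rpc + (1.0 - trust) * base_rpc

print(effective_random_page_cost(evictions_per_sec=50))   # stable cache -> ~1.29
print(effective_random_page_cost(evictions_per_sec=900))  # thrashing    -> ~3.72
```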