Re: benchmarking the query planner
От | Robert Haas |
---|---|
Тема | Re: benchmarking the query planner |
Дата | |
Msg-id | 603c8f070812111904j4b839c17sabde937e3d4cf08a@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: benchmarking the query planner (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: benchmarking the query planner
|
Список | pgsql-hackers |
>> OK, I'll bite. How do we decide where to put the cutoff? If we make >> it too high, it will have a negative effect on join selectivity >> estimates; if it's too low, it won't really address the problem we're >> trying to fix. I randomly propose p = 0.001, which should limit >> eqjoinsel() to about a million equality tests in the worst case. In >> the synthetic example we were just benchmarking, that causes the >> entire MCV array to be tossed out the window (which feels about >> right). > > Yeah. One idle thought I had was that maybe the cutoff needs to > consider both probabilities: if the high-frequency MCVs on one side > chance to match to not-so-high-frequency MCVs on the other, you > would like to know about that. As long as we keep the lists in > frequency order, it seems easy to implement this: for each MCV > examined by the outer loop, you run the inner loop until the product of > the outer and current inner frequency drops below whatever your > threshold is. This doesn't immediately suggest what the threshold I had this idle thought too, but I didn't write it down because... > ought to be, but it seems like it ought to be possible to determine > that given a desired maximum error in the overall estimate. I'm also > not very clear on what the "total frequency" computations (matchfreq2 > and unmatchfreq2 in the current code) ought to look like if we are using > a variable subset of the inner list. ...of this exact concern, which I think is an insurmountable problem. If you don't consider some of the MCVs AT ALL, I think you can add their frequencies back in to otherfreq{1,2} and go home, but if you consider them for some elements of the other list but not all, I'm not sure there's an appropriate way to proceed. ...Robert
В списке pgsql-hackers по дате отправления: