Re: Selectivity estimation for inet operators
От | Emre Hasegeli |
---|---|
Тема | Re: Selectivity estimation for inet operators |
Дата | |
Msg-id | 20141202211400.GB3990@hasegeli.com обсуждение исходный текст |
Ответ на | Re: Selectivity estimation for inet operators (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Selectivity estimation for inet operators
|
Список | pgsql-hackers |
> Actually, there's a second large problem with this patch: blindly > iterating through all combinations of MCV and histogram entries makes the > runtime O(N^2) in the statistics target. I made up some test data (by > scanning my mail logs) and observed the following planning times, as > reported by EXPLAIN ANALYZE: > > explain analyze select * from relays r1, relays r2 where r1.ip = r2.ip; > explain analyze select * from relays r1, relays r2 where r1.ip && r2.ip; > > stats target eqjoinsel networkjoinsel > > 100 0.27 ms 1.85 ms > 1000 2.56 ms 167.2 ms > 10000 56.6 ms 13987.1 ms > > I don't think it's necessary for network selectivity to be quite as > fast as eqjoinsel, but I doubt we can tolerate 14 seconds planning > time for a query that might need just milliseconds to execute :-( > > It seemed to me that it might be possible to reduce the runtime by > exploiting knowledge about the ordering of the histograms, but > I don't have time to pursue that right now. I make it break the loop when we passed the last possible match. Patch attached. It reduces the runtime almost 50% with large histograms. We can also make it use only some elements of the MCV and histogram for join estimation. I have experimented with reducing the right and the left hand side of the lists on the previous versions. I remember it was better to reduce only the left hand side. I think it would be enough to use log(n) elements of the right hand side MCV and histogram. I can make the change, if you think selectivity estimation function is the right place for this optimization.
Вложения
В списке pgsql-hackers по дате отправления: