Re: [GENERAL] Re: [SQL] bad select performance fixed by forbidding hash joins
От | Leon |
---|---|
Тема | Re: [GENERAL] Re: [SQL] bad select performance fixed by forbidding hash joins |
Дата | |
Msg-id | 3796142C.B577425D@udmnet.ru обсуждение исходный текст |
Ответ на | Re: [SQL] bad select performance fixed by forbidding hash joins (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-general |
Tom Lane wrote: > > The right long-term solution is to figure out why the system is > misestimating the relative costs of the two plans, and fix the cost > estimates. (The system is estimating that the mergejoin is about 4x > slower than hash; if it's really 8x faster, there is something pretty > broken about the estimate...) > I am pleased that someone is taking seriously the speed of joins problem, and here is a word or two of mine to say. The main feature of SQL as a language is it's generality. It doesn't tell how to make things. It only expresses the result desired. This is a strong side, because programmer doesn't have to tell the procedure, and it's at the same time the weakness. This is weakness because the problem of deciding on a way is handed over to the server, since there seems theoretically no way to estimate the cost of every query without doing that query. It is proven very easily. Suppose you have a join with qualification 'where a = 10'. Having statistics of how many there are rows with a == 10 you can tell the right way. Then suppose you have join with qualification 'table1.a = table2.b'. This is much much more difficult to be estimated in the right way, although still possible. If you have qualification of the kind 'SOME_WEIRD_FUNCTION(table1.a) = ANOTHER_WEIRD_FUNCTION(table2.b)' it becomes impossible to tell the right way. There is a fundamental flaw in SQL which makes it difficult to realize in practice with high efficiency. The only way to solve the problem generally is to commit a sin against purism and start giving hints to server. I don't state that there is no way of improving concrete Postgres optimizer. Definitely there is. As far as I can see, now it's optimizer is too pessimistic about sizes of the results of joins, it overestimates them badly. If you make it think that it is going to get fewer rows from a query, it will behave better. Maybe you should even get a state variable, setting of which could regulate it's degree of optimism in joins. But all that is quick and dirty hack. General way is hints, however. -- Leon. --------- "This may seem a bit weird, but that's okay, because it is weird." - Perl manpage.
В списке pgsql-general по дате отправления: