Re: Nested loop does not preserve order. Why?
От | Tom Lane |
---|---|
Тема | Re: Nested loop does not preserve order. Why? |
Дата | |
Msg-id | 19836.1392132971@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Nested loop does not preserve order. Why? (Alexey Bashtanov <bashtanov@imap.cc>) |
Список | pgsql-bugs |
Alexey Bashtanov <bashtanov@imap.cc> writes: > Let us have a table T(A, B), an index on T(B, A) and a much smaller > table V(B). > And we are looking for all rows of T that have B present in V and sorted > by B, A. > The most natural way to obtain them would be to take all records from V > ordered by B, and to perform an index-only scan on T for each B. > But not for postgresql :( AFAICS, what you're wishing is that given a plan like nestloop Join condition: x = y -> path ordered by x -> path ordered by y, z the planner would realize that the output of the nestloop is ordered by y,z. Right now it only realizes that the output is ordered by x (and hence also by y); this means it thinks it needs an additional sort step. As stated, though, this deduction about ordering is false. To make it true, you need the additional assumption that the outer relation is unique in x (plus assumptions about the equality operators in question all being compatible with the sort ordering). Right now, the planner doesn't really track uniqueness so there's no practical way to do this. There's a patch in the current commitfest that wants to add tracking of unique sort orders, so if that gets in we might have appropriate infrastructure. But even then, this seems like a mighty narrow and unusual corner case; I can't recall ever seeing a request for such behavior before. So I doubt we'd do it unless the deduction turns out to be basically free, which seems unlikely. Having said that, there does seem to be something fishy going on here. If I add an index on v.b to your example, and drop the explicit request to sort on a, I can get this: EXPLAIN select b, a from u where b in (select b from v) order by b; QUERY PLAN ----------------------------------------------------------------------------------------------- Sort (cost=13802.17..14052.17 rows=100000 width=8) Sort Key: u.b -> Nested Loop (cost=8.89..4128.85 rows=100000 width=8) -> Unique (cost=8.45..8.50 rows=10 width=4) -> Sort (cost=8.45..8.48 rows=10 width=4) Sort Key: v.b -> Index Only Scan using v_b_idx on v (cost=0.14..8.29 rows=10 width=4) -> Index Only Scan using u_i on u (cost=0.43..312.04 rows=10000 width=8) Index Cond: (b = v.b) Planning time: 0.481 ms (10 rows) I'd have expected the planner to realize that neither of those sort steps are necessary. Will investigate. regards, tom lane
В списке pgsql-bugs по дате отправления: