Re: [HACKERS] No: implied sort with group by
От | Thomas G. Lockhart |
---|---|
Тема | Re: [HACKERS] No: implied sort with group by |
Дата | |
Msg-id | 34CEDF54.85CBAAB@alumni.caltech.edu обсуждение исходный текст |
Ответ на | Re: [HACKERS] No: implied sort with group by (darrenk@insightdist.com (Darren King)) |
Список | pgsql-hackers |
> > > > This is what I think is missing or broken right now. > > > > > > > > > > select * from t1; > > > > > a b c > > > > > 1 x > > > > > 2 x > > > > > 3 z > > > > > 2 x > > > > > > > > > > 4 row(s) retrieved. > > > > > > select b,c,sum(a) from t1 group by b,c; > > > > > b c (sum) > > > > > > > > > > x 5 > > > > > z 3 > > > > >> 2 row(s) retrieved. > > > > > > Sorry, I've lost the thread. What is broken? I get this same result, and > > > (assuming that column "b" is full of nulls) I think this the correct result. > > > > At one point, it was thought that NULLs shouldn't be grouped, but I > > backed out the patch. There is a problem with GROUP BY on large > > datasets, and Vadim knows the cause, and will work on it later. > > Different from the grouping by NULLs issue... > > The above results are from Sybase. If these same four rows are inserted into > postgres, the second query will return three rows. Something like... > > b|c|sum(a) > |x|3 > |z|3 > |x|2 > > It does this not because of the null values of column b, but because the data is > not sorted before getting to the group by node if the user does not explicitly put > an order by in the query. IMHO, postgres should put an arbitrary sort node in the > tree so that the data can be properly grouped as the group node iterates over it. > > And even if I put an "order by c" clause in there, I still get three rows, they're > just properly sorted. :) Not necessarily true; as I said, I get the same result as above (with the 980112 source tree; have things changed since??). Perhaps you are running into the sorting problem which seemed to be present on larger tables only? - Tom postgres=> select b,c,sum(a) from t1 group by b,c; b|c|sum -+-+--- |x| 5 |z| 3 (2 rows) postgres=> select * from t1; a|b|c -+-+- 1| |x 2| |x 2| |x 3| |z (4 rows)
В списке pgsql-hackers по дате отправления: