Multiple grouping set specs referencing duplicate alias
От | David Kimura |
---|---|
Тема | Multiple grouping set specs referencing duplicate alias |
Дата | |
Msg-id | CAHnPFjSdFx_TtNpQturPMkRSJMYaD5rGP2=8iFH9V24-OjHGiQ@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: Multiple grouping set specs referencing duplicate alias
|
Список | pgsql-hackers |
Hi all! I think I may have stumbled across a case of wrong results on HEAD (same through version 9.6, though interestingly 9.5 produces different results altogether). test=# SELECT i AS ai1, i AS ai2 FROM generate_series(1,3)i GROUP BY ai2, ROLLUP(ai1) ORDER BY ai1, ai2; ai1 | ai2 -----+----- 1 | 1 | 1 2 | 2 | 2 3 | 3 | 3 (6 rows) I had expected: ai1 | ai2 -----+----- 1 | 1 2 | 2 3 | 3 | 1 | 2 | 3 (6 rows) It seems to me that the plan is missing a Sort node (on ai1 and ai2) above the Aggregate node. QUERY PLAN ------------------------------------------------ GroupAggregate Group Key: i, i Group Key: i -> Sort Sort Key: i -> Function Scan on generate_series i I have a hunch part of the issue may be an assumption that a duplicate aliased column will produce the same column values and hence isn't included in the range table, nor subsequently the pathkeys. However, that assumption does not seem to be true for queries with multiple grouping set specifications: test=# SELECT i as ai1, i as ai2 FROM (values (1),(2),(3)) v(i) GROUP BY ai1, ROLLUP(ai2); ai1 | ai2 -----+----- 1 | 1 2 | 2 3 | 3 1 | 2 | 3 | (6 rows) It seems to me that excluding the duplicate alias from the pathkeys is leading to a case where the group order is incorrectly determined to satisfy the sort order. Thus create_ordered_paths() decides against applying an explicit sort node. But simply forcing an explicit sort still seems wrong since we've effectively lost a relevant column for the sort. I tinkered a bit and hacked together an admittedly ugly patch that triggers an explicit sort constructed from the parse tree. An alternative approach I had considered was to update the rewriteHandler to explicitly force the existence of the duplicate alias column in the range tables. But that also felt meh. Does this seem like a legit issue? And if so, any thoughts on alternative approaches? Thanks, David Kimura
Вложения
В списке pgsql-hackers по дате отправления: