Re: [HACKERS] having and union in v7beta
От | Tom Lane |
---|---|
Тема | Re: [HACKERS] having and union in v7beta |
Дата | |
Msg-id | 6366.951937607@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: [HACKERS] having and union in v7beta (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-hackers |
I wrote: > I compiled up current code with profiling enabled (make PROFILE=-pg > if you want to try it), and found that actually nearly all of the > runtime for > select * from comuni union select * from comuni > is spent in the sort step; so I was on the wrong track in guessing > that there might be a performance problem in the new Unique coding. Wait a second. Stop the presses. I see what's going on here. 6.5.*: play=> explain select * from comuni union select * from comuni; NOTICE: QUERY PLAN: Seq Scan on comuni (cost=512.00 rows=10000 width=84) 7.0beta1: regression=# explain select * from comuni union select * from comuni; NOTICE: QUERY PLAN: Unique (cost=149.66..184.66 rows=200 width=84) -> Sort (cost=149.66..149.66 rows=2000 width=84) -> Append (cost=0.00..40.00rows=2000 width=84) -> Seq Scan on comuni (cost=0.00..20.00 rows=1000 width=84) -> Seq Scan on comuni (cost=0.00..20.00 rows=1000 width=84) 7.0beta1's behavior is actually "correct", in the sense that it yields the SQL-approved result: the UNION implies a DISTINCT pass over its result, according to SQL, and 7.0beta1 is giving you a DISTINCT result. 6.5 is failing to generate the DISTINCT operation, because it incorrectly simplifies "select foo union select foo" into "select foo" if the two select queries are identical. (There is a TODO item for this.) So that's why 6.5 is a lot faster. But it gives the wrong answer. *However*, we have not fixed the bug that causes "select foo union select foo" to be incorrectly simplified --- the UNION code is still applying cnfify. (Which it probably shouldn't, but I haven't wanted to touch that code until I have the time to rewrite it completely.) The reason 7.0beta1 generates the "right" answer is that it has a recently-introduced bug in the comparison routines that causes it to think the two select subqueries aren't the same. I just fixed that bug, with the result that current CVS code is now back to mis-simplifying this query. (Yes, this is a step forward --- that bug could have caused the system to unify two queries that AREN'T the same, which would definitely be a bad thing...) So, thanks! You did indeed identify a bug! But you should expect that this query *will* get slower when we fix the other bug ;-). You should use a less silly test case for UNION if you want to make realistic performance comparisons across versions. regards, tom lane
В списке pgsql-hackers по дате отправления: