Re: BUG #16625: Query Optimizer - Performance bug related to removal of unnecessary aggregate function
От | David Rowley |
---|---|
Тема | Re: BUG #16625: Query Optimizer - Performance bug related to removal of unnecessary aggregate function |
Дата | |
Msg-id | CAApHDvqKE2MLsQidBtY8AaM+J9VWN5uct3dQ92HnfDKzc4f=cQ@mail.gmail.com обсуждение исходный текст |
Ответ на | BUG #16625: Query Optimizer - Performance bug related to removal of unnecessary aggregate function (PG Bug reporting form <noreply@postgresql.org>) |
Список | pgsql-bugs |
On Sun, 20 Sep 2020 at 03:53, PG Bug reporting form <noreply@postgresql.org> wrote: > We are developing a tool for automatically finding performance bugs in > PostgreSQL. Our key insight is that given a pair of semantic equivalent > queries, a robust DBMS should return the same result within a similar > execution time. Significant time difference suggests a potential performance > bug in the DBMS. It's important to differentiate between bugs and optimisations that PostgreSQL does not perform. I imagine the findings of running this tool is best directed towards a wiki page on https://wiki.postgresql.org/ It's also important to always go ahead and apply such optimisations without any regard to the cost of checking if the optimisation can apply. For optimisations like this, it's not always just a simple case of attempting to apply them regardless. Checking for such cases will penalise queries where the optimisation cannot be applied. The people that benefit are the ones that write bad SQL and the people that lose out are the ones who quite good SQL. That's not a particularly good incentive to write good SQL. In some cases, the cost of checking if the optimisation can be applied will be so negligible that it's worth it as the gains are good if it can be applied. The answer to whether this the case for this particilar optimisation will depend on who you ask. > We are sharing a pair of TPC-H queries that exhibit a potential performance > bug in this report: > > First query: > SELECT "s_suppkey" > FROM "supplier" > WHERE s_suppkey > 100; > > Second query: > SELECT "s_suppkey" > FROM "supplier" > WHERE s_suppkey > 100 > GROUP BY s_suppkey; > > [Actual Behavior] > We executed both queries on the TPC-H benchmark of scale factor 5: the first > query takes only 17 millisecond, while the second query takes 42 > millisecond. We think the time difference results from different plans > selected. Accounting for the information on the other email that mentions s_suppkey is the primary key of the supplier table, this is not a bug. It's simply an optimisation that we currently don't apply. There's a patch around that aims to implement this but it's not yet been applied. So the optimisation may appear in some future version of PostgreSQL. If you'd like to help with that then please look at https://commitfest.postgresql.org/29/2433/ I'd suggest a wiki page would be a good place to note down other possible future optimisations. There's no shortage of possible optimisations that we don't apply, and an endless stream of bug reports is not the way to have new optimisations added to the PostgreSQL planner. David
В списке pgsql-bugs по дате отправления: