BUG #16625: Query Optimizer - Performance bug related to removal of unnecessary aggregate function
От | PG Bug reporting form |
---|---|
Тема | BUG #16625: Query Optimizer - Performance bug related to removal of unnecessary aggregate function |
Дата | |
Msg-id | 16625-07403f18463add24@postgresql.org обсуждение исходный текст |
Ответы |
Re: BUG #16625: Query Optimizer - Performance bug related to removal of unnecessary aggregate function
Re: BUG #16625: Query Optimizer - Performance bug related to removal of unnecessary aggregate function |
Список | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 16625 Logged by: XINYU LIU Email address: xinyuliu@umich.edu PostgreSQL version: 12.3 Operating system: Ubuntu 19.10 Description: Hello, We are developing a tool for automatically finding performance bugs in PostgreSQL. Our key insight is that given a pair of semantic equivalent queries, a robust DBMS should return the same result within a similar execution time. Significant time difference suggests a potential performance bug in the DBMS. We are sharing a pair of TPC-H queries that exhibit a potential performance bug in this report: First query: SELECT "s_suppkey" FROM "supplier" WHERE s_suppkey > 100; Second query: SELECT "s_suppkey" FROM "supplier" WHERE s_suppkey > 100 GROUP BY s_suppkey; [Actual Behavior] We executed both queries on the TPC-H benchmark of scale factor 5: the first query takes only 17 millisecond, while the second query takes 42 millisecond. We think the time difference results from different plans selected. [Query Execution Plan] First query: QUERY PLAN -------------------------------------------------------------------------------------------------------------- Seq Scan on supplier (cost=0.00..1734.00 rows=49909 width=4) (actual time=0.047..14.898 rows=49900 loops=1) Filter: (s_suppkey > 100) Rows Removed by Filter: 100 Planning Time: 0.639 ms Execution Time: 17.469 ms (5 rows) Second query: QUERY PLAN -------------------------------------------------------------------------------------------------------------------- HashAggregate (cost=1858.77..2357.86 rows=49909 width=4) (actual time=30.093..38.541 rows=49900 loops=1) Group Key: s_suppkey -> Seq Scan on supplier (cost=0.00..1734.00 rows=49909 width=4) (actual time=0.047..13.656 rows=49900 loops=1) Filter: (s_suppkey > 100) Rows Removed by Filter: 100 Planning Time: 0.669 ms Execution Time: 42.270 ms (7 rows) [Expected Behavior] Since these two queries are semantically equivalent, we were hoping that PostgreSQL will return the same results in roughly the same amount of time. [Test Environment] Ubuntu 19.10 PostgreSQL v12.3 Database: TPC-H benchmark (with scale factor 5) [Steps for reproducing our observations] * Download the dataset from the link: https://drive.google.com/file/d/13rFa1BNDi4e2RmXBn-yEQkcqt6lsBu1c/view?usp=sharing * Set up TPC-H benchmark tar xzvf tpch5_postgresql.tar.gz cd tpch5_postgresql db=tpch5 createdb $db psql -d $db < dss.ddl for i in `ls *.tbl` do echo $i name=`echo $i|cut -d'.' -f1` psql -d $db -c "COPY $name FROM '`pwd`/$i' DELIMITER '|' ENCODING 'LATIN1';" done psql -d $db < dss_postgres.ri *Execute the queries
В списке pgsql-bugs по дате отправления: