Possible bug in PostgreSQL query planner
| От | Matthew Wakeling |
|---|---|
| Тема | Possible bug in PostgreSQL query planner |
| Дата | |
| Msg-id | Pine.LNX.4.44.0210041521180.1333-300000@aragorn.flymine.org обсуждение исходный текст |
| Ответы |
Re: Possible bug in PostgreSQL query planner
|
| Список | pgsql-bugs |
I have encountered what seems to be a bad decision made by the query planner, on a quite simple query. I have attached a shell script which creates a large (180MB) sql script, to demonstrate the problem. To reproduce, execute the bash script, and pipe the stdout into psql. Redirect the stdout of psql somewhere, as it isn't very interesting. psql produces several EXPLAIN ANALYSE results on its stderr. I have also attached a copy of the results I get. The test creates three tables: jointest3_1 (keya int, infoa int) jointest3_2 (keya int, keyb int) jointest3_3 (keyb int, infob int) It then fills tables jointest3_1 and jointest3_3 each with 1 million rows, where the key value is sequential (0 to 999999), and the info value is a random zero or positive integer below 10000. It then fills table jointest3_2 with 2 million rows, each with random 6-digit integers in both of its columns. Therefore, table jointest3_2 represents a many-to-many mapping between jointest3_1 and jointest3_3, and the natural join of all three tables should have 2 million rows. The test then creates lots of indexes, clusters on the indexes, and does a vacuum analyse. The test does six EXPLAIN ANALYSE commands. There are actually three distinct commands, each of which is performed twice, to get a before-cached and after-cached performance indication. The first command does a natural join of all three tables, and restricts the results on jointest3_1.infoa. This is performed quickly, as there is an index on infoa which returns approximately 100 rows. The second distinct command also does a natural join of all three tables, but restricts the results by jointest3_3.infob. The query planner appears to make a bad decision, and the query takes a long time to return. The third distinct command does the same query as the second distinct command, but it forces the query planner to join tables jointest3_2 and jointest3_3 together before joining with jointest3_1 by doing a subselect. This performs as quickly as the first distinct command. I would have expected the query planner to find the fast query plan, given there are only three tables being joined together. I am using PostgreSQL version 7.2.1, running on Debian unstable/testing, reasonably up to date. If you need more details, please email back. select version(); gives the following: PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.95.4 PostgreSQL was installed as a Debian package. The machine is a Athlon XP1900, with 512MB of RAM, a 40GB hard drive. Kernel: Linux 2.4.19-686 libc6: 2.2.5-14.1 Matthew Wakeling
В списке pgsql-bugs по дате отправления: