Re: Help refining/eliminating recursive selects
От | Yudie |
---|---|
Тема | Re: Help refining/eliminating recursive selects |
Дата | |
Msg-id | 002501c3f746$cb3fe700$8401a8c0@home.axiontech.com обсуждение исходный текст |
Ответ на | Help refining/eliminating recursive selects (Edmund Bacon <ebacon@onesystem.com>) |
Список | pgsql-sql |
What about something like this: select * from test order by tx_date desc limit 1 Yudie ----- Original Message ----- From: "Edmund Bacon" <ebacon@onesystem.com> To: <pgsql-sql@postgresql.org> Sent: Thursday, February 19, 2004 5:09 PM Subject: [SQL] Help refining/eliminating recursive selects I have the following table: create table test ( id serial primary key, product integer, tx_date date, quantity integer) with the following data:id | product | tx_date | quantity ----+---------+------------+---------- 1 | 1 | 2004-01-01 | 10 2 | 2 | 2004-01-01 | 8 3 | 3 | 2004-01-01 | 7 4 | 4 | 2004-01-01 | 12 5 | 1 | 2004-01-15 | 9 6 | 2 | 2004-01-15| 12 7 | 3 | 2004-01-15 | 8 8 | 5 | 2004-01-07 | 15 what I want to do is to find the most recent record for each product in the table. The only ways I seem to be able to achieve this is by one of the following 1) A self join: SELECT * FROM test JOIN (SELECT product, max(tx_date) AS tx_date FROM test GROUP BY product) x USING(product, tx_date); 2) A correlated subquery: SELECT * FROM test t WHERE tx_date = ( SELECT max(tx_date) FROM test WHERE product = t.product); or 3) a two-part select: SELECT product, max(tx_date) AS tx_date INTO TEMP TABLE t_prod_date FROM test GROUP BY product; SELECT * FROM test JOIN t_prod_date USING(product, tx_date); I can't help but feel like I'm missing something simple that would do what I want and not mean I need to scan the table multiple times. Is there a better way? In trying to answer some questions in advance: The two-part select _IS_ comparitively slow on the above dataset. In my actual situation, I have about 300 possible products and over 20,00 records to sort through. In that case the overhead of creating the temp table is easily overcome by reducing the search space. In my case the two-part select runs in about 2/3 the time the self-join. Note that we are currently talking about .5 and .3 seconds, but the dataset is growing. The correlated-subquery on the large dataset is horribly slow, comparitively speaking, at about 8 seconds. I'm trying various index approaches, (ANALYZING after adding/dropping an index) with no luck, yet. Maybe I just haven't hit on the right combination of fields to index on? If anybody can either (a) point me in a better direction, or (b) confirm my approach,I would greatly appreciate it Thanks very much. -- Edmund Bacon <ebacon@onesystem.com> ---------------------------(end of broadcast)--------------------------- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.orgso that your message can get through to the mailing list cleanly
В списке pgsql-sql по дате отправления: