Re:
От | greg@turnstep.com |
---|---|
Тема | Re: |
Дата | |
Msg-id | 5658f4d18a1d78dc3b288c297a0c45b2@biglumber.com обсуждение исходный текст |
Ответ на | Re: ("Scott Morrison" <smorrison@navtechinc.com>) |
Список | pgsql-novice |
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > Well I ran the queries through some more tests and the original query is > still faster than any of the suggested queries. I finally have it figured out - it all depends on the distribution of the data. I was assuming that there were relatively few unique ids in each of the 200,000 rows. When this is true, my query runs fast. When there are few duplicate ids, and the number approaches the total number of rows, the original query (and Tom's rewrite) works much better. I was getting the opposite results as you: my query was fast, and yours would basically time out. When I rearranged the data to provide for more unique ids, this was reversed. Ohwell :) One minor tweak's to Tom's final query: EXPLAIN ANALYZE SELECT * FROM sample a WHERE date = (SELECT date FROM sample WHERE id=a.id AND date<='2003-02-07' ORDER BY id DESC, date DESC LIMIT 1); If you know that you are grabbing a certain percentage of the data, you can add another WHERE clause to speed things up: EXPLAIN ANALYZE SELECT * FROM sample a WHERE date <='2003-02-07' AND date = (SELECT date FROM sample WHERE id=a.id AND date<='2003-02-07' ORDER BY id DESC, date DESC LIMIT 1); This hits the sample_date index instead of doing a sequential scan of sample: and is probably faster for more cases. In my tests the two even out only when we start grabbing most of the rows: all dates earlier than 02-25, with a even sitribution of dates throughout 2/2003. - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200302111237 -----BEGIN PGP SIGNATURE----- Comment: http://www.turnstep.com/pgp.html iD8DBQE+STTuvJuQZxSWSsgRAgcyAJ4zuW6G/j2cvvjaPynKRyV7rsih6ACeNpkl UXerayY3r02qttNs6tTUMiw= =dlwt -----END PGP SIGNATURE-----
В списке pgsql-novice по дате отправления: