Select Last n Rows Matching an Index Condition (and caches)
От | Alex Stapleton |
---|---|
Тема | Select Last n Rows Matching an Index Condition (and caches) |
Дата | |
Msg-id | CMEKJGNLDMNLHLKAEDDEKELLHHAA.alexs@advfn.com обсуждение исходный текст |
Ответы |
Re: Select Last n Rows Matching an Index Condition (and
Re: Select Last n Rows Matching an Index Condition (and caches) Re: Select Last n Rows Matching an Index Condition (and caches) |
Список | pgsql-general |
We have a ~10million row table but are expecting it to get larger, possibly by a factor of 10 or more. The rows are quite long and fixed length (just over 500 bytes.) We have an index of (symbol, source, date) on this table and doing queries like this SELECT * FROM article WHERE symbol=12646 AND source = 19 ORDER BY time DESC LIMIT 1000; To get the latest 1000 rows for that symbol and source. However this takes quite a while at the best of times, (1-10 seconds.) The query without the order by and the limit tends to return about 70000 rows which adds up to about 30MB of data. Once the pages are in the cache they take around 100ms but this is to be expected. Unfortunately the initial query required to cache it is unnacceptably long for web application like ours. My (not yet implemented) solution to this problem is to add a SEQUENCE and index it so that by adding a WHERE id > [max_id]-1000 and ordering by time DESC will reduce the I/O quite a lot. Am I right here? It would be nice if there was a way to get PostgreSQL to try and precache the tables pages as well, is there anyway I could achieve something like that? I have toyed with creating a ramdisk to store a lookup table of sorts on (we only care about a few columns initially) to speed this up a bit but its a right pain in the arse to do by the looks of things.
В списке pgsql-general по дате отправления: