Re: Query optimization question
От | Joe Conway |
---|---|
Тема | Re: Query optimization question |
Дата | |
Msg-id | 3DD99AF8.7000704@joeconway.com обсуждение исходный текст |
Ответ на | Query optimization question ("Daniel S. Myers" <dmyers@pomona.edu>) |
Список | pgsql-general |
Daniel S. Myers wrote: > Hi, > I’m working on a large-scale simulation study in biology, and > I’m using PostgreSQL as the hub of a distributed computing system. > Essentially, I have a single table containing 2.7 million rows, each of > which represents a work unit to be processed (think SETI@Home). Among > other fields, each row in the table contains a unique ID and a status, > which is one of WAIT (not yet processed) OUT (processing) or DONE > (completed). To dispatch a unit to a client, I’d like to pick a row with > status = WAIT, mark it as OUT, and return the values. The problem I have > is that finding a row takes a really long time (~22s on a 2-way PIII-700 > running Linux 2.4.19). My SQL looks like: select min(id) from analyses > where status=’WAIT’. I have indexes on the id field and the status > field, but an explain shows that they’re not being used—I’m assuming > that the optimizer has decided that there’s no point in using the index, > since it doesn’t sufficiently limit the scope of the search. Is there > some optimization that I’m missing (I’ve tried indexes on status/id and > id/status as well as the VACUUM and CLUSTER), or will I have to do > something more than the naïve approach? (We’re actually sampling 2700 > points 1000 times, so I can use replicates_executed counters in each row > and have a separate results table if I have to; it’d just be nice to > keep things simple if possible). > You might try creating a partial index (see the CREATE INDEX page in the documentation -- basically an index built on just part of a table qualified by a where clause) on id where status='WAIT'. I've never actually used one myself, but it seems like it might help in your situation. Also, instead of min(id), try: select id from analyses where status=’WAIT’ order by id limit 1; For reasons discussed before on the list (please search the archives), min() won't use your index, but ORDER BY ... LIMIT can. HTH, Joe
В списке pgsql-general по дате отправления: