Re: intermittant performance problem
От | Mike Charnoky |
---|---|
Тема | Re: intermittant performance problem |
Дата | |
Msg-id | 49B66576.9030402@nextbus.com обсуждение исходный текст |
Ответ на | Re: intermittant performance problem (Gregory Stark <stark@enterprisedb.com>) |
Список | pgsql-general |
Gregory Stark wrote: > Tom Lane <tgl@sss.pgh.pa.us> writes: > >> Mike Charnoky <noky@nextbus.com> writes: >>> The sampling query which runs really slow on some days looks something >>> like this: >>> INSERT INTO sampled_data >>> (item_name, timestmp, ... ) >>> SELECT item_name, timestmp, ... ) >>> FROM raw_data >>> WHERE timestmp >= ? and timestmp < ? >>> AND item_name=? >>> AND some_data_field NOTNULL >>> ORDER BY random() >>> LIMIT ?; >> Hmph, I'd expect that that would run pretty slowly *all* the time :-(. >> There's no good way to optimize "ORDER BY random()". > > This seems kind of unlikely but does the parameter to the LIMIT vary a lot? If > it's small enough to fit all the chosen records in work_mem then you'll avoid > a disk-sort and do a top-k scan. If it overflows work_mem then it'll fail over > to do a full disk sort of all the records picked from raw_data. > > It does seem much more likely that whatever index you have it using on > timestmp or item_name or some_data_field is sometimes being used and sometimes > not. Perhaps it's switching from an index on one of those columns to an index > on some other column and that's what's throwing it off. The parameter used for the LIMIT does not vary too much. It is typically a couple thousand records that are selected. Judging by the disk IO monitoring we have in place, it does seem like a full disk-sort is being done when the query runs slow. Would you expect this action to totally hose overall database performance? I'm instrumenting the EXPLAIN now, I'll see what this turns up over the course of the week and will check back if I'm still stumped. Thanks. Mike
В списке pgsql-general по дате отправления: