Batch processing select
От | Steve Crawford |
---|---|
Тема | Batch processing select |
Дата | |
Msg-id | 200411021530.10027.scrawford@pinpointresearch.com обсуждение исходный текст |
Список | pgsql-general |
I'm still mulling the best way to handle this situation. I have a table that describes work to be processed. This table includes a description of the work as well as priority and scheduling information (certain records can only be handled by certain client processes or at particular times of the day or week). I have several hundred client processes to handle the work, most, but not all, of which can handle any of the items in the database. When a process is free, it needs to return the results to the table (not an issue) but also needs to get new work assigned for processing (problem). I need to select one record from the table so the client program can process it. This record should be the highest priority item that the requesting client is able to process at that particular time. Of course, it can't be a record that has been completed or which is already being handled by another process. Performance is an issue. Each piece of work takes ~20-300 seconds to handle and the overall processing rate is ~10 items/second. The "to-do" table often exceeds 500,000 records. In my earlier attempt I tried "select ... for update where {record needs processing} limit 1; set status flag to in-progress;". Unfortunately for this purpose the second process hitting the DB will block and then return 0 records when the first process completes since the status-flag has changed to in-progress. I've considered "select ... for update where {record needs processing and tuple not locked} limit 1..." but don't know of a function that returns the lock status of a tuple. Any ideas of how I can attack this problem? Cheers, Steve
В списке pgsql-general по дате отправления: