Re: Serialization, Locking...implement processing Queue with a table
От | Tom Lane |
---|---|
Тема | Re: Serialization, Locking...implement processing Queue with a table |
Дата | |
Msg-id | 8777.1052749406@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Serialization, Locking...implement processing Queue with a table ("D. Dante Lorenso" <dante@lorenso.com>) |
Список | pgsql-general |
"D. Dante Lorenso" <dante@lorenso.com> writes: > How should I go about implementing a synchronized process id > queue that will select one unique row from a table at a time > and make the selection be safe with concurrent accesses? You can find various discussions of this in the archives, but a reasonable way to proceed is: 1. The table of pending or in-process jobs has a column "processor_id" that is zero for pending jobs and equal to the (unique) processor number for active jobs. (Assume for the moment that completed jobs are removed from the table entirely.) 2. When idle, you try to reserve a job like so: BEGIN; SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; SELECT job_id, ... FROM job_table WHERE processor_id = 0 LIMIT 1 FOR UPDATE; The SELECT has three possible outcomes: 2a: One row is returned. You do UPDATE job_table SET processor_id = $me WHERE job_id = $jobid; COMMIT; and then go about executing the job. When done, delete the row from job_table and try to get another one. 2b: No row is returned: no jobs are pending. Commit your transaction, sleep for an appropriate delay period, and try again. 2c: You get a "can't serialize" failure. This will happen if two processors try to reserve the same row at the same time. In this case, roll back your transaction, sleep for a short interval (maybe a few msec) and try again. You don't want to sleep as long as normal in this case, since there might be another available job. (Note that you *cannot* do this in a plpgsql function, since it cannot start or commit a transaction; these commands have got to be directly issued by the application.) Assuming that there aren't a vast number of pending jobs at any time, this should work pretty well without even bothering with an index on job_table. You will want to vacuum it often though (at least every few hundred job completions, I'd think). Now, what if you wanted to remember completed jobs? I'd actually recommend transferring the records of completed jobs to a different table. But if you really want to keep them in the same table, maybe add a boolean "completed" field, and make the initial SELECT be SELECT job_id, ... FROM job_table WHERE processor_id = 0 AND NOT completed LIMIT 1 FOR UPDATE; Now you *will* need an index to keep things speedy. I'd try a partial index on processor_id with condition "NOT completed". You'll still need frequent vacuums. regards, tom lane
В списке pgsql-general по дате отправления: