Row locks, SKIP LOCKED, and transactions

Поиск
Список
Период
Сортировка
От Steven Winfield
Тема Row locks, SKIP LOCKED, and transactions
Дата
Msg-id E9FA92C2921F31408041863B74EE4C2001E7662416@CCPMAILDAG03.cantab.local
обсуждение исходный текст
Ответы Re: Row locks, SKIP LOCKED, and transactions  (Adrian Klaver <adrian.klaver@aklaver.com>)
Re: Row locks, SKIP LOCKED, and transactions  (Thomas Munro <thomas.munro@gmail.com>)
Список pgsql-general
Hi all,

I'm seeing some unexpected behaviour with SELECT ... FOR UPDATE SKIP LOCKED, and having finding it tricky to boil it
downto a simple repro case as there's almost certainly a race condition somewhere (more later). So I thought I would
askif what I'm doing is unsupported (or just plain wrong!), before expending more effort in reproducing it. 

I'm running v11.5, RHEL 7.7.

I have two tables jobs and results:
CREATE TABLE job (id integer PRIMARY KEY);
CREATE TABLE result (id integer PRIMARY KEY);
(obviously the real tables have more columns, but that's not too important here)

Something populates the job table with IDs.
A job is done if its id appears in the result table.
I would like to have multiple worker processes working on jobs.


I thought I could achieve this with each working doing the following:

BEGIN;

SELECT id
FROM job
WHERE NOT EXISTS (SELECT 1 FROM result WHERE result.id = job.id)
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- worker process does some work for the selected ID here

INSERT INTO result (id) VALUES (the_id_from_above);

COMMIT;


However, even with just two worker processes, I quickly found that one worker process would be assigned a job id that
had*very* recently been completed by the other worker. 

Some more potentially useful information:
* The LockRows node of the plan for the SELECT query above doesn't receive any tuples until about a second after the
querybegins executing 
* If worker 2 begins querying for a new job id half a second before worker 1 commits then worker 2 will pick up the job
idthat worker 1 has just finished with. 
* I observe this even if I crank up the transaction isolation level to repeatable read and serializable.


I'm wondering if row locks are not obeying the same transactional semantics as row data, as a potential explanation for
theabove behaviour is as follows (W1/2 = Worker 1/2): 

W1: BEGIN;
W1: SELECT ...
W1: (SELECT returns id=1. W1 now has job(id=1) locked.)
W1: INSERT INTO result (id) VALUES (1)

W2: BEGIN;
W2: SELECT ...

W1: COMMIT; job(id=1) is now unlocked.

W2: (SELECT returns id=1: W1 had not committed when the SELECT started, so result(id=1) is not visible, but LockRows
foundthat job(id=1) was not locked. W2 now has job(id=1) locked.) 


...i.e. W2's SELECT could not see the row that W1 INSERTed (because W2's BEGIN occurs and W2's SELECT begins before
W1'scommit), but W2's SELECT *could* see the removal of W1's row lock.  


Perhaps this is a misuse of the locking system, since I'm locking a row "FOR UPDATE" but not actually updating it, but
asrow locks are released at the end of a transaction (according to the docs) then my expectation was for the unlocking
andthe visibility of newly committed rows to be atomic. 
I've tried FOR NO KEY UPDATE too, without luck.

If I'm doing something forbidden (and the docs say so) then I'd be grateful if someone could point that out!

Best,
Steven.




В списке pgsql-general по дате отправления:

Предыдущее
От: Andrei Zhidenkov
Дата:
Сообщение: Re: Race condition while creating a new partition
Следующее
От: George Neuner
Дата:
Сообщение: Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?