Обсуждение: Question on win32 semaphore simulation
As I reviewed the win32/sema.c, there is some code that I am not clear, can anybody explain please? In semctl(SETVAL): if (semun.val < sem_counts[semNum]) sops.sem_op = -1; else sops.sem_op = 1; /* Quickly lock/unlock the semaphore (if we can) */ if (semop(semId, &sops, 1) < 0) return -1; When semun.val < sem_counts[semNum], it means we want to set the semaphore to semun.val, but because somebody ReleaseSemaphore() for serveral times, so we should wait for this semaphore several times (i.e., sem_counts[semNum] - semun.val) to recover it. When semun.val > sem_counts[semNum], we should ReleaseSemaphore() serveral times to recovery it. That is, should the sem_op assignment logic be: sops.sem_op = semun.val - sem_counts[semNum]; Of course, this would require we add a loop logic in semop(). Regards, Qingqing
"Qingqing Zhou" <zhouqq@cs.toronto.edu> wrote > As I reviewed the win32/sema.c, there is some code that I am not clear, can > anybody explain please? > There is another problem related to concurrent operations on win32 sema. Say two processes are doing semop(+1) concurrently. Look at this code: /* Don't want the lock anymore */ sem_counts[sops[0].sem_num]++; ReleaseSemaphore(cur_handle, sops[0].sem_op, NULL); Except for the problem mentioned in the above thread that the first line should be: sem_counts[sops[0].sem_num] += sops[0].sem_op, the sem_counts[] are unprotected by anything, so we might lose an update. Maybe I totally misunderstand something? Regards, Qingqing
> > As I reviewed the win32/sema.c, there is some code that I am not > > clear, > can > > anybody explain please? > > > > There is another problem related to concurrent operations on > win32 sema. Say two processes are doing semop(+1) > concurrently. Look at this code: > > /* Don't want the lock anymore */ > sem_counts[sops[0].sem_num]++; > ReleaseSemaphore(cur_handle, sops[0].sem_op, NULL); > > Except for the problem mentioned in the above thread that the > first line should be: sem_counts[sops[0].sem_num] += > sops[0].sem_op, the sem_counts[] are unprotected by anything, > so we might lose an update. Maybe I totally misunderstand something? I've never really looked intot eh semaphore stuff, but if sem_counts[] is in shared memory it should definitly be protected. Looking at the code, it looks fairly complex to me. I don't really know how sysv semaphores are supposed to work, or how we use them, but perhaps the whole piece of code can be simplified? //Magnus
"Magnus Hagander" <mha@sollentuna.net> writes: > Looking at the code, it looks fairly complex to me. I don't really know > how sysv semaphores are supposed to work, or how we use them, but > perhaps the whole piece of code can be simplified? I'm not sure why the win32 port chose to emulate the SysV semaphore interface anyway. You could equally well have used the Posix interface (src/backend/port/posix_sema.c). Or, given Microsoft's NIH tendencies, you might have needed to write a third implementation of the pg_sema.h interface ... but it'd likely still be no larger than win32/sema.c ... regards, tom lane
> > Looking at the code, it looks fairly complex to me. I don't really > > know how sysv semaphores are supposed to work, or how we > use them, but > > perhaps the whole piece of code can be simplified? > > I'm not sure why the win32 port chose to emulate the SysV > semaphore interface anyway. You could equally well have used > the Posix interface (src/backend/port/posix_sema.c). Or, > given Microsoft's NIH tendencies, you might have needed to > write a third implementation of the pg_sema.h interface ... > but it'd likely still be no larger than win32/sema.c ... I think that's a leftover from that code coming from a time when we didn't have an abstraction for semaphores. Specifically, it may have come out of the peerdirect port with was IIRC 7.3. Going with the third option might be a good idea - win32 *does* have native semaphores, and most of the work appears to be first adapting our need to sysv, then adapting sysv to win32. Worth looking at I guess. //Magnus
""Magnus Hagander"" <mha@sollentuna.net> wrote > > > > I'm not sure why the win32 port chose to emulate the SysV > > semaphore interface anyway. You could equally well have used > > the Posix interface (src/backend/port/posix_sema.c). Or, > > given Microsoft's NIH tendencies, you might have needed to > > write a third implementation of the pg_sema.h interface ... > > but it'd likely still be no larger than win32/sema.c ... > > Going with the third option might be a good idea - win32 *does* have > native semaphores, and most of the work appears to be first adapting our > need to sysv, then adapting sysv to win32. Worth looking at I guess. > Emulating the posix interface has almost the same difficulty as SysV for two reasons: (1) we don't have to emulate everything of SysV. For example, the "nops" parameter of semop() since we don't use it; (2) the killer function is PGSemaphoreReset(). There is no direct function for this in Win32 either. So we might want to fix current win32/sema.c for two problems: (1) semctl(SETVAL, val=0) - there is no other "val" than zero is used; (2) concurrent access to sem_counts[]; For problem 1, we can do it by locking the whole session of semaphore operation (which is the saftest way, but it is performance bad enough), or we could just do it as the posix_sema.c/PGSemaphoreReset does (which is the easiest way - just get rid of the EAGAIN report if we can't get it). But both of them to me, too bad - because I don't really understand why there is a semctl(SETVAL, val) in SysV anyway - IMHO this does not make any sense - even if you SETVAL to some value, you are not guranteed that you will stll get this value just after the function return. The UnlockBuffers()/UnpinBuffer() fix several days ago have proved this. In short, if we can get rid of this usage, both POSIX and Win32 will be happy to see it. Regards, Qingqing
"Qingqing Zhou" <zhouqq@cs.toronto.edu> wrote > > So we might want to fix current win32/sema.c for two problems: > (1) semctl(SETVAL, val=0) - there is no other "val" than zero is used; > (2) concurrent access to sem_counts[]; > Attached is a patch for the above proposed change -- but still, I hope we don't need semctl(SETVAL) at all. Regards, Qingqing --------- Index: sema.c =================================================================== RCS file: /projects/cvsroot/pgsql/src/backend/port/win32/sema.c,v retrieving revision 1.13 diff -c -r1.13 sema.c *** sema.c 9 Apr 2006 19:21:34 -0000 1.13 --- sema.c 19 Apr 2006 03:56:55 -0000 *************** *** 42,48 **** /* Fix the count of all sem of the pool to semun.array */ if (flag == SETALL) { ! int i; struct sembuf sops; sops.sem_flg = IPC_NOWAIT; --- 42,49 ---- /* Fix the count of all sem of the pool to semun.array */ if (flag == SETALL) { ! int i; ! int errStatus; struct sembuf sops; sops.sem_flg = IPC_NOWAIT; *************** *** 60,66 **** sops.sem_num = i; /* Quickly lock/unlock the semaphore (if we can) */ ! if (semop(semId, &sops, 1) < 0) return -1; } return 1; --- 61,72 ---- sops.sem_num = i; /* Quickly lock/unlock the semaphore (if we can) */ ! do ! { ! errStatus = semop(semId, &sops, 1); ! } while (errStatus < 0 && errno == EINTR); ! ! if (errStatus < 0) return -1; } return 1; *************** *** 72,88 **** if (semun.val != sem_counts[semNum]) { struct sembuf sops; sops.sem_flg = IPC_NOWAIT; sops.sem_num = semNum; ! ! if (semun.val < sem_counts[semNum]) ! sops.sem_op = -1; ! else ! sops.sem_op = 1; /* Quickly lock/unlock the semaphore (if we can) */ ! if (semop(semId, &sops, 1) < 0) return -1; } --- 78,96 ---- if (semun.val != sem_counts[semNum]) { struct sembuf sops; + int errStatus; sops.sem_flg = IPC_NOWAIT; sops.sem_num = semNum; ! sops.sem_op = semun.val - sem_counts[semNum]; /* Quickly lock/unlock the semaphore (if we can) */ ! do ! { ! errStatus = semop(semId, &sops, 1); ! } while (errStatus < 0 && errno == EINTR); ! ! if (errStatus < 0) return -1; } *************** *** 226,269 **** cur_handle = sem_handles[sops[0].sem_num]; ! if (sops[0].sem_op == -1) { DWORD ret; HANDLE wh[2]; wh[0] = cur_handle; wh[1] = pgwin32_signal_event; ! ret = WaitForMultipleObjectsEx(2, wh, FALSE, (sops[0].sem_flg & IPC_NOWAIT) ? 0 : INFINITE, TRUE); ! ! if (ret == WAIT_OBJECT_0) { ! /* We got it! */ ! sem_counts[sops[0].sem_num]--; ! return 0; } ! else if (ret == WAIT_OBJECT_0 + 1 || ret == WAIT_IO_COMPLETION) ! { ! /* Signal event is set - we have a signal to deliver */ ! pgwin32_dispatch_queued_signals(); ! errno = EINTR; ! } ! else if (ret == WAIT_TIMEOUT) ! /* Couldn't get it */ ! errno = EAGAIN; ! else ! errno = EIDRM; } ! else if (sops[0].sem_op > 0) { /* Don't want the lock anymore */ ! sem_counts[sops[0].sem_num]++; ReleaseSemaphore(cur_handle, sops[0].sem_op, NULL); return 0; } - else - /* Not supported */ - errno = ERANGE; /* If we get down here, then something is wrong */ return -1; --- 234,295 ---- cur_handle = sem_handles[sops[0].sem_num]; ! if (sops[0].sem_op < 0) { DWORD ret; HANDLE wh[2]; + int i; wh[0] = cur_handle; wh[1] = pgwin32_signal_event; ! /* ! * Try to consume the specified sem count. If we can't, we just ! * quit the operation silently because it is possible there is ! * another process just did some semop(-k) during our loop. ! */ ! errno = 0; ! for (i = 0; i < -(sops[0].sem_op); i++) { ! ret = WaitForMultipleObjectsEx(2, wh, FALSE, ! (sops[0].sem_flg & IPC_NOWAIT) ? 0 : INFINITE, TRUE); ! ! if (ret == WAIT_OBJECT_0) ! { ! /* We got it! */ ! InterlockedDecrement((volatile long *)(sem_counts + sops[0].sem_num)); ! } ! else if (ret == WAIT_OBJECT_0 + 1 || ret == WAIT_IO_COMPLETION) ! { ! /* Signal event is set - we have a signal to deliver */ ! pgwin32_dispatch_queued_signals(); ! errno = EINTR; ! } ! else if (ret == WAIT_TIMEOUT) ! /* Couldn't get it */ ! break; ! else ! errno = EIDRM; ! ! /* return immediately on error */ ! if (errno != 0) ! break; } ! ! /* successfully done */ ! if (errno == 0) ! return 0; } ! else { + Assert(sops[0].sem_op > 0); + /* Don't want the lock anymore */ ! InterlockedExchangeAdd((volatile long *) ! (sem_counts + sops[0].sem_num), sops[0].sem_op); ReleaseSemaphore(cur_handle, sops[0].sem_op, NULL); return 0; } /* If we get down here, then something is wrong */ return -1;
"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes: > (2) the killer function is PGSemaphoreReset(). There is no direct function > for this in Win32 either. If you can do PGSemaphoreTryLock, then Reset need only be a loop around it (cf. posix_sema.c). In current usage Reset doesn't have to be very efficient at all, because it's only used during backend startup to bring the semaphore to a known state. > (1) semctl(SETVAL, val=0) - there is no other "val" than zero is used; Really? Better look again. If you think the SysV interface is baroque (which I don't disagree with), then you should just get rid of it entirely and implement pg_sema.h directly atop the Windows primitives. I don't have a lot of sympathy for "let's implement just part of SysV because I don't like that other part". There is no contract saying that sysv_sema.c might not start using SysV features it doesn't use today. regards, tom lane
> > (2) the killer function is PGSemaphoreReset(). There is no direct > > function for this in Win32 either. > > If you can do PGSemaphoreTryLock, then Reset need only be a > loop around it (cf. posix_sema.c). In current usage Reset > doesn't have to be very efficient at all, because it's only > used during backend startup to bring the semaphore to a known state. > > > (1) semctl(SETVAL, val=0) - there is no other "val" than > zero is used; > > Really? Better look again. > > If you think the SysV interface is baroque (which I don't > disagree with), then you should just get rid of it entirely > and implement pg_sema.h directly atop the Windows primitives. > I don't have a lot of sympathy for "let's implement just > part of SysV because I don't like that other part". There is > no contract saying that sysv_sema.c might not start using > SysV features it doesn't use today. That's what I was thinking when I said "option 3". It shouldn't be *too* hard, and much cleaner. //Magnus