Re: EINTR in ftruncate()
From | Andres Freund |
---|---|
Subject | Re: EINTR in ftruncate() |
Date | |
Msg-id | 20220706203859.qbem3yjlvx2xbegc@alap3.anarazel.de |
In response to | Re: EINTR in ftruncate() (Alvaro Herrera <alvherre@alvh.no-ip.org>) |
Responses | Re: EINTR in ftruncate() |
List | pgsql-hackers |
Hi,

On 2022-07-06 21:29:41 +0200, Alvaro Herrera wrote:
> On 2022-Jul-05, Andres Freund wrote:
>
> > I think we'd be better off disabling at least some signals during
> > dsm_impl_posix_resize(). I'm afraid we'll otherwise just find another
> > variation of these problems. I haven't checked the source of ftruncate, but
> > what Thomas dug up for fallocate makes it pretty clear that our current
> > approach of just retrying again and again isn't good enough. It's a bit more
> > obvious that it's a problem for fallocate, but I don't think it's worth having
> > different solutions for the two.
>
> So what if we move the retry loop one level up? As in the attached.
> Here, if we get EINTR then we retry both syscalls.

Doesn't really seem to address the problem to me. posix_fallocate() takes
some time (~1s for 3GB roughly), so if we signal at a higher rate, we'll just
get stuck.

I hacked a bit on a test program from Thomas, and it's pretty clear that
with a 5ms timer interval you'll pretty much not make progress. It's much
easier to get fallocate() to get interrupted than ftruncate(), but the latter
gets interrupted e.g. when you do a strace in the "wrong" moment (afaics
SIGSTOP/SIGCONT trigger EINTR in situations that are retried otherwise).

So I think we need: 1) block most signals, 2) a retry loop *without*
interrupt checks.

Greetings,

Andres Freund
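
For illustration only, the two-part approach proposed above (block most signals, then retry on EINTR without interrupt checks) could be sketched roughly as follows. This is not the patch from the thread and not PostgreSQL's dsm_impl_posix_resize(): the names resize_with_signals_blocked() and demo_resize_tempfile() are hypothetical, as is the choice of which signals to leave unblocked, and the sketch assumes a single-threaded process on a system that provides posix_fallocate().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: resize fd to "size" bytes and allocate the space,
 * with most signals blocked for the duration.  Returns 0 on success, or -1
 * with errno set.
 */
static int
resize_with_signals_blocked(int fd, off_t size)
{
    sigset_t    block;
    sigset_t    saved;
    int         rc;
    int         save_errno;

    /*
     * 1) Block most signals.  Assumption of this sketch: synchronous fault
     * signals stay deliverable.  SIGKILL and SIGSTOP cannot be blocked, so
     * the retry loops below are still required.
     */
    sigfillset(&block);
    sigdelset(&block, SIGSEGV);
    sigdelset(&block, SIGBUS);
    sigdelset(&block, SIGFPE);
    sigdelset(&block, SIGILL);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;

    /* 2) Retry on EINTR without checking for pending interrupts. */
    do
    {
        rc = ftruncate(fd, size);
    } while (rc != 0 && errno == EINTR);

    if (rc == 0)
    {
        /* posix_fallocate() returns the error number; it does not set errno. */
        do
        {
            rc = posix_fallocate(fd, 0, size);
        } while (rc == EINTR);

        if (rc != 0)
        {
            errno = rc;
            rc = -1;
        }
    }

    /* Restore the caller's signal mask, preserving errno across the call. */
    save_errno = errno;
    sigprocmask(SIG_SETMASK, &saved, NULL);
    errno = save_errno;

    return rc;
}

/*
 * Hypothetical usage: resize an unlinked temporary file and return its
 * resulting size, or -1 on failure.
 */
static off_t
demo_resize_tempfile(off_t size)
{
    char        path[] = "/tmp/resize_demo_XXXXXX";
    struct stat st;
    int         fd = mkstemp(path);

    assert(fd >= 0);
    unlink(path);
    if (resize_with_signals_blocked(fd, size) != 0 || fstat(fd, &st) != 0)
        st.st_size = -1;
    close(fd);
    return st.st_size;
}
```

With signals blocked, a 5ms interval timer can no longer interrupt the long-running posix_fallocate() call, and the remaining EINTR sources are handled by the plain retry loops. A multi-threaded program would use pthread_sigmask() instead of sigprocmask().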