Re: could not extend file "base/5/3501" with FileFallocate(): Interrupted system call
From: Thomas Munro
Subject: Re: could not extend file "base/5/3501" with FileFallocate(): Interrupted system call
Date:
Msg-id: CA+hUKGK5SrF5A1bYsVyHYyF0L1u_PGse4My7wOE5CkscV3tjeA@mail.gmail.com
In response to: Re: could not extend file "base/5/3501" with FileFallocate(): Interrupted system call (Andres Freund <andres@anarazel.de>)
Responses: Re: could not extend file "base/5/3501" with FileFallocate(): Interrupted system call
List: pgsql-hackers
On Tue, Apr 25, 2023 at 12:16 PM Andres Freund <andres@anarazel.de> wrote:
> On 2023-04-24 15:32:25 -0700, Andres Freund wrote:
> > We obviously can add a retry loop to FileFallocate(), similar to what's
> > already present e.g. in FileRead(). But I wonder if we shouldn't go a bit
> > further, and do it for all the fd.c routines where it's remotely plausible
> > EINTR could be returned? It's a bit silly to add EINTR retries one-by-one to
> > the functions.
> >
> > The following are documented to potentially return EINTR, without fd.c having
> > code to retry:
> >
> > - FileWriteback() / pg_flush_data()
> > - FileSync() / pg_fsync()
> > - FileFallocate()
> > - FileTruncate()
> >
> > With the first two there's the added complication that it's not entirely
> > obvious whether it'd be better to handle this in File* or pg_*. I'd argue the
> > latter is a bit more sensible?
>
> A prototype of that approach is attached. I pushed the retry handling into the
> pg_* routines where applicable. I guess we could add pg_* routines for
> FileFallocate(), FilePrewarm() etc as well, but I didn't do that here.

One problem we ran into with the shm_open() case (which is nearly identical
under the covers, since shm_open() just opens a file in a tmpfs on Linux) was
that a simple retry loop like this could never terminate if the process was
receiving a lot of signals from the recovery process, which is why we went
with the idea of masking signals instead. Eventually we should probably grow
the file in smaller chunks with a CFI in between, so that we both guarantee
that we make progress (by masking for smaller size increases) and service
interrupts in a timely fashion (by unmasking between loops). I don't think
that applies here, because we're not trying to fallocate humongous size
increases in one go, but I just want to note that we're making a different
choice. I think this looks reasonable for the use cases we actually have here.