Re: Streaming I/O, vectored I/O (WIP)
From | Thomas Munro |
---|---|
Subject | Re: Streaming I/O, vectored I/O (WIP) |
Date | |
Msg-id | CA+hUKGJywvC5J+rcLs3+whK8YPBnKXVLn5o0JFJbNeb4CEdSFA@mail.gmail.com |
In response to | Re: Streaming I/O, vectored I/O (WIP) (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses | Re: Streaming I/O, vectored I/O (WIP) |
List | pgsql-hackers |
On Wed, Nov 29, 2023 at 1:44 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> LGTM. I think this 0001 patch is ready for commit, independently of the
> rest of the patches.

Done.

> In v2-0002-Provide-vectored-variants-of-FileRead-and-FileWri-1.patch, fd.h:
>
> > +/* Filename components */
> > +#define PG_TEMP_FILES_DIR "pgsql_tmp"
> > +#define PG_TEMP_FILE_PREFIX "pgsql_tmp"
> > +
>
> These seem out of place, we already have them in common/file_utils.h.

Yeah, they moved from there in f39b2658 and I messed up the rebase. Fixed.

> Other than that,
> v2-0002-Provide-vectored-variants-of-FileRead-and-FileWri-1.patch and
> v2-0003-Provide-vectored-variants-of-smgrread-and-smgrwri.patch look
> good to me.

One thing I wasn't 100% happy with was the treatment of ENOSPC. A few callers believe that short writes set errno: they error out with a message including %m. We have historically set errno = ENOSPC inside FileWrite() if the write size was unexpectedly small AND the kernel didn't set errno to a non-zero value (having set it to zero ourselves earlier). In FileWriteV(), I didn't want to do that because it is expensive to compute the total write size from the vector array and we managed to measure an effect due to that in some workloads.

Note that the smgr patch actually handles short writes by continuing, instead of raising an error. Short writes do already occur in the wild on various systems for various rare technical reasons other than ENOSPC I have heard (imagine transient failure to acquire some temporary memory that the kernel chooses not to wait for, stuff like that, though certainly many people and programs believe they should not happen[1]), and it seems like a good idea to actually handle them as our write sizes increase and the probability of short writes might presumably increase.
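To make "handles short writes by continuing" concrete, here is a minimal sketch of a retry loop over a vector array. It is not the code from the smgr patch: iov_advance() and write_all_v() are hypothetical names, plain POSIX writev() stands in for the positioned write the real code would use, and treating a zero-byte result with no progress as ENOSPC is an assumption made only so the loop cannot spin forever.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: account for 'done' bytes already written.  Fully
 * written entries are skipped, a partially written entry is trimmed in
 * place.  Returns the number of leading entries that are now complete.
 */
static int
iov_advance(struct iovec *iov, int iovcnt, size_t done)
{
	int			i = 0;

	while (i < iovcnt && done >= iov[i].iov_len)
	{
		done -= iov[i].iov_len;
		i++;
	}
	if (i < iovcnt)
	{
		iov[i].iov_base = (char *) iov[i].iov_base + done;
		iov[i].iov_len -= done;
	}
	return i;
}

/*
 * Hypothetical wrapper: write the whole vector, continuing after a short
 * write instead of reporting it as an error.  Modifies the iovec array.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int
write_all_v(int fd, struct iovec *iov, int iovcnt)
{
	assert(iovcnt >= 0);

	while (iovcnt > 0)
	{
		ssize_t		n = writev(fd, iov, iovcnt);
		int			skip;

		if (n < 0)
		{
			if (errno == EINTR)
				continue;		/* interrupted before writing anything */
			return -1;			/* real error, errno set by the kernel */
		}

		skip = iov_advance(iov, iovcnt, (size_t) n);
		if (n == 0 && skip == 0)
		{
			/* Assumption: no progress at all is reported as ENOSPC. */
			errno = ENOSPC;
			return -1;
		}
		iov += skip;
		iovcnt -= skip;
	}
	return 0;
}
```

The point of the sketch is only that the total size never has to be summed up front: each iteration subtracts what the kernel reports as written and tries again with whatever is left.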
With the previous version of the patch, we'd have to change a couple of other callers not to believe that short writes are errors and set errno (callers are inconsistent on this point). I don't really love that we have "fake" system errors but I also want to stay focused here, so in this new version V3 I tried a new approach: I realised I can just always set errno without needing the total size, so that (undocumented) aspect of the interface doesn't change. The point being that it doesn't matter if you clobber errno with a bogus value when the write was non-short.

Thoughts?

[1] https://utcc.utoronto.ca/~cks/space/blog/unix/WritesNotShortOften
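For readers following along, the "always set errno" idea can be sketched roughly as below. This is an illustration under assumptions, not the V3 patch itself: write_v_set_errno() and write_or_fail() are hypothetical names, and POSIX writev() stands in for the positioned vectored write used in fd.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical stand-in for a vectored write wrapper.  After any write
 * that did not fail outright, errno is set to ENOSPC unconditionally, so
 * the total size of the vector never has to be computed here.
 */
static ssize_t
write_v_set_errno(int fd, const struct iovec *iov, int iovcnt)
{
	ssize_t		n;

	assert(iovcnt > 0);
	n = writev(fd, iov, iovcnt);

	/*
	 * If the write was complete, the bogus errno is harmless because no
	 * caller looks at it.  If it was short, a caller that reports "%m"
	 * gets the traditional "no space left on device" message.
	 */
	if (n >= 0)
		errno = ENOSPC;
	return n;
}

/*
 * Hypothetical caller of the kind described above: it already knows how
 * many bytes it expected, and treats anything else as an error that can
 * be reported from errno.
 */
static int
write_or_fail(int fd, const struct iovec *iov, int iovcnt, size_t expected)
{
	ssize_t		n = write_v_set_errno(fd, iov, iovcnt);

	if (n < 0 || (size_t) n != expected)
		return -1;				/* errno: kernel's error, or ENOSPC */
	return 0;
}
```

Callers that prefer to continue after a short write can simply ignore errno and retry with the remainder; only callers that treat a short write as fatal ever observe the ENOSPC value.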