Re: Monitoring time of fsyncing WALs
От | Michael Paquier |
---|---|
Тема | Re: Monitoring time of fsyncing WALs |
Дата | |
Msg-id | 20180701032927.GA2109@paquier.xyz обсуждение исходный текст |
Ответ на | Re: Monitoring time of fsyncing WALs (Michael Paquier <michael@paquier.xyz>) |
Ответы |
Re: Monitoring time of fsyncing WALs
|
Список | pgsql-hackers |
On Fri, Jun 29, 2018 at 10:39:00PM +0900, Michael Paquier wrote: > On Fri, Jun 29, 2018 at 08:48:33AM -0400, Robert Haas wrote: >> Are there other instances of fsync() that also need to be covered? > > Yeah, I double-checked the calls to pg_fsync back when I looked at this > patch, but now that I look again I see at least two more of them: > - fsync_fname_ext. This one is used internally by things like durable_rename or such, on which I am not sure that we actually can pass down correctly a wait event as multiple syncs may happen within each call, so that would require a weird API layer. > - write_auto_conf_file, where a wait event for a write() is missing as > well. I have been looking at the archives and this has been left out on purpose: https://postgr.es/m/CAGPqQf0bzVfTTZdxcr4qHb3WDbn=+iH6sWchbN4HGjhwtcbPYQ@mail.gmail.com > Hm. I am wondering if it would not be worth extending pg_fsync() with > an argument for a wait event and introduce a sort of pg_write() which > wraps write() with an extra wait event argument, and similarly for > read() (warning, conflict with Windows ahead!). That's somehow related > to the feeling I had when working with transient file's writes and reads > a couple of days back. Those are most likely going to be forgotten > again and again in the future. In both cases the caller would still be > responsible for looking at errno and decide the error handling, but I > got no feedback about the idea on transient files. That would finish by being sort of ugly as durable_rename or such would also need some treatment. That would be quite invasive. > There are also a couple of places where the wait events are longer than > they should. For example in writeTimeLineHistory, there is a wait event > for write() which is still switched on even within an error code path. > And on top of it I think that a call to pgstat_report_wait_end() is > missing in the error code path as on ERROR the session still exists. > That's a bug. Those need an extra lookup and visibly some cleanup back > to v10. On ERROR a backend would switch back quickly on ClientRead. Perhaps we'd want to call pgstat_report_wait_end() when an error or higher is thrown? That's a different discussion. So at the end, I would like to use the proposed patch and call it a day. Thoughts? -- Michael
Вложения
В списке pgsql-hackers по дате отправления: