Обсуждение: client-side fsync() error handling

Поиск
Список
Период
Сортировка

client-side fsync() error handling

От
Peter Eisentraut
Дата:
Continuing the discussion from [0], there are still a number of fsync() 
calls in client programs that are unchecked or where errors are treated 
non-fatally.

Digging around through the call stack, I think changing fsync_fname() to 
just call exit(1) on errors instead of proceeding would address most of 
that.

This affects (at least) initdb, pg_basebackup, pg_checksums, pg_dump, 
pg_dumpall, and pg_rewind.

Thoughts?


[0]: 
https://www.postgresql.org/message-id/flat/9b49fe44-8f3e-eca9-5914-29e9e99030bf%402ndquadrant.com

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: client-side fsync() error handling

От
Michael Paquier
Дата:
On Tue, Feb 11, 2020 at 09:22:54AM +0100, Peter Eisentraut wrote:
> Digging around through the call stack, I think changing fsync_fname() to
> just call exit(1) on errors instead of proceeding would address most of
> that.
>
> Thoughts?

Doing things as you do in your patch sounds fine to me for this part.
Now, don't we need to care about durable_rename() and make the
panic-like failure an optional choice?  From what I can see, this
routine is used now in the backend for pg_basebackup to rename
temporary history files or partial WAL segments.
--
Michael

Вложения

Re: client-side fsync() error handling

От
Peter Eisentraut
Дата:
On 2020-02-12 06:28, Michael Paquier wrote:
> Now, don't we need to care about durable_rename() and make the
> panic-like failure an optional choice?  From what I can see, this
> routine is used now in the backend for pg_basebackup to rename
> temporary history files or partial WAL segments.

durable_rename() calls fsync_fname(), so it would be covered by this 
change.  The other file access calls in there can be handled by normal 
error handling, I think.  Is there any specific scenario you have in mind?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: client-side fsync() error handling

От
Michael Paquier
Дата:
On Thu, Feb 13, 2020 at 10:02:31AM +0100, Peter Eisentraut wrote:
> On 2020-02-12 06:28, Michael Paquier wrote:
>> Now, don't we need to care about durable_rename() and make the
>> panic-like failure an optional choice?  From what I can see, this
>> routine is used now in the backend for pg_basebackup to rename
>> temporary history files or partial WAL segments.
>
> durable_rename() calls fsync_fname(), so it would be covered by this change.
> The other file access calls in there can be handled by normal error
> handling, I think.  Is there any specific scenario you have in mind?

The old file flush is handled by your patch, but not the new one if
it exists, and it seems to me that we should handle failures
consistently to reason easier about it, actually as the top of the
function says :)

Another point that we could consider is if fsync_fname() should have
an option to not trigger an immediate exit when facing a failure.  The
backend has that option thanks to fsync_fname_ext() with its elevel
argument.  Your choice to default to a failure is fine for most cases
because that's what we want.  However, I am questioning if this change
would be surprising for some client applications or not, and if we
should have the option to choose one behavior or the other.
--
Michael

Вложения

Re: client-side fsync() error handling

От
Peter Eisentraut
Дата:
On 2020-02-13 12:52, Michael Paquier wrote:
>> durable_rename() calls fsync_fname(), so it would be covered by this change.
>> The other file access calls in there can be handled by normal error
>> handling, I think.  Is there any specific scenario you have in mind?
> 
> The old file flush is handled by your patch, but not the new one if
> it exists, and it seems to me that we should handle failures
> consistently to reason easier about it, actually as the top of the
> function says :)

OK, added in new patch.

> Another point that we could consider is if fsync_fname() should have
> an option to not trigger an immediate exit when facing a failure.  The
> backend has that option thanks to fsync_fname_ext() with its elevel
> argument.  Your choice to default to a failure is fine for most cases
> because that's what we want.  However, I am questioning if this change
> would be surprising for some client applications or not, and if we
> should have the option to choose one behavior or the other.

The option in the backend is between panicking and retrying.  The old 
behavior was to always retry but we have learned that that usually 
doesn't work.

The frontends do neither right now, or at least the error handling is 
very inconsistent and inscrutable.  It would be possible in theory to 
add a retry option, but that would be a very different patch, and given 
what we have learned about fsync(), it probably wouldn't be widely useful.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: client-side fsync() error handling

От
Michael Paquier
Дата:
On Thu, Feb 20, 2020 at 10:10:11AM +0100, Peter Eisentraut wrote:
> OK, added in new patch.

Thanks, that looks good.

> The frontends do neither right now, or at least the error handling is very
> inconsistent and inscrutable.  It would be possible in theory to add a retry
> option, but that would be a very different patch, and given what we have
> learned about fsync(), it probably wouldn't be widely useful.

Perhaps.  Let's have this discussion later if there are drawbacks
about changing things the way your patch does.  If we don't do that,
we'll never know about it either and this patch makes things safer.
--
Michael

Вложения

Re: client-side fsync() error handling

От
Peter Eisentraut
Дата:
On 2020-02-21 05:18, Michael Paquier wrote:
> On Thu, Feb 20, 2020 at 10:10:11AM +0100, Peter Eisentraut wrote:
>> OK, added in new patch.
> 
> Thanks, that looks good.

committed

>> The frontends do neither right now, or at least the error handling is very
>> inconsistent and inscrutable.  It would be possible in theory to add a retry
>> option, but that would be a very different patch, and given what we have
>> learned about fsync(), it probably wouldn't be widely useful.
> 
> Perhaps.  Let's have this discussion later if there are drawbacks
> about changing things the way your patch does.  If we don't do that,
> we'll never know about it either and this patch makes things safer.

yup

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: client-side fsync() error handling

От
Michael Paquier
Дата:
On Mon, Feb 24, 2020 at 05:03:07PM +0100, Peter Eisentraut wrote:
> committed

Thanks!
--
Michael

Вложения