Re: PATCH: track last known XLOG segment in control file
From | Andres Freund |
---|---|
Subject | Re: PATCH: track last known XLOG segment in control file |
Date | |
Msg-id | 20151212223948.GS14789@awork2.anarazel.de |
In response to | Re: PATCH: track last known XLOG segment in control file (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses | Re: PATCH: track last known XLOG segment in control file |
List | pgsql-hackers |

On 2015-12-12 23:28:33 +0100, Tomas Vondra wrote:
> On 12/12/2015 11:20 PM, Andres Freund wrote:
> >On 2015-12-12 22:14:13 +0100, Tomas Vondra wrote:
> >>this is the second improvement proposed in the thread [1] about ext4 data
> >>loss issue. It adds another field to control file, tracking the last known
> >>WAL segment. This does not eliminate the data loss, just the silent part of
> >>it when the last segment gets lost (due to forgetting the rename, deleting
> >>it by mistake or whatever). The patch makes sure the cluster refuses to
> >>start if that happens.
> >
> >Uh, that's fairly expensive. In many cases it'll significantly
> >increase the number of fsyncs.
>
> It should do exactly 1 additional fsync per WAL segment. Or do you think
> otherwise?

Which is nearly doubling the number of fsyncs, for a good number of
workloads. And it does so to a separate file, i.e. it's not like
these writes and the flushes can be combined. In workloads where
pg_xlog is on a separate partition it'll add the only source of
fsyncs besides checkpoint to the main data directory.

> > I've a bit of a hard time believing this'll be worthwhile.
>
> The trouble is protections like this only seem worthwhile after the fact,
> when something happens. I think it's reasonable protection against issues
> similar to the one I reported ~2 weeks ago. YMMV.

Meh. That argument can be used to justify about everything.

Obviously we should be more careful about fsyncing files, including the
directories. I do plan to come back to your recent patch.

> > Additionally this doesn't seem to take WAL replay into account?
>
> I think the comparison in StartupXLOG needs to be less strict, to allow
> cases when we actually replay more WAL segments. Is that what you mean?

What I mean is that the value isn't updated during recovery, afaics. You
could argue that minRecoveryPoint is that, in a way.

Greetings,

Andres Freund
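
For readers following the thread, here is a minimal sketch of the idea under discussion, not the patch itself. It assumes a plain-text stand-in for the control file and a simple existence check at startup. The names `record_last_segment`, `check_last_segment` and `demo` are hypothetical, and the actual comparison in StartupXLOG may well differ. The `os.fsync` call on the separate file stands for the one additional flush per WAL segment that the cost argument above is about.

```python
import os
import tempfile


def record_last_segment(control_path, segment_name):
    """Persist the newest WAL segment name (hypothetical control-file field)."""
    with open(control_path, "w") as f:
        f.write(segment_name)
        f.flush()
        # The extra flush per segment switch; it goes to a separate file,
        # so it cannot be combined with the WAL flushes themselves.
        os.fsync(f.fileno())


def check_last_segment(control_path, wal_dir):
    """Return True if the recorded segment is still present in wal_dir."""
    with open(control_path) as f:
        last = f.read().strip()
    return os.path.exists(os.path.join(wal_dir, last))


def demo():
    """Record a segment, then lose it, and report the check before and after."""
    with tempfile.TemporaryDirectory() as d:
        wal_dir = os.path.join(d, "pg_xlog")
        os.mkdir(wal_dir)
        control = os.path.join(d, "control")
        seg = "000000010000000000000002"  # illustrative segment file name

        open(os.path.join(wal_dir, seg), "w").close()
        record_last_segment(control, seg)
        ok_before = check_last_segment(control, wal_dir)

        os.remove(os.path.join(wal_dir, seg))  # simulate the lost segment
        ok_after = check_last_segment(control, wal_dir)
        return ok_before, ok_after
```

Calling `demo()` shows the check passing while the segment exists and failing once it is gone, which is the point at which the cluster would refuse to start. The sketch does not model recovery, where (per the message above) the recorded value is not updated.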