Re: WAL replay should fdatasync() segments?
From | Andres Freund |
---|---|
Subject | Re: WAL replay should fdatasync() segments? |
Date | |
Msg-id | 20140122170828.GB30218@alap3.anarazel.de |
In reply to | Re: WAL replay should fdatasync() segments? (Fujii Masao <masao.fujii@gmail.com>) |
Responses | Re: WAL replay should fdatasync() segments? |
List | pgsql-hackers |
On 2014-01-23 02:05:48 +0900, Fujii Masao wrote:
> On Thu, Jan 23, 2014 at 1:21 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Hi,
> >
> > Currently, XLogInsert(), XLogFlush() or XLogBackgroundFlush() will
> > write() data before fdatasync()ing them (duh, kinda obvious). But I
> > think given the current recovery code that leaves a window where we can
> > get into strange inconsistencies.
> > Consider what happens if postgres (not the OS!) crashes after writing
> > WAL data to the OS, but before fdatasync()ing it. Replay will happily
> > read that record from disk and replay it, which is fine. At the end of
> > recovery we then will start inserting new records, and those will be
> > properly fsynced to disk.
> > But if the *OS* crashes in that moment we might get into the strange
> > situation where older records might be lost since they weren't
> > fsync()ed, but newer records and the control file will persist.
> >
> > I think for a primary that window is relatively small, but I think it's
> > a good bit bigger for a standby, especially if it's promoted.
>
> In normal streaming replication case, ISTM that window is not bigger for
> the standby because basically the standby replays only the WAL data
> which walreceiver fsync'd to the disk. But if it replays the WAL file which
> was fetched from the archive, that WAL file might not have been flushed
> to the disk yet. In this case, that window might become bigger...

Yea, but if the walreceiver receives data and crashes/disconnects before
fsync(), we'll read it from pg_xlog, right? And if we promote, we'll start
inserting new records before establishing a new checkpoint.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
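For illustration only, the idea in the subject line (flush a segment that someone else write()'d before replay relies on it) might look roughly like the sketch below. This is not PostgreSQL code: the function name sync_segment_before_replay() is hypothetical, and the sketch assumes a POSIX system where fdatasync() is available (e.g. Linux).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for this sketch, not a PostgreSQL
 * function): force the data of an already-written WAL segment to stable
 * storage before replay trusts its contents. The segment may have been
 * write()'d by a process that crashed before it got to fdatasync().
 *
 * Returns 0 on success, -1 on failure with errno set by the failing call.
 */
int
sync_segment_before_replay(const char *path)
{
    int fd;

    assert(path != NULL);

    /* Open read-write: some platforms refuse to sync a read-only fd. */
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* Flush file data (and the metadata needed to read it back). */
    if (fdatasync(fd) != 0)
    {
        int saved_errno = errno;

        close(fd);
        errno = saved_errno;    /* report the fdatasync() error, not close()'s */
        return -1;
    }

    return close(fd);
}
```

A caller would invoke this once per segment before reading records from it, e.g. sync_segment_before_replay("pg_xlog/<segment file>"), and treat a -1 return as a reason not to continue past that segment. Whether the extra flush is worth its cost on every replayed segment is exactly the question raised in this thread.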