Re: pgsql-server/src backend/storage/buffer/bufmgr ...
From | Bruce Momjian |
---|---|
Subject | Re: pgsql-server/src backend/storage/buffer/bufmgr ... |
Date | |
Msg-id | 200401262010.i0QKAg328822@candle.pha.pa.us |
In response to | Re: pgsql-server/src backend/storage/buffer/bufmgr ... (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: pgsql-server/src backend/storage/buffer/bufmgr ... |
List | pgsql-committers |
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> As I've said before, I think we need to find a way to stop using sync()
> >> altogether --- we have to move to fsync or O_SYNC and variants.  sync
> >> has simply got the wrong API.
>
> > If sync fails (kernel to disk write fails) we have a hardware failure,
> > and we don't pretend to recover from that,
>
> Not necessarily --- it could be out-of-disk-space, on at least some
> filesystems.  More to the point, the important thing is not to commit a
> checkpoint record to WAL indicating that everything is good, when
> everything is not good.  As long as we don't checkpoint we have some
> hope of recovering automatically via WAL replay.
>
> > One idea I floated around was to
> > open/write/fsync/close a temporary file after sync in the hope that it
> > would happen after the sync completes because the fsync would be at the
> > end of the disk flush queue.
>
> "In the hope"?  We already have a guess-and-hope approach to this, and
> it will never be any better as long as we use sync(), because sync() is
> fundamentally the wrong operation.  It doesn't tell you when the I/O is
> done, and it doesn't tell you whether the I/O was done successfully, and
> there is no possibility of working around that fundamental lack of
> information except to stop using it.

I guess my major problem with moving away from sync is similar to the
reason we don't do raw devices --- sync is best done in the kernel and
disk driver, which know more about how to do it efficiently.  I haven't
seen any non-sync solution with performance similar to sync().

However, we are going to have to write one for win32, so we can test
things once we are done and then decide.  I think the win32 solution
will be to record modified files in a central location, and have the
checkpoint open and fsync (_commit on win32) each of them, perhaps all
at the same time in different threads so it isn't serialized.
--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
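
The approach floated at the end of the message (record modified files centrally, then have the checkpoint open and fsync each one, possibly in parallel) could look roughly like the sketch below. It is an illustration only, not PostgreSQL code: `DirtyFileRegistry`, `fsync_file`, and `checkpoint` are hypothetical names, and Python is used just to keep the example short. Python's `os.fsync` calls `fsync()` on Unix and `_commit()` on Windows, which matches the "fsync(_commit)" pairing in the message. Unlike `sync()`, each call returns only after the flush has been attempted and raises an error if it failed, so the caller can decline to record the checkpoint.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


class DirtyFileRegistry:
    """Hypothetical central record of files written since the last checkpoint."""

    def __init__(self):
        self._lock = threading.Lock()
        self._dirty = set()

    def mark_dirty(self, path):
        # Called by a writer after it modifies a file.
        with self._lock:
            self._dirty.add(path)

    def take_all(self):
        # Atomically hand the current set to the checkpointer and start a new one.
        with self._lock:
            paths, self._dirty = self._dirty, set()
        return paths

    def restore(self, paths):
        # Put back files whose flush failed so a later checkpoint retries them.
        with self._lock:
            self._dirty.update(paths)


def fsync_file(path):
    # Open read/write: _commit() on Windows needs a writable handle.
    fd = os.open(path, os.O_RDWR)
    try:
        os.fsync(fd)  # raises OSError if the flush fails
    finally:
        os.close(fd)


def checkpoint(registry, max_workers=4):
    """Flush every dirty file; return True only if all flushes succeeded."""
    paths = registry.take_all()
    failed = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Issue the fsyncs concurrently so they are not serialized.
        futures = {pool.submit(fsync_file, p): p for p in paths}
        for future, path in futures.items():
            try:
                future.result()
            except OSError:
                failed.append(path)
    if failed:
        registry.restore(failed)
        return False  # caller must not write the checkpoint record
    return True
```

A caller would write the checkpoint record to WAL only when `checkpoint()` returns `True`. On failure the files stay registered, so nothing claims the data is safely on disk and WAL replay remains available. Whether this performs comparably to `sync()` is exactly the open question raised in the message, and the sketch does not settle it.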