Re: silent data loss with ext4 / all current versions
From | Michael Paquier |
---|---|
Subject | Re: silent data loss with ext4 / all current versions |
Date | |
Msg-id | CAB7nPqQ01Wf9c24dntx-czOWqBwyamBdyhfirqE-tJS=XnSCDA@mail.gmail.com |
In reply to | Re: silent data loss with ext4 / all current versions (Greg Stark <stark@mit.edu>) |
Responses |
Re: silent data loss with ext4 / all current versions
|
List | pgsql-hackers |
On Fri, Jan 22, 2016 at 9:41 PM, Greg Stark <stark@mit.edu> wrote:
> On Fri, Jan 22, 2016 at 8:26 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>> On 01/22/2016 06:45 AM, Michael Paquier wrote:
>>
>>> So, I have been playing with a Linux VM with VMware Fusion and on
>>> ext4 with data=ordered the renames are getting lost if the root
>>> folder is not fsync. By killing-9 the VM I am able to reproduce that
>>> really easily.
>>
>>
>> Yep. Same experience here (with qemu-kvm VMs).
>
> I still think a better approach for this is to run the database on an
> LVM volume and take lots of snapshots. No VM needed, though it doesn't
> hurt. LVM volumes are below the level of the filesystem and a snapshot
> captures the state of the raw blocks the filesystem has written to the
> block layer. The block layer does no caching though the drive may but
> neither the VM solution nor LVM would capture that.
>
> LVM snapshots would have the advantage that you can keep running the
> database and you can take lots of snapshots with relatively little
> overhead. Having dozens or hundreds of snapshots would be unacceptable
> performance drain in production but for testing it should be practical
> and they take relatively little space -- just the blocks changed since
> the snapshot was taken.

Another idea: hardcode a PANIC just after rename() with
restart_after_crash = off (this needs IsBootstrapProcess() checks).
Once the server crashes, kill -9 the VM. Then restart the VM and the
Postgres instance with a new binary that does not have the PANIC, and
see how things go. There is a window of up to several seconds after
the rename() call, so I guess that this would work.
--
Michael
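
For readers following the thread: the failure discussed above is a rename() whose directory entry never reaches disk because the containing directory is not fsync'd. Below is a minimal sketch of the usual POSIX pattern (fsync the file, rename it, then fsync the parent directory). The names fsync_path() and durable_rename_sketch() are hypothetical helpers made up for this illustration, not functions from the PostgreSQL source, and the sketch assumes both paths live in the same directory on a Linux-like system where a directory can be opened read-only and fsync'd.

```c
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: fsync a file or directory given its path.
 * Returns 0 on success, -1 on error. */
static int fsync_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = fsync(fd);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Hypothetical helper: rename oldpath to newpath so that the rename
 * survives a crash. Assumes both paths are in the same directory;
 * otherwise the old parent directory would need an fsync as well. */
int durable_rename_sketch(const char *oldpath, const char *newpath)
{
    char dirbuf[4096];

    /* dirname() may modify its argument, so prepare a private copy. */
    if (strlen(newpath) >= sizeof(dirbuf))
        return -1;
    strcpy(dirbuf, newpath);

    /* Flush the file contents before exposing them under the new name. */
    if (fsync_path(oldpath) != 0)
        return -1;

    if (rename(oldpath, newpath) != 0)
        return -1;

    /* Flush the directory entry; without this step the rename itself
     * can be lost if the machine goes down shortly afterwards. */
    return fsync_path(dirname(dirbuf));
}
```

A caller would write and close a temporary file, then call durable_rename_sketch("file.tmp", "file") and treat a non-zero return as a failure to persist the rename.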