Creation of an empty table is not fsync'd at checkpoint
From | Heikki Linnakangas |
---|---|
Subject | Creation of an empty table is not fsync'd at checkpoint |
Date | |
Msg-id | d47d8122-415e-425c-d0a2-e0160829702d@iki.fi |
Responses | Re: Creation of an empty table is not fsync'd at checkpoint; Re: Creation of an empty table is not fsync'd at checkpoint |
List | pgsql-hackers |
If you create an empty table, it is not fsync'd. As soon as you insert a row to it, register_dirty_segment() gets called, and after that, the next checkpoint will fsync it. But before that, the creation itself is never fsync'd. That's obviously not great.

The lack of an fsync is a bit hard to prove because it requires a hardware failure, or a simulation of it, and can be affected by filesystem options too. But I was able to demonstrate a problem with these steps:

1. Create a VM with two virtual disks. Use ext4, with 'data=writeback' option (I'm not sure if that's required). Install PostgreSQL on one of the virtual disks.

2. Start the server, and create a tablespace on the other disk:

    CREATE TABLESPACE foospc LOCATION '/data/heikki';

3. Do this:

    CREATE TABLE foo (i int) TABLESPACE foospc;
    CHECKPOINT;

4. Immediately after that, kill the VM. I used:

    killall -9 qemu-system-x86_64

5. Restart the VM, restart PostgreSQL. Now when you try to use the table, you get an error:

    postgres=# select * from crashtest ;
    ERROR: could not open file "pg_tblspc/81921/PG_15_202201271/5/98304": No such file or directory

I was not able to reproduce this without the tablespace on a different virtual disk, I presume because ext4 orders the writes so that the checkpoint implicitly always flushes the creation of the file to disk. I tried data=writeback but it didn't make a difference. But with a separate disk, it happens every time.

I think the simplest fix is to call register_dirty_segment() from mdcreate(). As in the attached.

Thoughts?

- Heikki
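
To make the reasoning above concrete, here is a toy model in C of the behaviour described. It is only a sketch under loud assumptions: it is not PostgreSQL code and not the attached patch. Every name in it (ToyFile, toy_create, toy_register_dirty, toy_write, toy_checkpoint, toy_crash, toy_exists) is hypothetical, and the "disk" is just an array of flags. toy_register_dirty() stands in for register_dirty_segment(), toy_create() for mdcreate(), and the register_on_create flag switches the proposed fix on or off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_FILES 16

/* Hypothetical stand-in for one relation segment file. */
typedef struct {
    bool in_cache;        /* file is visible (at least in the OS cache) */
    bool durable;         /* creation has been fsync'd to stable storage */
    bool sync_requested;  /* queued to be fsync'd at the next checkpoint */
} ToyFile;

static ToyFile files[MAX_FILES];
static size_t nfiles = 0;

/* Toy analogue of register_dirty_segment(): queue the file for fsync. */
void toy_register_dirty(size_t id)
{
    assert(id < nfiles);
    files[id].sync_requested = true;
}

/* Toy analogue of mdcreate(). With register_on_create set, the new
 * (still empty) file is queued for fsync right away, which models the
 * proposed fix; without it, nothing is queued until the first write. */
size_t toy_create(bool register_on_create)
{
    assert(nfiles < MAX_FILES);
    size_t id = nfiles++;
    files[id].in_cache = true;
    files[id].durable = false;
    files[id].sync_requested = false;
    if (register_on_create)
        toy_register_dirty(id);
    return id;
}

/* Toy analogue of inserting a row: writing marks the segment dirty. */
void toy_write(size_t id)
{
    toy_register_dirty(id);
}

/* Toy checkpoint: fsync only the files that were queued. */
void toy_checkpoint(void)
{
    for (size_t i = 0; i < nfiles; i++) {
        if (files[i].sync_requested) {
            files[i].durable = true;
            files[i].sync_requested = false;
        }
    }
}

/* Toy power loss: anything that was never fsync'd disappears. */
void toy_crash(void)
{
    for (size_t i = 0; i < nfiles; i++) {
        if (!files[i].durable)
            files[i].in_cache = false;
    }
}

/* Does the file still exist after whatever has happened so far? */
bool toy_exists(size_t id)
{
    assert(id < nfiles);
    return files[id].in_cache;
}
```

In this model, toy_create(false) followed by toy_checkpoint() and toy_crash() loses the file, which mirrors steps 3 to 5 above. toy_create(true), or a toy_write() before the checkpoint, keeps it. The model deliberately ignores real filesystem details such as write ordering and directory entries.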