Re: logical replication: could not create file "state.tmp": File exists
From | Andres Freund |
---|---|
Subject | Re: logical replication: could not create file "state.tmp": File exists |
Date | |
Msg-id | 20191202161222.sazl2omhhk5pl3nl@alap3.anarazel.de |
In reply to | logical replication: could not create file "state.tmp": File exists (Grigory Smolkin <g.smolkin@postgrespro.ru>) |
Responses | Re: logical replication: could not create file "state.tmp": File exists (three replies listed under this subject) |
List | pgsql-bugs |
Hi,

On 2019-11-30 15:09:39 +0300, Grigory Smolkin wrote:
> One of my colleagues encountered an out of space condition, which broke his
> logical replication setup.
> It's manifested with the following errors:
>
> ERROR: could not receive data from WAL stream: ERROR: could not create
> file "pg_replslot/some_sub/state.tmp": File exists

Hm. What was the log output leading to this state? Some cases of this
would end up in a PANIC, which'd remove the .tmp file during
recovery. But there's some where we won't - it seems the right fix for
this would be to unlink the tmp file in that case?

> I've digged a bit into this problem, and it's turned out that in
> SaveSlotToPath() temp file for replication slot is opened with 'O_CREAT |
> O_EXCL' flags, which makes this routine as not very reentrant.
>
> Since an exclusive lock is taken before temp file creation, I think it
> should be safe to replace O_EXCL with O_TRUNC.

I'm very doubtful about this. I think it's a good safety measure to
ensure that there's no previous state file that we're somehow
overwriting.

> Script to reproduce and patch are attached.

Well:

> # Imitate out_of_space/write_operation_error
> touch ${PGDATA_PUB}/pg_replslot/mysub/state.tmp

Doesn't really replicate how we got into this state...

Greetings,

Andres Freund
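To make the two options in this exchange concrete, here is a minimal standalone sketch of the approach suggested in the reply: keep `O_CREAT | O_EXCL` as a guard against overwriting an unexpected file, and unlink the temp file on the error paths that come after it was created. This is not PostgreSQL's `SaveSlotToPath()`. The function name `save_state`, its signature, and the error handling are assumptions made for illustration, and the sketch omits locking, directory fsync, and PostgreSQL's error reporting.

```c
#define _POSIX_C_SOURCE 200809L /* expose POSIX declarations such as fsync() */

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): write len bytes of data to
 * "<dir>/state.tmp", then rename it to "<dir>/state".
 * Returns 0 on success, -1 on failure with errno set.
 */
static int
save_state(const char *dir, const void *data, size_t len)
{
    char        tmppath[1024];
    char        path[1024];
    ssize_t     written;
    int         fd;
    int         save_errno;

    assert(dir != NULL && data != NULL);

    snprintf(tmppath, sizeof(tmppath), "%s/state.tmp", dir);
    snprintf(path, sizeof(path), "%s/state", dir);

    /* O_EXCL: fail with EEXIST instead of overwriting an existing file. */
    fd = open(tmppath, O_CREAT | O_EXCL | O_WRONLY, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;              /* we did not create the file, so leave it */

    errno = 0;
    written = write(fd, data, len);
    if (written < 0 || (size_t) written != len)
    {
        /* A short write without errno is treated as out of space. */
        if (errno == 0)
            errno = ENOSPC;
        goto fail;
    }
    if (fsync(fd) != 0)
        goto fail;
    if (close(fd) != 0)
    {
        fd = -1;                /* descriptor is gone even if close() failed */
        goto fail;
    }
    fd = -1;
    if (rename(tmppath, path) != 0)
        goto fail;
    return 0;

fail:
    /* From here on the temp file is ours: remove it so a retry can succeed. */
    save_errno = errno;
    if (fd >= 0)
        close(fd);
    unlink(tmppath);
    errno = save_errno;
    return -1;
}
```

With this shape, a failed write or rename does not leave `state.tmp` behind, so the next call is not blocked by `EEXIST`. A `state.tmp` that already exists before the call (for example one created with `touch`, as in the quoted script) still makes `open()` fail with `EEXIST`, which is the safety property the reply wants to keep. Replacing `O_EXCL` with `O_TRUNC` would make that case succeed silently instead.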