Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
От | Bharath Rupireddy |
---|---|
Тема | Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work |
Дата | |
Msg-id | CALj2ACXzMf+8-KoREutAA40MrC4sMEY2P1aLFD_RYysC1M2CZQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work (Nathan Bossart <nathandbossart@gmail.com>) |
Ответы |
Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
|
Список | pgsql-hackers |
On Wed, Feb 2, 2022 at 5:25 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Mon, Jan 31, 2022 at 10:42:54AM +0530, Bharath Rupireddy wrote: > > After an off-list discussion with Andreas, proposing here a patch that > > basically replaces ReadDir call with ReadDirExtended and gets rid of > > lstat entirely. With this chance, the checkpoint will only care about > > the snapshot and mapping files and not fail if it finds other files in > > the directories. Removing lstat enables us to make things faster as we > > avoid a bunch of extra system calls - one lstat call per each mapping > > or snapshot file. > > I think removing the lstat() is probably reasonable. We currently aren't > doing proper error checking, and the chances of a non-regular file matching > the prefix are likely pretty low. In the worst case, we'll LOG or ERROR > when unlinking or fsyncing fails. > > However, I'm not sure about the change to ReadDirExtended(). That might be > okay for CheckPointSnapBuild(), which is just trying to remove old files, > but CheckPointLogicalRewriteHeap() is responsible for ensuring that files > are flushed to disk for the checkpoint. If we stop reading the directory > after an error and let the checkpoint continue, isn't it possible that some > mappings files won't be persisted to disk? Unless I mis-read your above statement, with LOG level in ReadDirExtended, I don't think we stop reading the files in CheckPointLogicalRewriteHeap. Am I missing something here? Since, we also continue in CheckPointLogicalRewriteHeap if we can't parse/delete some files with the change of "could not parse filename"/"could not remove file" messages to LOG level I'm attaching v6, just changed elog(LOG, to ereport(LOG in CheckPointLogicalRewriteHeap, other things remain the same. Regards, Bharath Rupireddy.
Вложения
В списке pgsql-hackers по дате отправления: