Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
От | Bossart, Nathan |
---|---|
Тема | Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work |
Дата | |
Msg-id | 21F86990-CC52-42B2-8A85-A3DA110F57F1@amazon.com обсуждение исходный текст |
Ответ на | Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work
|
Список | pgsql-hackers |
I took the liberty of adjusting Bharath's patch based on the latest feedback. On 1/19/22, 10:35 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote: > We should be fsync'ing the directory not the files > themselves, no? I added a directory sync at the end of CheckPointLogicalRewriteHeap(), which IIUC is enough. > * Why is the check for "map-" prefix after, rather than before, > the lstat? I swapped these checks. I stopped short of moving the sscanf() before the lstat(), though, as 1) I don't think it will help very much and 2) it seemed weird to start emitting "could not parse filename" logs for non-regular files we presently skip silently. > * Why is it okay to ignore lstat failure? Seems like we might > as well not even have the lstat. I added error checking for lstat(). > * The sscanf on the file name would not notice trailing junk, > such as an editor backup marker. Is that okay? I think this is okay. The absolute worst that would happen would be that the extra file would be deleted. This might eventually become a problem if files with the same prefix format were created by the server. However, CheckPointSnapBuild() already has this problem with temporary files, and it claims not to need any extra handling: * temporary filenames from SnapBuildSerialize() include the LSN and * everything but are postfixed by .$pid.tmp. We can just remove them * the same as other files because there can be none that are * currently being written that are older than cutoff. > (Actually, isn't the > separate check for "map-" useless, given that sscanf will make > the equivalent check?) The only benefit I see from the extra "map-" check is that it'll avoid "could not parse filename" logs for files that clearly aren't related to the task at hand. I don't know if this is expected during normal operation at the moment. I've left the "map-" check for now. > I started out wondering why the patch didn't also change the loop's > other ERROR conditions to LOG. But we do want to ERROR if we're > unable to sync transient state down to disk, and that is what > the other steps (think they) are doing. It might be worth a > comment to point that out though, before someone decides that > if these errors are just LOG level then the others can be too. I added such a comment. I also updated CheckPointSnapBuild() and UpdateLogicalMappings() with similar adjustments. Nathan
Вложения
В списке pgsql-hackers по дате отправления: