Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows
From | Robert Haas |
---|---|
Subject | Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows |
Date | |
Msg-id | CA+Tgmoa1CQ5cbUoOK9dfnofCSbu6NmBaYptJ57bTe41MECQ-+Q@mail.gmail.com |
In response to | Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows (Thomas Munro <thomas.munro@gmail.com>) |
Responses | Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows |
List | pgsql-bugs |

On Fri, Oct 20, 2023 at 1:17 AM Thomas Munro <thomas.munro@gmail.com> wrote:

> > I take that back: there is a palloc() under RelationTruncate(). DNLGTM.
>
> Ooops, I should have quoted to the 0002 patch there.
>
> Hmm. We could teach md.c to do all its path manipulation in MAXPGPATH-sized output buffers but that still wouldn't be enough because it might decide to allocate more space for the array of segments, and I'm not even sure where in PostgreSQL we guarantee that everything that could appear here would fit in that. But that gives me an idea. It feels like a bit of a dirty hack, which could perhaps be made less dirty with some kind of function that includes the word 'ensure' in its name, but I think we can make md.c promise not to allocate anything before the next CFI with something like this:

Ugh.

AFAICS the problems are confined to smgrtruncate() and within that to the callback to smgr_truncate. As far as the rest of smgrtruncate() is concerned, it seems like DropRelationBuffers() and CacheInvalidateSmgr() are safe enough in a critical section, and the other code directly inside RelationTruncate() looks OK, too.

But within mdtruncate(), we've got more than one problem, I think. mdnblocks() is a problem because of the reason that you mention, but register_dirty_segment() doesn't look totally safe either, because it can call RegisterSyncRequest() which, in a standalone backend, can call RememberSyncRequest().

In general, it seems like it would be a lot nicer if we were doing a lot less stuff inside the critical section here. So I think you're right that we need some refactoring. Maybe smgr_prepare_truncate() and smgr_execute_truncate() or something like that. I wonder if we could actually register the dirty segment in the "prepare" phase - is it bad if we register a dirty segment before actually dirtying it? And maybe even CacheInvalidateSmgr() could be done at that stage?

It seems pretty clear that dropping the dirty buffers and actually truncating the relation on disk need to happen after we've entered the critical section, because if we fail after doing the former, we've thrown away dirty data in anticipation of performing an operation that didn't happen, and if we fail when attempting the latter, primaries and standbys diverge and the originally-reported bug on this thread happens. But we'd like to move as much other stuff as we can out of that critical section.

--
Robert Haas
EDB: http://www.enterprisedb.com
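
To make the prepare/execute split discussed above a bit more concrete, here is a toy, standalone C model of the idea. It is not PostgreSQL code and not a proposed patch: `ToyRelation`, `ToyTruncatePlan`, `toy_alloc()`, `toy_prepare_truncate()`, `toy_execute_truncate()` and `toy_truncate()` are all made up for illustration, and `crit_section_count` is only a stand-in for the real critical-section machinery. The assumption being illustrated is simply that anything that can allocate or fail runs in the "prepare" step, so the "execute" step inside the critical section only touches state that is already in hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy model only: every name below is hypothetical, not a PostgreSQL API. */

static int crit_section_count = 0;  /* stand-in for a critical-section depth */

/* Stand-in for palloc(): allocating inside a critical section is forbidden. */
static void *toy_alloc(size_t size)
{
    assert(crit_section_count == 0);
    return malloc(size);
}

typedef struct ToyRelation
{
    int nblocks;        /* current "on-disk" length in blocks */
    int dirty_buffers;  /* dirty buffers past the truncation point */
} ToyRelation;

typedef struct ToyTruncatePlan
{
    ToyRelation *rel;
    int          new_nblocks;
} ToyTruncatePlan;

/*
 * Prepare phase: validate and allocate. Failing here is harmless because
 * nothing has been thrown away or truncated yet. In the real code this is
 * where any bookkeeping memory would have to be obtained.
 */
static ToyTruncatePlan *toy_prepare_truncate(ToyRelation *rel, int new_nblocks)
{
    if (new_nblocks < 0 || new_nblocks > rel->nblocks)
        return NULL;

    ToyTruncatePlan *plan = toy_alloc(sizeof(*plan));
    if (plan == NULL)
        return NULL;

    plan->rel = rel;
    plan->new_nblocks = new_nblocks;
    return plan;
}

/*
 * Execute phase: only the two steps that must not be separated by a
 * recoverable failure. No allocation happens between ++ and --.
 */
static void toy_execute_truncate(ToyTruncatePlan *plan)
{
    crit_section_count++;                    /* enter critical section */
    plan->rel->dirty_buffers = 0;            /* drop buffers past new end */
    plan->rel->nblocks = plan->new_nblocks;  /* shorten the "file" */
    crit_section_count--;                    /* leave critical section */

    free(plan);
}

/* Returns false if the prepare phase failed; the relation is then untouched. */
static bool toy_truncate(ToyRelation *rel, int new_nblocks)
{
    ToyTruncatePlan *plan = toy_prepare_truncate(rel, new_nblocks);

    if (plan == NULL)
        return false;
    toy_execute_truncate(plan);
    return true;
}
```

Calling `toy_truncate(&rel, 4)` on a relation with 10 blocks either fails in the prepare step and leaves `rel` untouched, or drops the buffers and shortens the relation together. Whether registering the dirty segment or the smgr invalidation can really move into the prepare step is the open question from the message, and this sketch does not try to answer it.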