Re: POC: Cleaning up orphaned files using undo logs
From | Antonin Houska |
---|---|
Subject | Re: POC: Cleaning up orphaned files using undo logs |
Date | |
Msg-id | 16128.1606382901@antos |
In reply to | Re: POC: Cleaning up orphaned files using undo logs (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Wed, Nov 25, 2020 at 7:47 PM Antonin Houska <ah@cybertec.at> wrote:
> >
> > Antonin Houska <ah@cybertec.at> wrote:
> >
> > > Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > > I think we also need to maintain oldestXidHavingUndo for CLOG truncation and
> > > > transaction-wraparound. We can't allow CLOG truncation for the transaction
> > > > whose undo is not discarded as that could be required by some other
> > > > transaction.
> > >
> > > Good point. Even the discard worker might need to check the transaction status
> > > when deciding whether undo log of that transaction should be discarded.
> >
> > In the zheap code [1] I see that DiscardWorkerMain() discards undo log up to
> > OldestXmin:
> >
> >
> >     OldestXmin = GetOldestXmin(NULL, PROCARRAY_FLAGS_AUTOVACUUM |
> >                                      PROCARRAY_FLAGS_VACUUM);
> >
> >     oldestXidHavingUndo = GetXidFromEpochXid(pg_atomic_read_u64(&ProcGlobal->oldestXidWithEpochHavingUndo));
> >
> >     /*
> >      * Call the discard routine if there oldestXidHavingUndo is lagging
> >      * behind OldestXmin.
> >      */
> >     if (OldestXmin != InvalidTransactionId &&
> >         TransactionIdPrecedes(oldestXidHavingUndo, OldestXmin))
> >     {
> >         UndoDiscard(OldestXmin, &hibernate);
> >
> > and that UndoDiscard() eventually advances oldestXidHavingUndo in the shared
> > memory.
> >
> > I'm not sure this is correct because, IMO, OldestXmin can advance as soon as
> > AbortTransaction() has cleared both xid and xmin fields of the transaction's
> > PGXACT (by calling ProcArrayEndTransactionInternal). However the corresponding
> > undo log may still be waiting for processing. Am I wrong?
> >

> The UndoDiscard->UndoDiscardOneLog ensures that we don't discard the
> undo if there is a pending abort.

ok, I should have dug deeper than just reading the header comment of
UndoDiscard(). Checked now and seem to understand why no information is lost.

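To make the pending-abort rule concrete, here is a minimal sketch in C of how I read it. This is not the zheap code: the types and functions (UndoXactEntry, undo_is_discardable, discard_undo) are hypothetical, xids are treated as plain counters without wraparound, and a single undo log is modelled as an array ordered by xid. The scan stops at the first transaction whose undo must be kept, and the returned value plays the role of oldestXidHavingUndo.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model: xids are plain counters, no wraparound. */
typedef uint32_t Xid;

typedef enum { XACT_IN_PROGRESS, XACT_COMMITTED, XACT_ABORTED } XactStatus;

/* One transaction's undo within a single undo log (hypothetical type). */
typedef struct
{
    Xid        xid;
    XactStatus status;
    bool       undo_applied;    /* meaningful only for aborted transactions */
} UndoXactEntry;

/*
 * Return true if this transaction's undo may be discarded, given the oldest
 * xmin that any snapshot may still need.
 */
static bool
undo_is_discardable(const UndoXactEntry *e, Xid oldest_xmin)
{
    if (e->xid >= oldest_xmin)
        return false;           /* a snapshot may still need this undo */
    if (e->status == XACT_IN_PROGRESS)
        return false;
    if (e->status == XACT_ABORTED && !e->undo_applied)
        return false;           /* pending abort: undo not applied yet */
    return true;
}

/*
 * Scan entries (ordered by xid, oldest first) and stop at the first one that
 * cannot be discarded.  The number of discardable entries is stored in
 * *ndiscarded.  The return value models the new "oldest xid having undo":
 * the xid of the first surviving entry, clamped to oldest_xmin, or
 * oldest_xmin itself if nothing survives.
 */
static Xid
discard_undo(const UndoXactEntry *entries, size_t n, Xid oldest_xmin,
             size_t *ndiscarded)
{
    size_t i = 0;

    while (i < n && undo_is_discardable(&entries[i], oldest_xmin))
        i++;
    *ndiscarded = i;

    if (i < n && entries[i].xid < oldest_xmin)
        return entries[i].xid;
    return oldest_xmin;
}
```

In this model an aborted transaction whose undo has not been applied holds the returned value back even after OldestXmin has moved past it, which is the property the question above was about.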
Nevertheless, I see in the zheap code that the discard worker may need to scan
a lot of undo log each time. While the oldest_xid and oldest_data fields of
UndoLogControl help to skip parts of the log, I'm not sure such information
fits into the undo-record-set (URS) approach. For now I tend to try to
implement the "exhaustive" scan for the URS too, and later let's teach the
discard worker to store some metadata so that the processing is rather
incremental.

> > I think that oldestXidHavingUndo should be advanced at the time transaction
> > commits or when the undo log of an aborted transaction has been applied.
> >
>
> We can't advance oldestXidHavingUndo just on commit because later we
> need to rely on it for visibility, basically any transaction older
> than oldestXidHavingUndo should be all-visible.

ok

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com