Re: Minimal logical decoding on standbys
От | Drouvot, Bertrand |
---|---|
Тема | Re: Minimal logical decoding on standbys |
Дата | |
Msg-id | 21b700c3-eecf-2e05-a699-f8c78dd31ec7@gmail.com обсуждение исходный текст |
Ответ на | Re: Minimal logical decoding on standbys (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: Minimal logical decoding on standbys
Re: Minimal logical decoding on standbys |
Список | pgsql-hackers |
Hi, On 3/31/23 6:33 AM, Andres Freund wrote: > Hi, > > On 2023-03-30 18:23:41 +0200, Drouvot, Bertrand wrote: >> On 3/30/23 9:04 AM, Andres Freund wrote: >>> I think this commit is ready to go. Unless somebody thinks differently, I >>> think I might push it tomorrow. >> >> Great! Once done, I'll submit a new patch so that GlobalVisTestFor() can make >> use of the heap relation in vacuumRedirectAndPlaceholder() (which will be possible >> once 0001 is committed). > > Unfortunately I did find an issue doing a pre-commit review of the patch. > > The patch adds VISIBILITYMAP_IS_CATALOG_REL to xl_heap_visible.flags - but it > does not remove the bit before calling visibilitymap_set(). > > This ends up corrupting the visibilitymap, because the we'll set a bit for > another page. > Oh I see, I did not think about that (not enough experience in the VM area). Nice catch and thanks for pointing out! > On a casual read, one very well might think that VISIBILITYMAP_IS_CATALOG_REL > is a valid bit that could be set in the VM. > I see what you're saying now and do agree that's confusing. > I am thinking of instead creating a separate namespace for the "xlog only" > bits: > > /* > * To detect recovery conflicts during logical decoding on a standby, we need > * to know if a table is a user catalog table. For that we add an additional > * bit into xl_heap_visible.flags, in addition to the above. > * > * NB: VISIBILITYMAP_XLOG_* may not be passed to visibilitymap_set(). > */ > #define VISIBILITYMAP_XLOG_CATALOG_REL 0x04 > #define VISIBILITYMAP_XLOG_VALID_BITS (VISIBILITYMAP_VALID_BITS | VISIBILITYMAP_XLOG_CATALOG_REL) > > > That allows heap_xlog_visible() to do: > > Assert((xlrec->flags & VISIBILITYMAP_XLOG_VALID_BITS) == xlrec->flags); > vmbits = (xlrec->flags & VISIBILITYMAP_VALID_BITS); > > and pass vmbits istead of xlrec->flags to visibilitymap_set(). > That sounds good to me. That way you'd ensure that VISIBILITYMAP_XLOG_CATALOG_REL is not passed to visibilitymap_set(). > > I'm also thinking of splitting the patch into two. One patch to pass down the > heap relation into the new places, and another for the rest. I think that makes sense. I don't know how far you've work on the split but please find attached V54 doing such a split + implementing your VISIBILITYMAP_XLOG_VALID_BITS suggestion. > > Note that gistXLogDelete() continues to register data with two different > XLogRegisterData() calls. This will append data without any padding: > > XLogRegisterData((char *) &xlrec, SizeOfGistxlogDelete); > > /* > * We need the target-offsets array whether or not we store the whole > * buffer, to allow us to find the snapshotConflictHorizon on a standby > * server. > */ > XLogRegisterData((char *) todelete, ntodelete * sizeof(OffsetNumber)); > > > But replay now uses the new offset member: > >> @@ -177,6 +177,7 @@ gistRedoDeleteRecord(XLogReaderState *record) >> gistxlogDelete *xldata = (gistxlogDelete *) XLogRecGetData(record); >> Buffer buffer; >> Page page; >> + OffsetNumber *toDelete = xldata->offsets; >> >> /* >> * If we have any conflict processing to do, it must happen before we > > > That doesn't look right. If there's any padding before offsets, we'll afaict > read completely bogus data? > > As it turns out, there is padding: > > struct gistxlogDelete { > TransactionId snapshotConflictHorizon; /* 0 4 */ > uint16 ntodelete; /* 4 2 */ > _Bool isCatalogRel; /* 6 1 */ > > /* XXX 1 byte hole, try to pack */ > > OffsetNumber offsets[]; /* 8 0 */ > > /* size: 8, cachelines: 1, members: 4 */ > /* sum members: 7, holes: 1, sum holes: 1 */ > /* last cacheline: 8 bytes */ > }; > > > I am frankly baffled how this works at all, this should just about immediately > crash? > > Oh, I see. Hm, don't we have already the same issue for spgxlogVacuumRoot / vacuumLeafRoot() / spgRedoVacuumRoot()? > > I'm not going to commit a nontrivial change to these WAL records without some > minimal tests. > That makes fully sense. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Вложения
- v54-0007-Doc-changes-describing-details-about-logical-dec.patch
- v54-0006-New-TAP-test-for-logical-decoding-on-standby.patch
- v54-0005-Fixing-Walsender-corner-case-with-logical-decodi.patch
- v54-0004-Allow-logical-decoding-on-standby.patch
- v54-0003-Handle-logical-slot-conflicts-on-standby.patch
- v54-0002-Add-info-in-WAL-records-in-preparation-for-logic.patch
- v54-0001-Pass-down-the-heap-relation-into-the-new-places.patch
В списке pgsql-hackers по дате отправления: