Re: Inval reliability, especially for inplace updates
From:        Noah Misch
Subject:     Re: Inval reliability, especially for inplace updates
Date:
Msg-id:      20240615223718.42.nmisch@google.com
In reply to: Inval reliability, especially for inplace updates  (Noah Misch <noah@leadboat.com>)
Responses:   Re: Inval reliability, especially for inplace updates
             Re: Inval reliability, especially for inplace updates
List:        pgsql-hackers
On Wed, May 22, 2024 at 05:05:48PM -0700, Noah Misch wrote:
> https://postgr.es/m/20240512232923.aa.nmisch@google.com wrote:
> > Separable, nontrivial things not fixed in the attached patch stack:
> >
> > - Inplace update uses transactional CacheInvalidateHeapTuple().  ROLLBACK of
> >   CREATE INDEX wrongly discards the inval, leading to the relhasindex=t loss
> >   still seen in inplace-inval.spec.  CacheInvalidateRelmap() does this right.
>
> I plan to fix that like CacheInvalidateRelmap(): send the inval immediately,
> inside the critical section.  Send it in heap_xlog_inplace(), too.
> a. Within logical decoding, cease processing invalidations for inplace

I'm attaching the implementation.  This applies atop the v3 patch stack from
https://postgr.es/m/20240614003549.c2.nmisch@google.com, but the threads are
mostly orthogonal and intended for independent review.  Translating a tuple
into inval messages uses more infrastructure than relmapper, which needs just
a database ID.  Hence, this ended up more like a miniature of inval.c's
participation in the transaction commit sequence.

I waffled on whether to back-patch inplace150-inval-durability-atcommit.  The
consequences of that bug are plenty bad, but reaching them requires an error
between TransactionIdCommitTree() and AtEOXact_Inval().  I've not heard
reports of that, and I don't have a recipe for making it happen on demand.
For now, I'm leaning toward back-patch.  The main risk would be me
overlooking an LWLock deadlock scenario reachable from the new, earlier
RelCacheInitLock timing.  Alternatives for RelCacheInitLock:

- RelCacheInitLock before PreCommit_Notify(), because notify concurrency
  matters more than init file concurrency.  I chose this; a simplified
  sketch of the resulting commit ordering appears after this list.
- RelCacheInitLock after PreCommit_Notify(), because PreCommit_Notify()
  uses a heavyweight lock, giving it less risk of undetected deadlock.
- Replace RelCacheInitLock with a heavyweight lock, and keep it before
  PreCommit_Notify().
- Fold PreCommit_Inval() back into AtCommit_Inval(), accepting that EIO in
  unlink_initfile() will PANIC.
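To make the first alternative concrete, the commit sequence would run roughly
as follows.  This is a simplified sketch; the real CommitTransaction() path
has many more steps, and the exact placement is per the attached patch:

    PreCommit_Inval();          /* unlink relcache init files; takes RelCacheInitLock */
    PreCommit_Notify();         /* NOTIFY queue work, under a heavyweight lock */
    RecordTransactionCommit();  /* TransactionIdCommitTree() and WAL flush */
    AtCommit_Inval();           /* send queued invals; release RelCacheInitLock */

Holding RelCacheInitLock from before the commit record until after the invals
go out keeps other backends from recreating the init file from
about-to-be-stale catalog contents in between.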
Opinions on that?

The patch changes xl_heap_inplace of XLOG_HEAP_INPLACE; a rough sketch of the
revised record appears after this list.  For back branches, we could choose
between:

- Same change, no WAL version bump.  Standby must update before primary.
  This is best long-term, but the transition is more disruptive.  I'm
  leaning toward this one, but the second option isn't bad:
- heap_xlog_inplace() could set the shared-inval-queue overflow signal on
  every backend.  This is more wasteful, but inplace updates might be rare
  enough (~once per VACUUM) to make it tolerable.
- Use LogStandbyInvalidations() just after XLOG_HEAP_INPLACE.  This isn't
  correct if one ends recovery between the two records, but you'd need to be
  unlucky to notice.  Noticing would need a procedure like the following: a
  hot standby backend populates a relcache entry, then does DDL on the rel
  after recovery ends.
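For concreteness, the revised record would look roughly like this; the
authoritative layout is in the attached patch:

    typedef struct xl_heap_inplace
    {
        OffsetNumber offnum;    /* updated tuple's offset on page */
        Oid          dbId;      /* database of the inval messages */
        Oid          tsId;      /* tablespace of the inval messages */
        bool         relcacheInitFileInval;  /* invalidate relcache init files? */
        int          nmsgs;     /* number of shared inval messages */
        SharedInvalidationMessage msgs[FLEXIBLE_ARRAY_MEMBER];
    } xl_heap_inplace;

With the messages in the record, heap_xlog_inplace() has everything it needs
to send the invals itself, rather than relying on the transaction's commit
record.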
Future cleanup work could eliminate LogStandbyInvalidations() and the case of
!markXidCommitted && nmsgs != 0.  Currently, the src/test/regress suite still
reaches that case:

- AlterDomainDropConstraint() queues an inval even if !found; it can stop
  that.
- ON COMMIT DELETE ROWS nontransactionally rebuilds an index, which sends a
  relcache inval.  The point of that inval is, I think, to force access
  methods like btree and hash to reload the metapage copy that they store in
  rd_amcache.  Since no assigned XID implies no changes to the temp index,
  the no-XID case could simply skip the index rebuild; see the sketch after
  this list.  (temp.sql reaches this with a read-only transaction that
  selects from an ON COMMIT DELETE ROWS table.  Realistic usage will tend
  not to do that.)  ON COMMIT DELETE ROWS has another preexisting problem
  for indexes, mentioned in a code comment.
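A minimal sketch of that skip, assuming it sits in the index-rebuild step of
the ON COMMIT DELETE ROWS path (the placement and the variable name are
illustrative):

    /*
     * If this transaction never assigned an XID, it made no changes to the
     * temp table, so its indexes are still valid; skip the nontransactional
     * rebuild and the relcache inval it would send.
     */
    if (GetTopTransactionIdIfAny() != InvalidTransactionId)
        RelationTruncateIndexes(heapRelation);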
Thanks,
nm