Re: long-standing data loss bug in initial sync of logical replication
From: Tomas Vondra
Subject: Re: long-standing data loss bug in initial sync of logical replication
Msg-id: fce69fad-dccd-08e0-877e-b367e4612e95@enterprisedb.com
In reply to: Re: long-standing data loss bug in initial sync of logical replication (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On 11/18/23 02:54, Andres Freund wrote:
> Hi,
>
> On 2023-11-17 15:36:25 +0100, Tomas Vondra wrote:
>> It seems there's a long-standing data loss issue related to the initial
>> sync of tables in the built-in logical replication (publications etc.).
>
> :(

Yeah :-(

>> Overall, this looks, walks and quacks like a cache invalidation issue,
>> likely a missing invalidation somewhere in the ALTER PUBLICATION code.
>
> It could also be that pgoutput doesn't have sufficient invalidation
> handling.

I'm not sure about the details, but it can't be just about pgoutput
failing to react to some syscache invalidation. As described, just
resetting the RelationSyncEntry doesn't fix the issue - it's the
syscache that's not invalidated, IMO. But maybe that's what you mean.

> One thing that looks bogus on the DDL side is how the invalidation
> handling interacts with locking.
>
> For tables etc. the invalidation handling works because we hold a lock
> on the relation before modifying the catalog and don't release that
> lock until transaction end. That part is crucial: we queue shared
> invalidations at transaction commit, *after* the transaction is marked
> as visible, but *before* locks are released. That guarantees that any
> backend processing invalidations will see the new contents. However,
> if the lock on the modified object is released before transaction
> commit, other backends can build and use a cache entry that hasn't
> processed invalidations (invalidations are processed when acquiring
> locks).

Right.

> While there is such an object for publications, it seems to be
> acquired too late to actually do much good in a number of paths. And
> not at all in others.
>
> E.g.:
>
>     pubform = (Form_pg_publication) GETSTRUCT(tup);
>
>     /*
>      * If the publication doesn't publish changes via the root partitioned
>      * table, the partition's row filter and column list will be used. So
>      * disallow using WHERE clause and column lists on partitioned table in
>      * this case.
>      */
>     if (!pubform->puballtables && publish_via_partition_root_given &&
>         !publish_via_partition_root)
>     {
>         /*
>          * Lock the publication so nobody else can do anything with it. This
>          * prevents concurrent alter to add partitioned table(s) with WHERE
>          * clause(s) and/or column lists which we don't allow when not
>          * publishing via root.
>          */
>         LockDatabaseObject(PublicationRelationId, pubform->oid, 0,
>                            AccessShareLock);
>
> a) Another session could have modified the publication and made
>    puballtables out-of-date
>
> b) The LockDatabaseObject() uses AccessShareLock, so others can get
>    past this point as well
>
> b) seems like a copy-paste bug or such?
>
> I don't see any locking of the publication around
> RemovePublicationRelById(), for example.
>
> I might just be misunderstanding the way publication locking is
> intended to work.

I've been asking similar questions while investigating this, but the
interactions with logical decoding (which kinda happens concurrently in
terms of WAL, but not concurrently in terms of time), historical
snapshots etc. make my head spin.

>> However, while randomly poking at different things, I realized that if I
>> change the lock obtained on the relation in OpenTableList() from
>> ShareUpdateExclusiveLock to ShareRowExclusiveLock, the issue goes away.
>
> That's odd. There's cases where changing the lock level can cause
> invalidation processing to happen because there is no pre-existing
> lock for the "new" lock level, but there was for the old. But
> OpenTableList() is used when altering the publications, so I don't see
> how that connects.

Yeah, I had the idea that maybe the transaction already holds the lock
on the table, and changing this to ShareRowExclusiveLock makes it
different, possibly triggering a new invalidation or something. But I
did check with gdb, and if I set a breakpoint at OpenTableList, there
are no locks on the table.
But the effect is hard to deny - if I run the test 100 times, with
ShareUpdateExclusiveLock I get maybe 80 failures. After changing it to
ShareRowExclusiveLock I get 0. Sure, there's some randomness for cases
like this, but this is pretty unlikely.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company