Re: long-standing data loss bug in initial sync of logical replication

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: long-standing data loss bug in initial sync of logical replication
Дата
Msg-id fce69fad-dccd-08e0-877e-b367e4612e95@enterprisedb.com
обсуждение исходный текст
Ответ на Re: long-standing data loss bug in initial sync of logical replication  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On 11/18/23 02:54, Andres Freund wrote:
> Hi,
> 
> On 2023-11-17 15:36:25 +0100, Tomas Vondra wrote:
>> It seems there's a long-standing data loss issue related to the initial
>> sync of tables in the built-in logical replication (publications etc.).
> 
> :(
> 

Yeah :-(

> 
>> Overall, this looks, walks and quacks like a cache invalidation issue,
>> likely a missing invalidation somewhere in the ALTER PUBLICATION code.
> 
> It could also be be that pgoutput doesn't have sufficient invalidation
> handling.
> 

I'm not sure about the details, but it can't be just about pgoutput
failing to react to some syscache invalidation. As described, just
resetting the RelationSyncEntry doesn't fix the issue - it's the
syscache that's not invalidated, IMO. But maybe that's what you mean.

> 
> One thing that looks bogus on the DDL side is how the invalidation handling
> interacts with locking.
> 
> 
> For tables etc the invalidation handling works because we hold a lock on the
> relation before modifying the catalog and don't release that lock until
> transaction end. That part is crucial: We queue shared invalidations at
> transaction commit, *after* the transaction is marked as visible, but *before*
> locks are released. That guarantees that any backend processing invalidations
> will see the new contents.  However, if the lock on the modified object is
> released before transaction commit, other backends can build and use a cache
> entry that hasn't processed invalidations (invaliations are processed when
> acquiring locks).
> 

Right.

> While there is such an object for publications, it seems to be acquired too
> late to actually do much good in a number of paths. And not at all in others.
> 
> E.g.:
> 
>     pubform = (Form_pg_publication) GETSTRUCT(tup);
> 
>     /*
>      * If the publication doesn't publish changes via the root partitioned
>      * table, the partition's row filter and column list will be used. So
>      * disallow using WHERE clause and column lists on partitioned table in
>      * this case.
>      */
>     if (!pubform->puballtables && publish_via_partition_root_given &&
>         !publish_via_partition_root)
>         {
>         /*
>          * Lock the publication so nobody else can do anything with it. This
>          * prevents concurrent alter to add partitioned table(s) with WHERE
>          * clause(s) and/or column lists which we don't allow when not
>          * publishing via root.
>          */
>         LockDatabaseObject(PublicationRelationId, pubform->oid, 0,
>                            AccessShareLock);
> 
> a) Another session could have modified the publication and made puballtables out-of-date
> b) The LockDatabaseObject() uses AccessShareLock, so others can get past this
>    point as well
> 
> b) seems like a copy-paste bug or such?
> 
> 
> I don't see any locking of the publication around RemovePublicationRelById(),
> for example.
> 
> I might just be misunderstanding things the way publication locking is
> intended to work.
> 

I've been asking similar questions while investigating this, but the
interactions with logical decoding (which kinda happens concurrently in
terms of WAL, but not concurrently in terms of time), historical
snapshots etc. make my head spin.

> 
>> However, while randomly poking at different things, I realized that if I
>> change the lock obtained on the relation in OpenTableList() from
>> ShareUpdateExclusiveLock to ShareRowExclusiveLock, the issue goes away.
> 
> That's odd. There's cases where changing the lock level can cause invalidation
> processing to happen because there is no pre-existing lock for the "new" lock
> level, but there was for the old. But OpenTableList() is used when altering
> the publications, so I don't see how that connects.
> 

Yeah, I had the idea that maybe the transaction already holds the lock
on the table, and changing this to ShareRowExclusiveLock makes it
different, possibly triggering a new invalidation or something. But I
did check with gdb, and if I set a breakpoint at OpenTableList, there
are no locks on the table.

But the effect is hard to deny - if I run the test 100 times, with the
SharedUpdateExclusiveLock I get maybe 80 failures. After changing it to
ShareRowExclusiveLock I get 0. Sure, there's some randomness for cases
like this, but this is pretty unlikely.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dilip Kumar
Дата:
Сообщение: Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Synchronizing slots from primary to standby