Re: logical decoding and replication of sequences, take 2
From | Tomas Vondra |
---|---|
Subject | Re: logical decoding and replication of sequences, take 2 |
Date | |
Msg-id | 5e986232-7b76-c112-facd-c712f52e62b4@enterprisedb.com |
In response to | Re: logical decoding and replication of sequences, take 2 (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>) |
Responses |
Re: logical decoding and replication of sequences, take 2
Re: logical decoding and replication of sequences, take 2 |
List | pgsql-hackers |
On 7/19/23 07:42, Ashutosh Bapat wrote:
> On Wed, Jul 19, 2023 at 1:20 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>>>
>>>
>>> This behaviour doesn't need any on-disk changes or has nothing in it
>>> which prohibits us from changing it in future. So I think it's good as
>>> a v0. If required we can add the protocol option to provide more
>>> flexible behaviour.
>>>
>>
>> True, although "no on-disk changes" does not exactly mean we can just
>> change it at will. Essentially, once it gets released, the behavior is
>> somewhat fixed for the next ~5 years, until that release gets EOL. And
>> likely longer, because more features are likely to do the same thing.
>>
>> That's essentially why the patch was reverted from PG16 - I was worried
>> the elaborate protocol versioning/negotiation was not the right thing.
>
> I agree that elaborate protocol would pose roadblocks in future. It's
> better not to add that burden right now, esp. when usage is not clear.
>
> Here's behaviour and extension matrix as I understand it and as of
> the last set of patches.
>
> Publisher PG 17, Subscriber PG 17 - changes to sequences are
> replicated, downstream is capable of applying them
>
> Publisher PG 16-, Subscriber PG 17 - changes to sequences are never
> replicated
>
> Publisher PG 18+, Subscriber PG 17 - same as 17, 17 case. Any changes
> in PG 18+ need to make sure that PG 17 subscriber receives sequence
> changes irrespective of changes in protocol. That may pose some
> maintenance burden but doesn't seem to be any harder than usual
> backward compatibility burden.
>
> Moreover users can control whether changes to sequences get replicated
> or not by controlling the objects contained in publication.
>
> I don't see any downside to this. Looks all good. Please correct me if wrong.
>

I think this is an accurate description of what the current patch does.
And I think it's a reasonable behavior.
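The matrix quoted above can be written down as a small lookup, purely as an illustration. This is a minimal sketch in Python, not code from the patch series: the names `FIRST_MAJOR_WITH_SEQUENCES` and `sequence_changes_replicated` are hypothetical, the subscriber is assumed to be PG 17 (the only subscriber version the matrix covers), and the feature is assumed to ship in PG 17.

```python
# Hypothetical model of the quoted matrix; the subscriber is assumed to be PG 17.
# Assumption: sequence decoding first ships in PG 17.
FIRST_MAJOR_WITH_SEQUENCES = 17


def sequence_changes_replicated(publisher_major: int,
                                publication_has_sequences: bool = True) -> bool:
    """Return True if a PG 17 subscriber would receive sequence changes."""
    if not publication_has_sequences:
        # Users opt out simply by leaving sequences out of the publication.
        return False
    # PG 16 and older publishers never decode sequence changes;
    # PG 17 and newer are expected to keep sending them to a PG 17 subscriber.
    return publisher_major >= FIRST_MAJOR_WITH_SEQUENCES


if __name__ == "__main__":
    for publisher in (16, 17, 18):
        print(publisher, sequence_changes_replicated(publisher))
```

The PG 18+ row is the one that carries the compatibility burden: in this model it is a single comparison, but in practice it is a promise that later protocol changes keep working for a PG 17 subscriber.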
My point is that if this gets released in PG17, it'll be difficult to
change, even if it does not change on-disk format.

>>
>>> One thing I am worried about is that the subscriber will get an error
>>> only when a sequence change is decoded. All the prior changes will be
>>> replicated and applied on the subscriber. Thus by the time the user
>>> realises this mistake, they may have replicated data. At this point if
>>> they want to subscribe to a publication without sequences they will
>>> need to clean the already replicated data. But they may not be in a
>>> position to know which is which esp when the subscriber has its own
>>> data in those tables. Example,
>>>
>>> publisher: create publication pub with sequences and tables
>>> subscriber: subscribe to pub
>>> publisher: modify data in tables and sequences
>>> subscriber: replicates some data and errors out
>>> publisher: delete some data from tables
>>> publisher: create a publication pub_tab without sequences
>>> subscriber: subscribe to pub_tab
>>> subscriber: replicates the data but rows which were deleted on
>>> publisher remain on the subscriber
>>>
>>
>> Sure, but I'd argue that's correct. If the replication stream has
>> something the subscriber can't apply, what else would you do? We had
>> exactly the same thing with TRUNCATE, for example (except that it failed
>> with "unknown message" on the subscriber).
>
> When the replication starts, the publisher knows what publication is
> being used, it also knows what protocol is being used. From
> publication it knows what objects will be replicated. So we could fail
> before any changes are replicated when executing START_REPLICATION
> command. According to [1], if an object is added or removed from
> publication the subscriber is required to REFRESH SUBSCRIPTION in
> which case there will be fresh START_REPLICATION command sent. So we
> should fail the START_REPLICATION command before sending any change
> rather than when a change is being replicated.
> That's more deterministic and easy to handle. Of course any changes
> that were sent before ALTER PUBLICATION can not be reverted, but
> that's expected.
>
> Coming back to TRUNCATE, I don't think it's possible to know whether a
> publication will send a truncate downstream or not. So we can't throw
> an error before TRUNCATE change is decoded.
>
> Anyway, I think this behaviour should be documented. I didn't see this
> mentioned in PUBLICATION or SUBSCRIPTION documentation.
>

I need to think about this behavior a bit more, and maybe check how
difficult implementing it would be.

I did however look at the proposed alternative to the "created" flag.
The attached 0006 part replaces the flag with XLOG_SMGR_CREATE decoding.
The smgr_decode code needs a review (I'm not sure the
skipping/fast-forwarding part is correct), but it seems to be working
fine overall, although we need to ensure the WAL record has the correct
XID.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
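The difference between failing when a sequence change is decoded and failing at START_REPLICATION, as debated in the message, can be illustrated with a toy model. This is only a sketch under stated assumptions: the function names, the string labels for change kinds, and the error messages are all hypothetical, and nothing here mirrors the actual walsender or output plugin code.

```python
# Toy model only: change kinds are plain strings, and both functions are
# hypothetical illustrations rather than PostgreSQL internals.

def replicate_with_late_check(changes, subscriber_handles_sequences):
    """Error surfaces only when a sequence change is reached in the stream."""
    applied = []
    for kind in changes:
        if kind == "sequence" and not subscriber_handles_sequences:
            # Everything before this point has already been applied.
            return applied, "error while decoding a sequence change"
        applied.append(kind)
    return applied, None


def replicate_with_early_check(publication_kinds, changes,
                               subscriber_handles_sequences):
    """Proposed alternative: inspect the publication before sending anything."""
    if "sequence" in publication_kinds and not subscriber_handles_sequences:
        # Nothing has been sent, so there is nothing to clean up.
        return [], "error at START_REPLICATION"
    return list(changes), None


if __name__ == "__main__":
    stream = ["table", "table", "sequence", "table"]
    print(replicate_with_late_check(stream, False))
    print(replicate_with_early_check({"table", "sequence"}, stream, False))
```

With the late check the subscriber ends up holding two applied table changes before the error, which is the cleanup problem raised in the quoted example. The early check relies on the publication contents being known up front, which, as the message notes, is not the case for TRUNCATE.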
Attachments
- 0001-Make-test_decoding-ddl.out-shorter-20230719.patch
- 0002-Logical-decoding-of-sequences-20230719.patch
- 0003-Add-decoding-of-sequences-to-test_decoding-20230719.patch
- 0004-Add-decoding-of-sequences-to-built-in-repli-20230719.patch
- 0005-Simplify-protocol-versioning-20230719.patch
- 0006-replace-created-flag-with-XLOG_SMGR_CREATE-20230719.patch