Re: logical decoding and replication of sequences, take 2

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: logical decoding and replication of sequences, take 2
Дата
Msg-id 591c59af-6254-127c-3de0-6ef4445ada05@enterprisedb.com
обсуждение исходный текст
Ответ на Re: logical decoding and replication of sequences, take 2  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Ответы Re: logical decoding and replication of sequences, take 2  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Re: logical decoding and replication of sequences, take 2  (Amit Kapila <amit.kapila16@gmail.com>)
Re: logical decoding and replication of sequences, take 2  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Re: logical decoding and replication of sequences, take 2  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Список pgsql-hackers
On 7/20/23 09:24, Ashutosh Bapat wrote:
> Thanks Tomas for the updated patches.
> 
> Here are my comments on 0006 patch as well as 0002 patch.
> 
> On Wed, Jul 19, 2023 at 4:23 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> I think this is an accurate description of what the current patch does.
>> And I think it's a reasonable behavior.
>>
>> My point is that if this gets released in PG17, it'll be difficult to
>> change, even if it does not change on-disk format.
>>
> 
> Yes. I agree. And I don't see any problem even if we are not able to change it.
> 
>>
>> I need to think behavior about this a bit more, and maybe check how
>> difficult would be implementing it.
> 
> Ok.
> 
> In most of the comments and in documentation, there are some phrases
> which do not look accurate.
> 
> Change to a sequence is being refered to as "sequence increment". While
> ascending sequences are common, PostgreSQL supports descending sequences as
> well. The changes there will be decrements. But that's not the only case. A
> sequence may be restarted with an older value, in which case the change could
> increment or a decrement. I think correct usage is 'changes to sequence' or
> 'sequence changes'.
> 
> Sequence being assigned a new relfilenode is referred to as sequence
> being created. This is confusing. When an existing sequence is ALTERed, we
> will not "create" a new sequence but we will "create" a new relfilenode and
> "assign" it to that sequence.
> 
> PFA such edits in 0002 and 0006 patches. Let me know if those look
> correct. I think we
> need similar changes to the documentation and comments in other places.
> 

OK, I merged the changes into the patches, with some minor changes to
the wording etc.

>>
>> I did however look at the proposed alternative to the "created" flag.
>> The attached 0006 part ditches the flag with XLOG_SMGR_CREATE decoding.
>> The smgr_decode code needs a review (I'm not sure the
>> skipping/fast-forwarding part is correct), but it seems to be working
>> fine overall, although we need to ensure the WAL record has the correct XID.
>>
> 
> Briefly describing the patch. When decoding a XLOG_SMGR_CREATE WAL
> record, it adds the relfilenode mentioned in it to the sequences hash.
> When decoding a sequence change record, it checks whether the
> relfilenode in the WAL record is in hash table. If it is the sequence
> changes is deemed transactional otherwise non-transactional. The
> change looks good to me. It simplifies the logic to decide whether a
> sequence change is transactional or not.
> 

Right.

> In sequence_decode() we skip sequence changes when fast forwarding.
> Given that smgr_decode() is only to supplement sequence_decode(), I
> think it's correct to do the same in smgr_decode() as well. Simillarly
> skipping when we don't have full snapshot.
> 

I don't follow, smgr_decode already checks ctx->fast_forward.

> Some minor comments on 0006 patch
> 
> +    /* make sure the relfilenode creation is associated with the XID */
> +    if (XLogLogicalInfoActive())
> +        GetCurrentTransactionId();
> 
> I think this change is correct and is inline with similar changes in 0002. But
> I looked at other places from where DefineRelation() is called. For regular
> tables it is called from ProcessUtilitySlow() which in turn does not call
> GetCurrentTransactionId(). I am wondering whether we are just discovering a
> class of bugs caused by not associating an xid with a newly created
> relfilenode.
> 

Not sure. Why would it be a bug?

> +    /*
> +     * If we don't have snapshot or we are just fast-forwarding, there is no
> +     * point in decoding changes.
> +     */
> +    if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
> +        ctx->fast_forward)
> +        return;
> 
> This code block is repeated.
> 

Fixed.

> +void
> +ReorderBufferAddRelFileLocator(ReorderBuffer *rb, TransactionId xid,
> +                               RelFileLocator rlocator)
> +{
>     ... snip ...
> +
> +    /* sequence changes require a transaction */
> +    if (xid == InvalidTransactionId)
> +        return;
> 
> IIUC, with your changes in DefineSequence() in this patch, this should not
> happen. So this condition will never be true. But in case it happens, this code
> will not add the relfilelocation to the hash table and we will deem the
> sequence change as non-transactional. Isn't it better to just throw an error
> and stop replication if that (ever) happens?
> 

It can't happen for sequence, but it may happen when creating a
non-sequence relfilenode. In a way, it's a way to skip (some)
unnecessary relfilenodes.

> Also some comments on 0002 patch
> 
> @@ -405,8 +405,19 @@ fill_seq_fork_with_data(Relation rel, HeapTuple
> tuple, ForkNumber forkNum)
> 
>      /* check the comment above nextval_internal()'s equivalent call. */
>      if (RelationNeedsWAL(rel))
> +    {
>          GetTopTransactionId();
> 
> +        /*
> +         * Make sure the subtransaction has a XID assigned, so that
> the sequence
> +         * increment WAL record is properly associated with it. This
> matters for
> +         * increments of sequences created/altered in the
> transaction, which are
> +         * handled as transactional.
> +         */
> +        if (XLogLogicalInfoActive())
> +            GetCurrentTransactionId();
> +    }
> +
> 
> I think we should separately commit the changes which add a call to
> GetCurrentTransactionId(). That looks like an existing bug/anomaly
> which can stay irrespective of this patch.
> 

Not sure, but I don't see this as a bug.

> +    /*
> +     * To support logical decoding of sequences, we require the sequence
> +     * callback. We decide it here, but only check it later in the wrappers.
> +     *
> +     * XXX Isn't it wrong to define only one of those callbacks? Say we
> +     * only define the stream_sequence_cb() - that may get strange results
> +     * depending on what gets streamed. Either none or both?
> +     *
> +     * XXX Shouldn't sequence be defined at slot creation time, similar
> +     * to two_phase? Probably not.
> +     */
> 
> Do you intend to keep these XXX's as is? My previous comments on this comment
> block are in [1].
> 
> In fact, given that whether or not sequences are replicated is decided by the
> protocol version, do we really need LogicalDecodingContext::sequences? Drawing
> parallel with WAL messages, I don't think it's needed.
> 

Right. We do that for two_phase because you can override that when
creating the subscription - sequences allowed that too initially, but
then we ditched that. So I don't think we need this.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Daniel Gustafsson
Дата:
Сообщение: Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt"
Следующее
От: Daniel Gustafsson
Дата:
Сообщение: Re: Printing backtrace of postgres processes