Re: How can end users know the cause of LR slot sync delays?
| От | shveta malik | 
|---|---|
| Тема | Re: How can end users know the cause of LR slot sync delays? | 
| Дата | |
| Msg-id | CAJpy0uC9WsJhc-qeFmz_JTPjW1vH3Zm+zS6jX0PKTY6vtEp38w@mail.gmail.com обсуждение исходный текст  | 
		
| Ответ на | Re: How can end users know the cause of LR slot sync delays? (shveta malik <shveta.malik@gmail.com>) | 
| Список | pgsql-hackers | 
On Mon, Nov 3, 2025 at 3:14 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > I have attached the latest patch. > > > > Thanks. I have started going through it. > > I’m not sure if this has already been discussed; I couldn’t find any > mention of it in the thread. Why don’t we persist > 'slot_sync_skip_reason' (it is outside of > ReplicationSlotPersistentData)? If a slot wasn’t synced during the > last cycle and the server restarts, it would be helpful to know the > reason it wasn’t synced prior to the node restart. > Please find a few more comments: 1) last_slot_sync_skip Will 'last_slotsync_skip_at' be a better name? Since we refer worker as slotsync in docs, I feel slotsync seems a more natural choice than slot_sync. Also '_at' gives clarity that it is about time rather than a boolean (which currently it seems like). Same goes for slot_sync_skip_count and slot_sync_skip_reason. Shall these be slotsync_skip_count and slotsync_skip_reason. 2) +update_slot_sync_skip_stats(ReplicationSlot *slot, SlotSyncSkipReason skip_reason, + bool acquire_slot) It looks like there is only one caller that passes acquire_slot as true, while all others pass false. Instead of keeping the acquire_slot parameter, will it be better if we remove it and add an Assert(MyReplication) to ensure the slot is already acquired? We can add a comment stating that this function expects the slot to be acquired by the caller. The one caller that currently passes acquire_slot = true can acquire the slot explicitly before invoking this function. Thoughts? 3) In update_and_persist_local_synced_slot(), we get both 'found_consistent_snapshot' and 'remote_slot_precedes' from update_local_synced_slot(). But skipsync-reason for 'remote_slot_precedes' is updated inside update_local_synced_slot() while skipsync-reason for '!found_consistent_snapshot' is updated in caller update_and_persist_local_synced_slot. Is there a reason for that? 4) What about the case where the slot is invalidated and sync is skipped? I do not see any stats for that. See 'Skip the sync of an invalidated slot' in synchronize_one_slot(). If it is already discussed and concluded, please add a comment. thanks Shveta
В списке pgsql-hackers по дате отправления: