Обсуждение: How can end users know the cause of LR slot sync delays?
Hi,
We have seen cases where slot synchronization gets delayed, for example when the slot is behind the failover standby or vice versa, and the slot sync worker has to wait for one to catch up with the other. During this waiting period, users querying pg_replication_slots can only see whether the slot has been synchronized or not. If it has already synchronized, that’s fine, but if synchronization is taking longer, users would naturally want to understand the reason for the delay.
Is there a way for end users to know the cause of slot synchronization delays, so they can take appropriate actions to speed it up?
I understand that server logs are emitted in such cases, but logs are not something end users would want to check regularly. Moreover, since logging is configuration-based, relevant messages may sometimes be skipped or suppressed.
Thanks & Regards,
Ashutosh Sharma.
Is there a way for end users to know the cause of slot synchronization delays, so they can take appropriate actions to speed it up?
I understand that server logs are emitted in such cases, but logs are not something end users would want to check regularly. Moreover, since logging is configuration-based, relevant messages may sometimes be skipped or suppressed.
Thanks & Regards,
Ashutosh Sharma.
On Thu, Aug 28, 2025 at 11:07 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > We have seen cases where slot synchronization gets delayed, for example when the slot is behind the failover standby orvice versa, and the slot sync worker has to wait for one to catch up with the other. During this waiting period, usersquerying pg_replication_slots can only see whether the slot has been synchronized or not. If it has already synchronized,that’s fine, but if synchronization is taking longer, users would naturally want to understand the reason forthe delay. > > Is there a way for end users to know the cause of slot synchronization delays, so they can take appropriate actions tospeed it up? > > I understand that server logs are emitted in such cases, but logs are not something end users would want to check regularly.Moreover, since logging is configuration-based, relevant messages may sometimes be skipped or suppressed. > Currently, the way to see the reason for sync skip is LOGs but I think it is better to add a new column like sync_skip_reason in pg_replication_slots. This can show the reasons like standby_LSN_ahead_remote_LSN. I think ideally users can compare standby's slot LSN/XMIN with remote_slot being synced. Do you have any better ideas? -- With Regards, Amit Kapila.
On Thu, 28 Aug 2025 at 14:56, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Aug 28, 2025 at 11:07 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > We have seen cases where slot synchronization gets delayed, for example when the slot is behind the failover standbyor vice versa, and the slot sync worker has to wait for one to catch up with the other. During this waiting period,users querying pg_replication_slots can only see whether the slot has been synchronized or not. If it has alreadysynchronized, that’s fine, but if synchronization is taking longer, users would naturally want to understand the reasonfor the delay. > > > > Is there a way for end users to know the cause of slot synchronization delays, so they can take appropriate actions tospeed it up? > > > > I understand that server logs are emitted in such cases, but logs are not something end users would want to check regularly.Moreover, since logging is configuration-based, relevant messages may sometimes be skipped or suppressed. > > > > Currently, the way to see the reason for sync skip is LOGs but I think > it is better to add a new column like sync_skip_reason in > pg_replication_slots. This can show the reasons like > standby_LSN_ahead_remote_LSN. I think ideally users can compare > standby's slot LSN/XMIN with remote_slot being synced. Do you have any > better ideas? > How about something like pg_stat_progress_replication_slot with remote LSN/standby LSN/catalog XID etc? Wouldn't this be in sync with all other debug pg_stat_progress* views and thus more Postgres-y? -- Best regards, Kirill Reshke
On Thu, Aug 28, 2025 at 3:29 PM Kirill Reshke <reshkekirill@gmail.com> wrote: > > On Thu, 28 Aug 2025 at 14:56, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Aug 28, 2025 at 11:07 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > > > We have seen cases where slot synchronization gets delayed, for example when the slot is behind the failover standbyor vice versa, and the slot sync worker has to wait for one to catch up with the other. During this waiting period,users querying pg_replication_slots can only see whether the slot has been synchronized or not. If it has alreadysynchronized, that’s fine, but if synchronization is taking longer, users would naturally want to understand the reasonfor the delay. > > > > > > Is there a way for end users to know the cause of slot synchronization delays, so they can take appropriate actionsto speed it up? > > > > > > I understand that server logs are emitted in such cases, but logs are not something end users would want to check regularly.Moreover, since logging is configuration-based, relevant messages may sometimes be skipped or suppressed. > > > > > > > Currently, the way to see the reason for sync skip is LOGs but I think > > it is better to add a new column like sync_skip_reason in > > pg_replication_slots. This can show the reasons like > > standby_LSN_ahead_remote_LSN. I think ideally users can compare > > standby's slot LSN/XMIN with remote_slot being synced. Do you have any > > better ideas? > > > > How about something like pg_stat_progress_replication_slot with remote > LSN/standby LSN/catalog XID etc? > Wouldn't this be in sync with all other debug pg_stat_progress* views > and thus more Postgres-y? > Yes, that is another option. I am a little worried that it is not always the sync lags behind, so having a separate view just for sync progress may be too much. Yet another option is existing view pg_stat_replication_slots but it seems sync progress doesn't directly match there. For example, we can add a counter sync_skipped, time of last sync_skip, and last_sync_skip_reason that could be sufficient to dig the problem further. -- With Regards, Amit Kapila.
Hi Amit,
On Thu, Aug 28, 2025 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Thu, Aug 28, 2025 at 11:07 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> We have seen cases where slot synchronization gets delayed, for example when the slot is behind the failover standby or vice versa, and the slot sync worker has to wait for one to catch up with the other. During this waiting period, users querying pg_replication_slots can only see whether the slot has been synchronized or not. If it has already synchronized, that’s fine, but if synchronization is taking longer, users would naturally want to understand the reason for the delay.
>
> Is there a way for end users to know the cause of slot synchronization delays, so they can take appropriate actions to speed it up?
>
> I understand that server logs are emitted in such cases, but logs are not something end users would want to check regularly. Moreover, since logging is configuration-based, relevant messages may sometimes be skipped or suppressed.
>
Currently, the way to see the reason for sync skip is LOGs but I think
it is better to add a new column like sync_skip_reason in
pg_replication_slots. This can show the reasons like
standby_LSN_ahead_remote_LSN. I think ideally users can compare
standby's slot LSN/XMIN with remote_slot being synced. Do you have any
better ideas?
Step 1: Define an enum for all possible reasons a slot persistence was skipped.
/*
* Reasons why a replication slot sync was skipped.
*/
typedef enum ReplicationSlotSyncSkipReason
{
RS_SYNC_SKIP_NONE = 0, /* No skip */
RS_SYNC_SKIP_REMOTE_BEHIND = (1 << 0), /* Remote slot is behind local reserved LSN */
RS_SYNC_SKIP_DATA_LOSS = (1 << 1), /* Local slot ahead of remote, risk of data loss */
RS_SYNC_SKIP_NO_SNAPSHOT = (1 << 2) /* Standby could not build a consistent snapshot */
} ReplicationSlotSyncSkipReason;
/*
* Reasons why a replication slot sync was skipped.
*/
typedef enum ReplicationSlotSyncSkipReason
{
RS_SYNC_SKIP_NONE = 0, /* No skip */
RS_SYNC_SKIP_REMOTE_BEHIND = (1 << 0), /* Remote slot is behind local reserved LSN */
RS_SYNC_SKIP_DATA_LOSS = (1 << 1), /* Local slot ahead of remote, risk of data loss */
RS_SYNC_SKIP_NO_SNAPSHOT = (1 << 2) /* Standby could not build a consistent snapshot */
} ReplicationSlotSyncSkipReason;
--
Step 2: Introduce new column to pg_replication_slots to store the skip reason
/* Inside pg_replication_slots table */
ReplicationSlotSyncSkipReason slot_sync_skip_reason;
--
Step 3: Function to convert enum to human-readable string that can be stored in pg_replication_slots.
/*
* Convert ReplicationSlotSyncSkipReason bitmask to human-readable string.
*
* Returns a palloc'd string; caller is responsible for freeing it.
*/
static char *
replication_slot_sync_skip_reason_str(ReplicationSlotSyncSkipReason reason)
{
StringInfoData buf;
initStringInfo(&buf);
if (reason == RS_SYNC_SKIP_NONE)
{
appendStringInfoString(&buf, "none");
return buf.data;
}
if (reason & RS_SYNC_SKIP_REMOTE_BEHIND)
appendStringInfoString(&buf, "remote_behind|");
if (reason & RS_SYNC_SKIP_DATA_LOSS)
appendStringInfoString(&buf, "data_loss|");
if (reason & RS_SYNC_SKIP_NO_SNAPSHOT)
appendStringInfoString(&buf, "no_snapshot|");
/* Remove trailing '|' */
if (buf.len > 0 && buf.data[buf.len - 1] == '|')
buf.data[buf.len - 1] = '\0';
return buf.data;
}
--
Step 4: Capture slot_sync_skip_reason whenever the relevant LOG messages are generated, primarily inside update_local_synced_slot or update_and_persist_local_synced_slot. This value will can later be persisted in the pg_replication_slots catalog.
Step 2: Introduce new column to pg_replication_slots to store the skip reason
/* Inside pg_replication_slots table */
ReplicationSlotSyncSkipReason slot_sync_skip_reason;
--
Step 3: Function to convert enum to human-readable string that can be stored in pg_replication_slots.
/*
* Convert ReplicationSlotSyncSkipReason bitmask to human-readable string.
*
* Returns a palloc'd string; caller is responsible for freeing it.
*/
static char *
replication_slot_sync_skip_reason_str(ReplicationSlotSyncSkipReason reason)
{
StringInfoData buf;
initStringInfo(&buf);
if (reason == RS_SYNC_SKIP_NONE)
{
appendStringInfoString(&buf, "none");
return buf.data;
}
if (reason & RS_SYNC_SKIP_REMOTE_BEHIND)
appendStringInfoString(&buf, "remote_behind|");
if (reason & RS_SYNC_SKIP_DATA_LOSS)
appendStringInfoString(&buf, "data_loss|");
if (reason & RS_SYNC_SKIP_NO_SNAPSHOT)
appendStringInfoString(&buf, "no_snapshot|");
/* Remove trailing '|' */
if (buf.len > 0 && buf.data[buf.len - 1] == '|')
buf.data[buf.len - 1] = '\0';
return buf.data;
}
--
Step 4: Capture slot_sync_skip_reason whenever the relevant LOG messages are generated, primarily inside update_local_synced_slot or update_and_persist_local_synced_slot. This value will can later be persisted in the pg_replication_slots catalog.
--
Please let me know if you have any objections. I’ll share the wip patch in a few days.
--
With Regards,
Ashutosh Sharma.
Hi Ashutosh, I am also interested in this thread. And was working on a patch for it. On Wed, 3 Sept 2025 at 17:52, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > Hi Amit, > > On Thu, Aug 28, 2025 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> On Thu, Aug 28, 2025 at 11:07 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >> > >> > We have seen cases where slot synchronization gets delayed, for example when the slot is behind the failover standbyor vice versa, and the slot sync worker has to wait for one to catch up with the other. During this waiting period,users querying pg_replication_slots can only see whether the slot has been synchronized or not. If it has alreadysynchronized, that’s fine, but if synchronization is taking longer, users would naturally want to understand the reasonfor the delay. >> > >> > Is there a way for end users to know the cause of slot synchronization delays, so they can take appropriate actionsto speed it up? >> > >> > I understand that server logs are emitted in such cases, but logs are not something end users would want to check regularly.Moreover, since logging is configuration-based, relevant messages may sometimes be skipped or suppressed. >> > >> >> Currently, the way to see the reason for sync skip is LOGs but I think >> it is better to add a new column like sync_skip_reason in >> pg_replication_slots. This can show the reasons like >> standby_LSN_ahead_remote_LSN. I think ideally users can compare >> standby's slot LSN/XMIN with remote_slot being synced. Do you have any >> better ideas? >> > > I have similar thoughts, but for clarity, I’d like to outline some of the key steps I plan to take: > > Step 1: Define an enum for all possible reasons a slot persistence was skipped. > > /* > * Reasons why a replication slot sync was skipped. > */ > typedef enum ReplicationSlotSyncSkipReason > { > RS_SYNC_SKIP_NONE = 0, /* No skip */ > > RS_SYNC_SKIP_REMOTE_BEHIND = (1 << 0), /* Remote slot is behind local reserved LSN */ > > RS_SYNC_SKIP_DATA_LOSS = (1 << 1), /* Local slot ahead of remote, risk of data loss */ > > RS_SYNC_SKIP_NO_SNAPSHOT = (1 << 2) /* Standby could not build a consistent snapshot */ > } ReplicationSlotSyncSkipReason; > > -- > I think we should also add the case when "remote_slot->confirmed_lsn > latestFlushPtr" (WAL corresponding to the confirmed lsn on remote slot is still not flushed on the Standby). In this case as well we are skipping the slot sync. > Step 2: Introduce new column to pg_replication_slots to store the skip reason > > /* Inside pg_replication_slots table */ > ReplicationSlotSyncSkipReason slot_sync_skip_reason; > > -- > As per the discussion [1], I think it is more of stat related data and we should add it in the pg_stat_replication_slots view. Also we can add columns for 'slot sync skip count' and 'last slot sync skip'. Thoughts? > Step 3: Function to convert enum to human-readable string that can be stored in pg_replication_slots. > > /* > * Convert ReplicationSlotSyncSkipReason bitmask to human-readable string. > * > * Returns a palloc'd string; caller is responsible for freeing it. > */ > static char * > replication_slot_sync_skip_reason_str(ReplicationSlotSyncSkipReason reason) > { > StringInfoData buf; > initStringInfo(&buf); > > if (reason == RS_SYNC_SKIP_NONE) > { > appendStringInfoString(&buf, "none"); > return buf.data; > } > > if (reason & RS_SYNC_SKIP_REMOTE_BEHIND) > appendStringInfoString(&buf, "remote_behind|"); > if (reason & RS_SYNC_SKIP_DATA_LOSS) > appendStringInfoString(&buf, "data_loss|"); > if (reason & RS_SYNC_SKIP_NO_SNAPSHOT) > appendStringInfoString(&buf, "no_snapshot|"); > > /* Remove trailing '|' */ > if (buf.len > 0 && buf.data[buf.len - 1] == '|') > buf.data[buf.len - 1] = '\0'; > > return buf.data; > } > > -- > Why are we showing the cause of the slot sync delay as an aggregate of all causes occuring? I thought we should show the reason for the last slot sync delay? > Step 4: Capture slot_sync_skip_reason whenever the relevant LOG messages are generated, primarily inside update_local_synced_slotor update_and_persist_local_synced_slot. This value will can later be persisted in the pg_replication_slotscatalog. > > -- > > Please let me know if you have any objections. I’ll share the wip patch in a few days. > > -- I have attached a patch which I have worked on. Thanks, Shlok Kyal
Вложения
Hi Shlok,
Good to hear that you’re also interested in working on this task.
On Thu, Sep 4, 2025 at 8:26 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
Hi Ashutosh,
I am also interested in this thread. And was working on a patch for it.
On Wed, 3 Sept 2025 at 17:52, Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>
> Hi Amit,
>
> On Thu, Aug 28, 2025 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Thu, Aug 28, 2025 at 11:07 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote:
>> >
>> > We have seen cases where slot synchronization gets delayed, for example when the slot is behind the failover standby or vice versa, and the slot sync worker has to wait for one to catch up with the other. During this waiting period, users querying pg_replication_slots can only see whether the slot has been synchronized or not. If it has already synchronized, that’s fine, but if synchronization is taking longer, users would naturally want to understand the reason for the delay.
>> >
>> > Is there a way for end users to know the cause of slot synchronization delays, so they can take appropriate actions to speed it up?
>> >
>> > I understand that server logs are emitted in such cases, but logs are not something end users would want to check regularly. Moreover, since logging is configuration-based, relevant messages may sometimes be skipped or suppressed.
>> >
>>
>> Currently, the way to see the reason for sync skip is LOGs but I think
>> it is better to add a new column like sync_skip_reason in
>> pg_replication_slots. This can show the reasons like
>> standby_LSN_ahead_remote_LSN. I think ideally users can compare
>> standby's slot LSN/XMIN with remote_slot being synced. Do you have any
>> better ideas?
>>
>
> I have similar thoughts, but for clarity, I’d like to outline some of the key steps I plan to take:
>
> Step 1: Define an enum for all possible reasons a slot persistence was skipped.
>
> /*
> * Reasons why a replication slot sync was skipped.
> */
> typedef enum ReplicationSlotSyncSkipReason
> {
> RS_SYNC_SKIP_NONE = 0, /* No skip */
>
> RS_SYNC_SKIP_REMOTE_BEHIND = (1 << 0), /* Remote slot is behind local reserved LSN */
>
> RS_SYNC_SKIP_DATA_LOSS = (1 << 1), /* Local slot ahead of remote, risk of data loss */
>
> RS_SYNC_SKIP_NO_SNAPSHOT = (1 << 2) /* Standby could not build a consistent snapshot */
> } ReplicationSlotSyncSkipReason;
>
> --
>
I think we should also add the case when "remote_slot->confirmed_lsn >
latestFlushPtr" (WAL corresponding to the confirmed lsn on remote slot
is still not flushed on the Standby). In this case as well we are
skipping the slot sync.
Yes, we can include this case as well.
> Step 2: Introduce new column to pg_replication_slots to store the skip reason
>
> /* Inside pg_replication_slots table */
> ReplicationSlotSyncSkipReason slot_sync_skip_reason;
>
> --
>
As per the discussion [1], I think it is more of stat related data and
we should add it in the pg_stat_replication_slots view. Also we can
add columns for 'slot sync skip count' and 'last slot sync skip'.
Thoughts?
It’s not a bad choice, but what makes it a bit confusing for me is that some of the slot sync information is stored in pg_replication_slots, while some is in pg_stat_replication_slots.
Is there a possibility that when an end user queries pg_replication_slots, it shows a particular slot as synced, but querying pg_stat_replication_slots instead reveals a sync skip reason, or the other way around?
Moreover, these views are primary data sources for end users, and the information is critical for their operations. Splitting related information across multiple views could increase the complexity of their queries.
Is there a possibility that when an end user queries pg_replication_slots, it shows a particular slot as synced, but querying pg_stat_replication_slots instead reveals a sync skip reason, or the other way around?
Moreover, these views are primary data sources for end users, and the information is critical for their operations. Splitting related information across multiple views could increase the complexity of their queries.
> Step 3: Function to convert enum to human-readable string that can be stored in pg_replication_slots.
>
> /*
> * Convert ReplicationSlotSyncSkipReason bitmask to human-readable string.
> *
> * Returns a palloc'd string; caller is responsible for freeing it.
> */
> static char *
> replication_slot_sync_skip_reason_str(ReplicationSlotSyncSkipReason reason)
> {
> StringInfoData buf;
> initStringInfo(&buf);
>
> if (reason == RS_SYNC_SKIP_NONE)
> {
> appendStringInfoString(&buf, "none");
> return buf.data;
> }
>
> if (reason & RS_SYNC_SKIP_REMOTE_BEHIND)
> appendStringInfoString(&buf, "remote_behind|");
> if (reason & RS_SYNC_SKIP_DATA_LOSS)
> appendStringInfoString(&buf, "data_loss|");
> if (reason & RS_SYNC_SKIP_NO_SNAPSHOT)
> appendStringInfoString(&buf, "no_snapshot|");
>
> /* Remove trailing '|' */
> if (buf.len > 0 && buf.data[buf.len - 1] == '|')
> buf.data[buf.len - 1] = '\0';
>
> return buf.data;
> }
>
> --
>
Why are we showing the cause of the slot sync delay as an aggregate of
all causes occuring? I thought we should show the reason for the last
slot sync delay?
Yes we should just be showing the reason for the last sync skip, no aggregation is needed here.
> Step 4: Capture slot_sync_skip_reason whenever the relevant LOG messages are generated, primarily inside update_local_synced_slot or update_and_persist_local_synced_slot. This value will can later be persisted in the pg_replication_slots catalog.
>
> --
>
> Please let me know if you have any objections. I’ll share the wip patch in a few days.
>
> --
I have attached a patch which I have worked on.
Thanks, I will look into it, in fact I have already looked into it, but before I make any comments, I think maybe we should try to finalize the approach first.
--
With Regards,
Ashutosh Sharma.
On Fri, Sep 5, 2025 at 12:50 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > Good to hear that you’re also interested in working on this task. > > On Thu, Sep 4, 2025 at 8:26 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: >> >> Hi Ashutosh, >> >> I am also interested in this thread. And was working on a patch for it. >> >> On Wed, 3 Sept 2025 at 17:52, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >> > >> > Hi Amit, >> > >> > On Thu, Aug 28, 2025 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> >> >> On Thu, Aug 28, 2025 at 11:07 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: >> >> > >> >> > We have seen cases where slot synchronization gets delayed, for example when the slot is behind the failover standbyor vice versa, and the slot sync worker has to wait for one to catch up with the other. During this waiting period,users querying pg_replication_slots can only see whether the slot has been synchronized or not. If it has alreadysynchronized, that’s fine, but if synchronization is taking longer, users would naturally want to understand the reasonfor the delay. >> >> > >> >> > Is there a way for end users to know the cause of slot synchronization delays, so they can take appropriate actionsto speed it up? >> >> > >> >> > I understand that server logs are emitted in such cases, but logs are not something end users would want to checkregularly. Moreover, since logging is configuration-based, relevant messages may sometimes be skipped or suppressed. >> >> > >> >> >> >> Currently, the way to see the reason for sync skip is LOGs but I think >> >> it is better to add a new column like sync_skip_reason in >> >> pg_replication_slots. This can show the reasons like >> >> standby_LSN_ahead_remote_LSN. I think ideally users can compare >> >> standby's slot LSN/XMIN with remote_slot being synced. Do you have any >> >> better ideas? >> >> >> > >> > I have similar thoughts, but for clarity, I’d like to outline some of the key steps I plan to take: >> > >> > Step 1: Define an enum for all possible reasons a slot persistence was skipped. >> > >> > /* >> > * Reasons why a replication slot sync was skipped. >> > */ >> > typedef enum ReplicationSlotSyncSkipReason >> > { >> > RS_SYNC_SKIP_NONE = 0, /* No skip */ >> > >> > RS_SYNC_SKIP_REMOTE_BEHIND = (1 << 0), /* Remote slot is behind local reserved LSN */ >> > >> > RS_SYNC_SKIP_DATA_LOSS = (1 << 1), /* Local slot ahead of remote, risk of data loss */ >> > >> > RS_SYNC_SKIP_NO_SNAPSHOT = (1 << 2) /* Standby could not build a consistent snapshot */ >> > } ReplicationSlotSyncSkipReason; >> > >> > -- >> > >> I think we should also add the case when "remote_slot->confirmed_lsn > >> latestFlushPtr" (WAL corresponding to the confirmed lsn on remote slot >> is still not flushed on the Standby). In this case as well we are >> skipping the slot sync. > > > Yes, we can include this case as well. > >> >> >> > Step 2: Introduce new column to pg_replication_slots to store the skip reason >> > >> > /* Inside pg_replication_slots table */ >> > ReplicationSlotSyncSkipReason slot_sync_skip_reason; >> > >> > -- >> > >> As per the discussion [1], I think it is more of stat related data and >> we should add it in the pg_stat_replication_slots view. Also we can >> add columns for 'slot sync skip count' and 'last slot sync skip'. >> Thoughts? > > > It’s not a bad choice, but what makes it a bit confusing for me is that some of the slot sync information is stored inpg_replication_slots, while some is in pg_stat_replication_slots. > How about keeping sync_skip_reason in pg_replication_slots and sync_skip_count in pg_stat_replication_slots? -- With Regards, Amit Kapila.
Hi Amit, On Sat, Sep 6, 2025 at 10:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Sep 5, 2025 at 12:50 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > Good to hear that you’re also interested in working on this task. > > > > On Thu, Sep 4, 2025 at 8:26 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > >> > >> Hi Ashutosh, > >> > >> I am also interested in this thread. And was working on a patch for it. > >> > >> On Wed, 3 Sept 2025 at 17:52, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > >> > > >> > Hi Amit, > >> > > >> > On Thu, Aug 28, 2025 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > >> >> > >> >> On Thu, Aug 28, 2025 at 11:07 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > >> >> > > >> >> > We have seen cases where slot synchronization gets delayed, for example when the slot is behind the failover standbyor vice versa, and the slot sync worker has to wait for one to catch up with the other. During this waiting period,users querying pg_replication_slots can only see whether the slot has been synchronized or not. If it has alreadysynchronized, that’s fine, but if synchronization is taking longer, users would naturally want to understand the reasonfor the delay. > >> >> > > >> >> > Is there a way for end users to know the cause of slot synchronization delays, so they can take appropriate actionsto speed it up? > >> >> > > >> >> > I understand that server logs are emitted in such cases, but logs are not something end users would want to checkregularly. Moreover, since logging is configuration-based, relevant messages may sometimes be skipped or suppressed. > >> >> > > >> >> > >> >> Currently, the way to see the reason for sync skip is LOGs but I think > >> >> it is better to add a new column like sync_skip_reason in > >> >> pg_replication_slots. This can show the reasons like > >> >> standby_LSN_ahead_remote_LSN. I think ideally users can compare > >> >> standby's slot LSN/XMIN with remote_slot being synced. Do you have any > >> >> better ideas? > >> >> > >> > > >> > I have similar thoughts, but for clarity, I’d like to outline some of the key steps I plan to take: > >> > > >> > Step 1: Define an enum for all possible reasons a slot persistence was skipped. > >> > > >> > /* > >> > * Reasons why a replication slot sync was skipped. > >> > */ > >> > typedef enum ReplicationSlotSyncSkipReason > >> > { > >> > RS_SYNC_SKIP_NONE = 0, /* No skip */ > >> > > >> > RS_SYNC_SKIP_REMOTE_BEHIND = (1 << 0), /* Remote slot is behind local reserved LSN */ > >> > > >> > RS_SYNC_SKIP_DATA_LOSS = (1 << 1), /* Local slot ahead of remote, risk of data loss */ > >> > > >> > RS_SYNC_SKIP_NO_SNAPSHOT = (1 << 2) /* Standby could not build a consistent snapshot */ > >> > } ReplicationSlotSyncSkipReason; > >> > > >> > -- > >> > > >> I think we should also add the case when "remote_slot->confirmed_lsn > > >> latestFlushPtr" (WAL corresponding to the confirmed lsn on remote slot > >> is still not flushed on the Standby). In this case as well we are > >> skipping the slot sync. > > > > > > Yes, we can include this case as well. > > > >> > >> > >> > Step 2: Introduce new column to pg_replication_slots to store the skip reason > >> > > >> > /* Inside pg_replication_slots table */ > >> > ReplicationSlotSyncSkipReason slot_sync_skip_reason; > >> > > >> > -- > >> > > >> As per the discussion [1], I think it is more of stat related data and > >> we should add it in the pg_stat_replication_slots view. Also we can > >> add columns for 'slot sync skip count' and 'last slot sync skip'. > >> Thoughts? > > > > > > It’s not a bad choice, but what makes it a bit confusing for me is that some of the slot sync information is stored inpg_replication_slots, while some is in pg_stat_replication_slots. > > > > How about keeping sync_skip_reason in pg_replication_slots and > sync_skip_count in pg_stat_replication_slots? > I think we can do that, since sync_skip_reason appears to be a descriptive metadata rather than statistical data like slot_sync_skip_count and last_slot_sync_skip. However, it's also true that all three pieces of data are transient by nature - they will just be present in the runtime. -- With Regards, Ashutosh Sharma.
Hi Amit, On Mon, Sep 8, 2025 at 9:52 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > Hi Amit, > > On Sat, Sep 6, 2025 at 10:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Sep 5, 2025 at 12:50 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > > > Good to hear that you’re also interested in working on this task. > > > > > > On Thu, Sep 4, 2025 at 8:26 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > >> > > >> Hi Ashutosh, > > >> > > >> I am also interested in this thread. And was working on a patch for it. > > >> > > >> On Wed, 3 Sept 2025 at 17:52, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > >> > > > >> > Hi Amit, > > >> > > > >> > On Thu, Aug 28, 2025 at 3:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > >> >> > > >> >> On Thu, Aug 28, 2025 at 11:07 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > >> >> > > > >> >> > We have seen cases where slot synchronization gets delayed, for example when the slot is behind the failoverstandby or vice versa, and the slot sync worker has to wait for one to catch up with the other. During this waitingperiod, users querying pg_replication_slots can only see whether the slot has been synchronized or not. If it hasalready synchronized, that’s fine, but if synchronization is taking longer, users would naturally want to understand thereason for the delay. > > >> >> > > > >> >> > Is there a way for end users to know the cause of slot synchronization delays, so they can take appropriate actionsto speed it up? > > >> >> > > > >> >> > I understand that server logs are emitted in such cases, but logs are not something end users would want to checkregularly. Moreover, since logging is configuration-based, relevant messages may sometimes be skipped or suppressed. > > >> >> > > > >> >> > > >> >> Currently, the way to see the reason for sync skip is LOGs but I think > > >> >> it is better to add a new column like sync_skip_reason in > > >> >> pg_replication_slots. This can show the reasons like > > >> >> standby_LSN_ahead_remote_LSN. I think ideally users can compare > > >> >> standby's slot LSN/XMIN with remote_slot being synced. Do you have any > > >> >> better ideas? > > >> >> > > >> > > > >> > I have similar thoughts, but for clarity, I’d like to outline some of the key steps I plan to take: > > >> > > > >> > Step 1: Define an enum for all possible reasons a slot persistence was skipped. > > >> > > > >> > /* > > >> > * Reasons why a replication slot sync was skipped. > > >> > */ > > >> > typedef enum ReplicationSlotSyncSkipReason > > >> > { > > >> > RS_SYNC_SKIP_NONE = 0, /* No skip */ > > >> > > > >> > RS_SYNC_SKIP_REMOTE_BEHIND = (1 << 0), /* Remote slot is behind local reserved LSN */ > > >> > > > >> > RS_SYNC_SKIP_DATA_LOSS = (1 << 1), /* Local slot ahead of remote, risk of data loss */ > > >> > > > >> > RS_SYNC_SKIP_NO_SNAPSHOT = (1 << 2) /* Standby could not build a consistent snapshot */ > > >> > } ReplicationSlotSyncSkipReason; > > >> > > > >> > -- > > >> > > > >> I think we should also add the case when "remote_slot->confirmed_lsn > > > >> latestFlushPtr" (WAL corresponding to the confirmed lsn on remote slot > > >> is still not flushed on the Standby). In this case as well we are > > >> skipping the slot sync. > > > > > > > > > Yes, we can include this case as well. > > > > > >> > > >> > > >> > Step 2: Introduce new column to pg_replication_slots to store the skip reason > > >> > > > >> > /* Inside pg_replication_slots table */ > > >> > ReplicationSlotSyncSkipReason slot_sync_skip_reason; > > >> > > > >> > -- > > >> > > > >> As per the discussion [1], I think it is more of stat related data and > > >> we should add it in the pg_stat_replication_slots view. Also we can > > >> add columns for 'slot sync skip count' and 'last slot sync skip'. > > >> Thoughts? > > > > > > > > > It’s not a bad choice, but what makes it a bit confusing for me is that some of the slot sync information is storedin pg_replication_slots, while some is in pg_stat_replication_slots. > > > > > > > How about keeping sync_skip_reason in pg_replication_slots and > > sync_skip_count in pg_stat_replication_slots? > > > > I think we can do that, since sync_skip_reason appears to be a > descriptive metadata rather than statistical data like > slot_sync_skip_count and last_slot_sync_skip. However, it's also true > that all three pieces of data are transient by nature - they will just > be present in the runtime. > After spending some more time on this, I found that maintaining sync_skip_reason in pg_replication_slots would make the code changes a bit messy and harder to maintain. I think storing all 3 pieces of information - sync_skip_reason, sync_skip_count, and last_sync_skip in pg_stat_replication_slots would be a cleaner solution. This way, all the sync-related info stays together and the code remains straightforward. @Sholk, do let me know if you agree with this? -- With Regards, Ashutosh Sharma.
On Fri, Sep 12, 2025 at 1:07 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > On Mon, Sep 8, 2025 at 9:52 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > I think we can do that, since sync_skip_reason appears to be a > > descriptive metadata rather than statistical data like > > slot_sync_skip_count and last_slot_sync_skip. However, it's also true > > that all three pieces of data are transient by nature - they will just > > be present in the runtime. > > > > After spending some more time on this, I found that maintaining > sync_skip_reason in pg_replication_slots would make the code changes a > bit messy and harder to maintain. > What exactly is your worry? It seems more logical to have sync_skip_reason in pg_replication_slots and other two in pg_stat_replication_slots as the latter is purely a stats view and the sync_skip_count/last_sync_skip suits there better. I think storing all 3 pieces of > information - sync_skip_reason, sync_skip_count, and last_sync_skip in > pg_stat_replication_slots would be a cleaner solution. This way, all > the sync-related info stays together and the code remains > straightforward. > Having all the sync information together has merit but don't you think in this case the sync_skip_reason doesn't seem to be matching with the existing columns in pg_stat_replication_slots? -- With Regards, Amit Kapila.
Hi Amit, On Fri, Sep 12, 2025 at 4:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Sep 12, 2025 at 1:07 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > On Mon, Sep 8, 2025 at 9:52 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > > > I think we can do that, since sync_skip_reason appears to be a > > > descriptive metadata rather than statistical data like > > > slot_sync_skip_count and last_slot_sync_skip. However, it's also true > > > that all three pieces of data are transient by nature - they will just > > > be present in the runtime. > > > > > > > After spending some more time on this, I found that maintaining > > sync_skip_reason in pg_replication_slots would make the code changes a > > bit messy and harder to maintain. > > > > What exactly is your worry? It seems more logical to have > sync_skip_reason in pg_replication_slots and other two in > pg_stat_replication_slots as the latter is purely a stats view and the > sync_skip_count/last_sync_skip suits there better. > The code changes for adding the skip reason to pg_replication_slots feel a bit hacky compared to the approach for incorporating all three pieces of information in pg_stat_replication_slots. I thought many might prefer simplicity over hackiness, which is why having everything in pg_stat_replication_slots could be more acceptable. That said, we could maybe prepare a POC patch with this approach as well, compare the two, and then decide which path to take. -- With Regards, Ashutosh Sharma.
Hi Amit, Ashutosh, On Fri, 12 Sept 2025 at 17:28, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > Hi Amit, > > On Fri, Sep 12, 2025 at 4:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Sep 12, 2025 at 1:07 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > > > On Mon, Sep 8, 2025 at 9:52 AM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > > > > > I think we can do that, since sync_skip_reason appears to be a > > > > descriptive metadata rather than statistical data like > > > > slot_sync_skip_count and last_slot_sync_skip. However, it's also true > > > > that all three pieces of data are transient by nature - they will just > > > > be present in the runtime. > > > > > > > > > > After spending some more time on this, I found that maintaining > > > sync_skip_reason in pg_replication_slots would make the code changes a > > > bit messy and harder to maintain. > > > > > > > What exactly is your worry? It seems more logical to have > > sync_skip_reason in pg_replication_slots and other two in > > pg_stat_replication_slots as the latter is purely a stats view and the > > sync_skip_count/last_sync_skip suits there better. > > > > The code changes for adding the skip reason to pg_replication_slots > feel a bit hacky compared to the approach for incorporating all three > pieces of information in pg_stat_replication_slots. I thought many > might prefer simplicity over hackiness, which is why having everything > in pg_stat_replication_slots could be more acceptable. That said, we > could maybe prepare a POC patch with this approach as well, compare > the two, and then decide which path to take. > Here are my thought on this: I believe that if we decide to keep skip_reason in pg_stat_replication_slot, it should mean "reason of last slot sync skip" as I think it would make more sense in the sense of statistics. And if we decide to keep skip_reason in pg_replication_slot, it would be more appropriate to keep the latest slot data ( It should display skip_reason only if the current slot sync cycle is skipped). This is my observation based on the behaviour of current columns in these views. Thoughts? I have also attached POC patches for both approaches as per discussion above. v2_approach1 : It adds all columns 'slot_sync_skip_reason', 'slot_sync_skip_count' and 'last_slot_sync_skip' to pg_stat_replication_slots v2_approach2 : It adds column 'slot_sync_skip_reason' to pg_replication_slots and columns 'slot_sync_skip_count' and 'last_slot_sync_skip' to pg_stat_replication_slots Thanks, Shlok Kyal
Вложения
Dear Shlok, Thanks for creating the patch. Personally I prefer approach2; approach1 cannot indicate the current status of synchronization, it just shows the history. I feel approach2 has more information than approach1. Below are my comments for your patch. 01. ``` + Number of times the slot sync is skipped. ``` "slot sync" should be "slot synchronization". Same thing can be saied for other attributes. 02. ``` + Reason of the last slot sync skip. ``` Possible values must be clarified. 03. ``` + s.slot_sync_skip_count, + s.last_slot_sync_skip, + s.slot_sync_skip_reason, ``` Last line has tab-blank but others have space-blank. 04. ``` +typedef enum SlotSyncSkipReason +{ + SLOT_SYNC_SKIP_NONE, /* No skip */ + SLOT_SYNC_SKIP_STANDBY_BEHIND, /* Standby is behind the remote slot */ + SLOT_SYNC_SKIP_REMOTE_BEHIND, /* Remote slot is behind the local slot */ + SLOT_SYNC_SKIP_NO_CONSISTENT_SNAPSHOT /* Standby could not reach a + * consistent snapshot */ +} SlotSyncSkipReason ``` a. Can we add comment atop the enum? b. SLOT_SYNC_SKIP_STANDBY_BEHIND is misleading; it indicates the standby server has not received WALs required by slots, but enum does not clarify it. How about SLOT_SYNC_SKIP_MISSING_WAL_RECORDS or something? Better naming are very welcome. c s/reach/build/. 05. ``` @@ -646,11 +652,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) remote_slot->name, LSN_FORMAT_ARGS(latestFlushPtr))); - return false; + /* If the slot is not present on the local */ + if (!(slot = SearchNamedReplicationSlot(remote_slot->name, true))) + return false; } ``` Looks like if users try to sync slots via SQL interfaces, the statistics cannot be updated. 06. update_slot_sync_skip_stats ``` + /* Update the slot sync reason */ + SpinLockAcquire(&slot->mutex); + slot->slot_sync_skip_reason = skip_reason; + SpinLockRelease(&slot->mutex); ``` I feel no need to update the reason if the slot->slot_sync_skip_reason and skip_reason are the same. 07. synchronize_one_slot ``` + /* + * If standby is behind remote slot and the synced slot is present on + * local. + */ + if (remote_slot->confirmed_lsn > latestFlushPtr) + { + if (synced) + update_slot_sync_skip_stats(slot, skip_reason); + return false; + } ``` This condition exist in the same function; can we combine? 08. GetSlotSyncSkipReason() Do we have to do pstrdup() here? I found a similar function get_snapbuild_state_desc(), and it does not use. 09. Can you consider a test for added codes? Best regards, Hayato Kuroda FUJITSU LIMITED
On Wed, Sep 17, 2025 at 4:24 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Shlok, > > Thanks for creating the patch. Personally I prefer approach2; approach1 cannot > indicate the current status of synchronization, it just shows the history. > I feel approach2 has more information than approach1. > I also think so but Ashutosh thought that it would be hacky. Ashutosh, did you have an opinion on this matter after seeing the patches? -- With Regards, Amit Kapila.
Hi Amit, On Wed, Sep 17, 2025 at 5:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 17, 2025 at 4:24 PM Hayato Kuroda (Fujitsu) > <kuroda.hayato@fujitsu.com> wrote: > > > > Dear Shlok, > > > > Thanks for creating the patch. Personally I prefer approach2; approach1 cannot > > indicate the current status of synchronization, it just shows the history. > > I feel approach2 has more information than approach1. > > > > I also think so but Ashutosh thought that it would be hacky. Ashutosh, > did you have an opinion on this matter after seeing the patches? > Yes, I’ve looked into both the patches. Approach 1 seems quite straightforward. In approach 2, we need to pass some additional arguments to update_local_sync_slot and update_and_persist_local_synced_slot, which makes it feel a little less clean compared to approach 1, where we simply add a new function and call it directly. That said, this is just my view on code cleanliness, I’m fine with proceeding with approach 2 if that’s considered the better option. -- With Regards, Ashutosh Sharma,
On Wed, Sep 17, 2025 at 8:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > On Wed, Sep 17, 2025 at 5:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Sep 17, 2025 at 4:24 PM Hayato Kuroda (Fujitsu) > > <kuroda.hayato@fujitsu.com> wrote: > > > > > > Dear Shlok, > > > > > > Thanks for creating the patch. Personally I prefer approach2; approach1 cannot > > > indicate the current status of synchronization, it just shows the history. > > > I feel approach2 has more information than approach1. > > > > > > > I also think so but Ashutosh thought that it would be hacky. Ashutosh, > > did you have an opinion on this matter after seeing the patches? > > > > Yes, I’ve looked into both the patches. Approach 1 seems quite > straightforward. In approach 2, we need to pass some additional > arguments to update_local_sync_slot and > update_and_persist_local_synced_slot, which makes it feel a little > less clean compared to approach 1, where we simply add a new function > and call it directly. > This is because the approach-1 doesn't show the latest value of sync_status. I mean in the latest cycle if the sync is successful, it won't update the stats which I am not sure is correct because users may want to know the recent status of sync cycle. Otherwise, the patch should be almost the same. I think we can even try to write a patch for approach-2 without an additional out parameter in some of the functions. -- With Regards, Amit Kapila.
Hi Amit, On Thu, Sep 18, 2025 at 11:31 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 17, 2025 at 8:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > On Wed, Sep 17, 2025 at 5:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Wed, Sep 17, 2025 at 4:24 PM Hayato Kuroda (Fujitsu) > > > <kuroda.hayato@fujitsu.com> wrote: > > > > > > > > Dear Shlok, > > > > > > > > Thanks for creating the patch. Personally I prefer approach2; approach1 cannot > > > > indicate the current status of synchronization, it just shows the history. > > > > I feel approach2 has more information than approach1. > > > > > > > > > > I also think so but Ashutosh thought that it would be hacky. Ashutosh, > > > did you have an opinion on this matter after seeing the patches? > > > > > > > Yes, I’ve looked into both the patches. Approach 1 seems quite > > straightforward. In approach 2, we need to pass some additional > > arguments to update_local_sync_slot and > > update_and_persist_local_synced_slot, which makes it feel a little > > less clean compared to approach 1, where we simply add a new function > > and call it directly. > > > > This is because the approach-1 doesn't show the latest value of > sync_status. I mean in the latest cycle if the sync is successful, it > won't update the stats which I am not sure is correct because users > may want to know the recent status of sync cycle. Otherwise, the patch > should be almost the same. This should be manageable, no? If we add an additional call to the stats report function immediately after ReplicationSlotPersist(), wouldn’t that address the issue? Please correct me if I’m overlooking something. @@ -600,6 +600,8 @@ update_and_persist_local_synced_slot(RemoteSlot *remote_slot, Oid remote_dbid) ReplicationSlotPersist(); + pgstat_report_replslot_sync_skip(slot, SLOT_SYNC_SKIP_NONE); + ereport(LOG, errmsg("newly created replication slot \"%s\" is sync-ready now", remote_slot->name)); In addition to this, should anyone really need to query the skip reason if pg_replication_slots already shows that the slot is synced and not temporary? Ideally, users should check the slot status in pg_replication_slots, and if it indicates the slot is persisted, there seems little value in enquiring pg_stat_replication_slots for the skip reason. That said, it’s important to ensure the information in both views remains consistent. > I think we can even try to write a patch > for approach-2 without an additional out parameter in some of the > functions. We can aim for this, if possible. -- With Regards, Ashutosh Sharma.
Hi, @@ -646,7 +670,11 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) remote_slot->name, LSN_FORMAT_ARGS(latestFlushPtr))); - return false; + /* If slot is present on the local, update the slot sync skip stats */ + if ((slot = SearchNamedReplicationSlot(remote_slot->name, true))) + skip_reason = SLOT_SYNC_SKIP_STANDBY_BEHIND; + else + return false; With this change, you’re likely enforcing sync slot creation, whereas earlier that might not have been the case. This introduces a behavioral change, which may not be well received. -- I think we can avoid passing skip_reason as a new argument to update_local_synced_slot(). It only needs to be passed to update_and_persist_local_synced_slot(). When update_local_synced_slot() is invoked from within update_and_persist_local_synced_slot(), we can simply rely on the two flags, remote_slot_precedes and found_consistent_snapshot and set the skip_reason accordingly, thoughts? If update_local_synced_slot is being called from any other place that means the slot is already persisted. -- +typedef enum SlotSyncSkipReason +{ + SLOT_SYNC_SKIP_NONE, /* No skip */ + SLOT_SYNC_SKIP_STANDBY_BEHIND, /* Standby is behind the remote slot */ + SLOT_SYNC_SKIP_REMOTE_BEHIND, /* Remote slot is behind the local slot */ + SLOT_SYNC_SKIP_NO_CONSISTENT_SNAPSHOT /* Standby could not reach a + * consistent snapshot */ +} SlotSyncSkipReason; + I would suggest shortening the enum names like maybe SS_SKIP_NONE instead of SLOT_SYNC_SKIP_NONE. -- With Regards, Ashutosh Sharma.
On Thu, 18 Sept 2025 at 13:17, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > Hi Amit, > > On Thu, Sep 18, 2025 at 11:31 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Wed, Sep 17, 2025 at 8:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > > > On Wed, Sep 17, 2025 at 5:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Wed, Sep 17, 2025 at 4:24 PM Hayato Kuroda (Fujitsu) > > > > <kuroda.hayato@fujitsu.com> wrote: > > > > > > > > > > Dear Shlok, > > > > > > > > > > Thanks for creating the patch. Personally I prefer approach2; approach1 cannot > > > > > indicate the current status of synchronization, it just shows the history. > > > > > I feel approach2 has more information than approach1. > > > > > > > > > > > > > I also think so but Ashutosh thought that it would be hacky. Ashutosh, > > > > did you have an opinion on this matter after seeing the patches? > > > > > > > > > > Yes, I’ve looked into both the patches. Approach 1 seems quite > > > straightforward. In approach 2, we need to pass some additional > > > arguments to update_local_sync_slot and > > > update_and_persist_local_synced_slot, which makes it feel a little > > > less clean compared to approach 1, where we simply add a new function > > > and call it directly. > > > > > > > This is because the approach-1 doesn't show the latest value of > > sync_status. I mean in the latest cycle if the sync is successful, it > > won't update the stats which I am not sure is correct because users > > may want to know the recent status of sync cycle. Otherwise, the patch > > should be almost the same. > > This should be manageable, no? If we add an additional call to the > stats report function immediately after ReplicationSlotPersist(), > wouldn’t that address the issue? Please correct me if I’m overlooking > something. > > @@ -600,6 +600,8 @@ update_and_persist_local_synced_slot(RemoteSlot > *remote_slot, Oid remote_dbid) > > ReplicationSlotPersist(); > > + pgstat_report_replslot_sync_skip(slot, SLOT_SYNC_SKIP_NONE); > + > ereport(LOG, > errmsg("newly created replication slot \"%s\" > is sync-ready now", > remote_slot->name)); > Currently, in this code change, the skip_reason is updated to 'none' only when the slot state changes from temporary to persistent. However, this logic does not handle cases where the slot is already a persistent sync slot. I believe we should also sync skip can happen for persistent slots. That means, on a successful slot sync, we should update the skip_reason to 'none' even for slots that are already persistent. > In addition to this, should anyone really need to query the skip > reason if pg_replication_slots already shows that the slot is synced > and not temporary? Ideally, users should check the slot status in > pg_replication_slots, and if it indicates the slot is persisted, there > seems little value in enquiring pg_stat_replication_slots for the skip > reason. That said, it’s important to ensure the information in both > views remains consistent. > I have a doubt. Why don't we want to report the sync skip reason once the slots are persisted? for the case: latestFlushPtr = GetStandbyFlushRecPtr(NULL); if (remote_slot->confirmed_lsn > latestFlushPtr) { /* * Can get here only if GUC 'synchronized_standby_slots' on the * primary server was not configured correctly. */ ereport(AmLogicalSlotSyncWorkerProcess() ? LOG : ERROR, errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), errmsg("skipping slot synchronization because the received slot sync" " LSN %X/%08X for slot \"%s\" is ahead of the standby position %X/%08X", LSN_FORMAT_ARGS(remote_slot->confirmed_lsn), remote_slot->name, LSN_FORMAT_ARGS(latestFlushPtr))); return false; } Slot sync skip can happen even for persistent slots. So why should we avoid displaying the skip reason in such cases? I checked that if the synchronized_standby_slots GUC is not set properly, we can hit this condition even for persistent slots. I think we should still display the skip reason if the user does not configure this GUC as expected. > > I think we can even try to write a patch > > for approach-2 without an additional out parameter in some of the > > functions. > > We can aim for this, if possible. Thanks, Shlok Kyal
On Mon, Sep 22, 2025 at 3:41 PM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > On Thu, 18 Sept 2025 at 13:17, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > > In addition to this, should anyone really need to query the skip > > reason if pg_replication_slots already shows that the slot is synced > > and not temporary? Ideally, users should check the slot status in > > pg_replication_slots, and if it indicates the slot is persisted, there > > seems little value in enquiring pg_stat_replication_slots for the skip > > reason. That said, it’s important to ensure the information in both > > views remains consistent. > > > I have a doubt. Why don't we want to report the sync skip reason once > the slots are persisted? > for the case: > > latestFlushPtr = GetStandbyFlushRecPtr(NULL); > if (remote_slot->confirmed_lsn > latestFlushPtr) > { > /* > * Can get here only if GUC 'synchronized_standby_slots' on the > * primary server was not configured correctly. > */ > ereport(AmLogicalSlotSyncWorkerProcess() ? LOG : ERROR, > errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), > errmsg("skipping slot synchronization because the > received slot sync" > " LSN %X/%08X for slot \"%s\" is ahead of the > standby position %X/%08X", > LSN_FORMAT_ARGS(remote_slot->confirmed_lsn), > remote_slot->name, > LSN_FORMAT_ARGS(latestFlushPtr))); > > return false; > } > > Slot sync skip can happen even for persistent slots. So why should we > avoid displaying the skip reason in such cases? > We should display the skip reason even for persistent slots and clear the same after a successful sync. -- With Regards, Amit Kapila.
On Wed, 17 Sept 2025 at 16:24, Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Shlok, > > Thanks for creating the patch. Personally I prefer approach2; approach1 cannot > indicate the current status of synchronization, it just shows the history. > I feel approach2 has more information than approach1. > I agree with your point. I think the preferred behaviour of slot_sync_skip_reason is to indicate the current status of synchronization. And taking account the current behaviour of columns of views pg_replication_slots and pg_stat_replication_slots, I think slot_sync_skip_reason is suited for pg_replication_slots view and I am proceeding ahead with approach2 > Below are my comments for your patch. > > 01. > ``` > + Number of times the slot sync is skipped. > ``` > > "slot sync" should be "slot synchronization". Same thing can be saied for other > attributes. > > 02. > ``` > + Reason of the last slot sync skip. > ``` > > Possible values must be clarified. > > 03. > ``` > + s.slot_sync_skip_count, > + s.last_slot_sync_skip, > + s.slot_sync_skip_reason, > ``` > > Last line has tab-blank but others have space-blank. > > 04. > ``` > +typedef enum SlotSyncSkipReason > +{ > + SLOT_SYNC_SKIP_NONE, /* No skip */ > + SLOT_SYNC_SKIP_STANDBY_BEHIND, /* Standby is behind the remote slot */ > + SLOT_SYNC_SKIP_REMOTE_BEHIND, /* Remote slot is behind the local slot */ > + SLOT_SYNC_SKIP_NO_CONSISTENT_SNAPSHOT /* Standby could not reach a > + * consistent snapshot */ > +} SlotSyncSkipReason > ``` > > a. > Can we add comment atop the enum? > > b. > SLOT_SYNC_SKIP_STANDBY_BEHIND is misleading; it indicates the standby server has > not received WALs required by slots, but enum does not clarify it. > How about SLOT_SYNC_SKIP_MISSING_WAL_RECORDS or something? Better naming are very > welcome. > > c > s/reach/build/. > > 05. > ``` > @@ -646,11 +652,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) > remote_slot->name, > LSN_FORMAT_ARGS(latestFlushPtr))); > > - return false; > + /* If the slot is not present on the local */ > + if (!(slot = SearchNamedReplicationSlot(remote_slot->name, true))) > + return false; > } > ``` > > Looks like if users try to sync slots via SQL interfaces, the statistics cannot > be updated. > > 06. update_slot_sync_skip_stats > ``` > + /* Update the slot sync reason */ > + SpinLockAcquire(&slot->mutex); > + slot->slot_sync_skip_reason = skip_reason; > + SpinLockRelease(&slot->mutex); > ``` > > I feel no need to update the reason if the slot->slot_sync_skip_reason > and skip_reason are the same. > > 07. synchronize_one_slot > ``` > + /* > + * If standby is behind remote slot and the synced slot is present on > + * local. > + */ > + if (remote_slot->confirmed_lsn > latestFlushPtr) > + { > + if (synced) > + update_slot_sync_skip_stats(slot, skip_reason); > + return false; > + } > ``` > > This condition exist in the same function; can we combine? > > 08. GetSlotSyncSkipReason() > > Do we have to do pstrdup() here? I found a similar function get_snapbuild_state_desc(), > and it does not use. > I check similar functions and I think we can remove pstrup. Removed in the latest patch. > 09. > > Can you consider a test for added codes? Added test in 0002 patch I have also addressed remaining comments and attached the latest version of patch. Thanks, Shlok Kyal
Вложения
On Thu, 18 Sept 2025 at 11:31, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Sep 17, 2025 at 8:19 PM Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > > > On Wed, Sep 17, 2025 at 5:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Wed, Sep 17, 2025 at 4:24 PM Hayato Kuroda (Fujitsu) > > > <kuroda.hayato@fujitsu.com> wrote: > > > > > > > > Dear Shlok, > > > > > > > > Thanks for creating the patch. Personally I prefer approach2; approach1 cannot > > > > indicate the current status of synchronization, it just shows the history. > > > > I feel approach2 has more information than approach1. > > > > > > > > > > I also think so but Ashutosh thought that it would be hacky. Ashutosh, > > > did you have an opinion on this matter after seeing the patches? > > > > > > > Yes, I’ve looked into both the patches. Approach 1 seems quite > > straightforward. In approach 2, we need to pass some additional > > arguments to update_local_sync_slot and > > update_and_persist_local_synced_slot, which makes it feel a little > > less clean compared to approach 1, where we simply add a new function > > and call it directly. > > > > This is because the approach-1 doesn't show the latest value of > sync_status. I mean in the latest cycle if the sync is successful, it > won't update the stats which I am not sure is correct because users > may want to know the recent status of sync cycle. Otherwise, the patch > should be almost the same. I think we can even try to write a patch > for approach-2 without an additional out parameter in some of the > functions. > Hi Amit, I have written a patch which removes passing this extra parameter to the functions. I have attached the latest patch in [1]. [1]: https://www.postgresql.org/message-id/CANhcyEVFZN2Mkjs0QHshKm2_3AkQ0eufjkD12eL2MeuVkPyGbw%40mail.gmail.com Thanks, Shlok Kyal
Hi Ashutosh, Thanks for reviewing the patch. On Mon, 22 Sept 2025 at 10:59, Ashutosh Sharma <ashu.coek88@gmail.com> wrote: > > Hi, > > @@ -646,7 +670,11 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid > remote_dbid) > remote_slot->name, > LSN_FORMAT_ARGS(latestFlushPtr))); > > - return false; > + /* If slot is present on the local, update the slot sync skip stats */ > + if ((slot = SearchNamedReplicationSlot(remote_slot->name, true))) > + skip_reason = SLOT_SYNC_SKIP_STANDBY_BEHIND; > + else > + return false; > > With this change, you’re likely enforcing sync slot creation, whereas > earlier that might not have been the case. This introduces a > behavioral change, which may not be well received. > > -- I have fixed it in latest version of patch > > I think we can avoid passing skip_reason as a new argument to > update_local_synced_slot(). It only needs to be passed to > update_and_persist_local_synced_slot(). When > update_local_synced_slot() is invoked from within > update_and_persist_local_synced_slot(), we can simply rely on the two > flags, remote_slot_precedes and found_consistent_snapshot and set the > skip_reason accordingly, thoughts? > > If update_local_synced_slot is being called from any other place that > means the slot is already persisted. > I came up with a solution on similar lines as above in the attached patch. And also as per Amit's comment in [1], I have kept the behaviour to show skip reason even for persistent slots. > -- > > +typedef enum SlotSyncSkipReason > +{ > + SLOT_SYNC_SKIP_NONE, /* No skip */ > + SLOT_SYNC_SKIP_STANDBY_BEHIND, /* Standby is behind the remote slot */ > + SLOT_SYNC_SKIP_REMOTE_BEHIND, /* Remote slot is behind the local slot */ > + SLOT_SYNC_SKIP_NO_CONSISTENT_SNAPSHOT /* Standby could not reach a > + * consistent snapshot */ > +} SlotSyncSkipReason; > + > > I would suggest shortening the enum names like maybe SS_SKIP_NONE > instead of SLOT_SYNC_SKIP_NONE. > Fixed it. I have attached the latest patch [2]. [1]: https://www.postgresql.org/message-id/CAA4eK1%2B6UsZiN0GnRNue_Vs8007jdcDFetNq%2BapubHcrqzjwpQ%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CANhcyEVFZN2Mkjs0QHshKm2_3AkQ0eufjkD12eL2MeuVkPyGbw%40mail.gmail.com Thanks, Shlok Kyal
On Wed, 24 Sept 2025 at 16:39, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > On Wed, 17 Sept 2025 at 16:24, Hayato Kuroda (Fujitsu) > <kuroda.hayato@fujitsu.com> wrote: > > > > Dear Shlok, > > > > Thanks for creating the patch. Personally I prefer approach2; approach1 cannot > > indicate the current status of synchronization, it just shows the history. > > I feel approach2 has more information than approach1. > > > I agree with your point. I think the preferred behaviour of > slot_sync_skip_reason is to indicate the current status of > synchronization. And taking account the current behaviour of columns > of views pg_replication_slots and pg_stat_replication_slots, I think > slot_sync_skip_reason is suited for pg_replication_slots view and > I am proceeding ahead with approach2 > > > Below are my comments for your patch. > > > > 01. > > ``` > > + Number of times the slot sync is skipped. > > ``` > > > > "slot sync" should be "slot synchronization". Same thing can be saied for other > > attributes. > > > > 02. > > ``` > > + Reason of the last slot sync skip. > > ``` > > > > Possible values must be clarified. > > > > 03. > > ``` > > + s.slot_sync_skip_count, > > + s.last_slot_sync_skip, > > + s.slot_sync_skip_reason, > > ``` > > > > Last line has tab-blank but others have space-blank. > > > > 04. > > ``` > > +typedef enum SlotSyncSkipReason > > +{ > > + SLOT_SYNC_SKIP_NONE, /* No skip */ > > + SLOT_SYNC_SKIP_STANDBY_BEHIND, /* Standby is behind the remote slot */ > > + SLOT_SYNC_SKIP_REMOTE_BEHIND, /* Remote slot is behind the local slot */ > > + SLOT_SYNC_SKIP_NO_CONSISTENT_SNAPSHOT /* Standby could not reach a > > + * consistent snapshot */ > > +} SlotSyncSkipReason > > ``` > > > > a. > > Can we add comment atop the enum? > > > > b. > > SLOT_SYNC_SKIP_STANDBY_BEHIND is misleading; it indicates the standby server has > > not received WALs required by slots, but enum does not clarify it. > > How about SLOT_SYNC_SKIP_MISSING_WAL_RECORDS or something? Better naming are very > > welcome. > > > > c > > s/reach/build/. > > > > 05. > > ``` > > @@ -646,11 +652,13 @@ synchronize_one_slot(RemoteSlot *remote_slot, Oid remote_dbid) > > remote_slot->name, > > LSN_FORMAT_ARGS(latestFlushPtr))); > > > > - return false; > > + /* If the slot is not present on the local */ > > + if (!(slot = SearchNamedReplicationSlot(remote_slot->name, true))) > > + return false; > > } > > ``` > > > > Looks like if users try to sync slots via SQL interfaces, the statistics cannot > > be updated. > > > > 06. update_slot_sync_skip_stats > > ``` > > + /* Update the slot sync reason */ > > + SpinLockAcquire(&slot->mutex); > > + slot->slot_sync_skip_reason = skip_reason; > > + SpinLockRelease(&slot->mutex); > > ``` > > > > I feel no need to update the reason if the slot->slot_sync_skip_reason > > and skip_reason are the same. > > > > 07. synchronize_one_slot > > ``` > > + /* > > + * If standby is behind remote slot and the synced slot is present on > > + * local. > > + */ > > + if (remote_slot->confirmed_lsn > latestFlushPtr) > > + { > > + if (synced) > > + update_slot_sync_skip_stats(slot, skip_reason); > > + return false; > > + } > > ``` > > > > This condition exist in the same function; can we combine? > > > > 08. GetSlotSyncSkipReason() > > > > Do we have to do pstrdup() here? I found a similar function get_snapbuild_state_desc(), > > and it does not use. > > > I check similar functions and I think we can remove pstrup. Removed in > the latest patch. > > > 09. > > > > Can you consider a test for added codes? > Added test in 0002 patch > > I have also addressed remaining comments and attached the latest > version of patch. > The CF Bot was failing as meson.build was not formatted appropriately. I have fixed it and made some cosmetic changes in the test file. I ran the CF tests on my local repository and it is passing all tests. I have attached the updated version. Thanks, Shlok Kyal
Вложения
Dear Shlok, Thanks for updating the patch. Here are my comments. 01. ``` + /* Update the slot sync reason */ + SpinLockAcquire(&slot->mutex); + if (slot->slot_sync_skip_reason != skip_reason) + slot->slot_sync_skip_reason = skip_reason; + SpinLockRelease(&slot->mutex); ``` Per my understanding, spinlock is acquired when the attribute on the shared memory is updating. Can you check other parts and follow the rukle? 02. ``` + SpinLockAcquire(&slot->mutex); + synced = slot->data.synced; + SpinLockRelease(&slot->mutex); ``` Same as 1. 03. ``` ``` 04. ``` +#ifdef USE_INJECTION_POINTS + INJECTION_POINT("slot-sync-skip", NULL); +#endif ``` No need do do #ifdef here. INJECTION_POINT() itself checks internally. 05. ``` +# Initialize the primary cluster +my $primary = PostgreSQL::Test::Cluster->new('publisher'); +$primary->init( + allows_streaming => 'logical', + auth_extra => [ '--create-role' => 'repl_role' ]); ``` Do we have to create repl_role? I'm not sure where it's used. 06. ``` +$primary->append_conf( + 'postgresql.conf', qq{ +autovacuum = off +max_prepared_transactions = 1 +}); ``` Do we have to set max_prepared_transactions? PREPARE command is not used. 07. ``` +# Load the injection_points extension +$primary->safe_psql('postgres', q(CREATE EXTENSION injection_points)) ``` We must check whether the injection_points is available or not. See 047_checkpoint_physical_slot.pl. 08. ``` +# Initialize standby from primary backup +my $standby1 = PostgreSQL::Test::Cluster->new('standby1'); +$standby1->init_from_backup( + $primary, $backup_name, + has_streaming => 1, + has_restoring => 1); ``` To clarify, is there a reason why we set has_restoring? Can we remove it? 09. ``` +my $connstr_1 = $primary->connstr; ``` Since this is an only connection string in the test, suffix _1 is not needed. 10. ``` +# Simulate standby connection failure by modifying pg_hba.conf +unlink($primary->data_dir . '/pg_hba.conf'); +$primary->append_conf('pg_hba.conf', + qq{local all all trust} +); ``` What if the system does not have Unix domain socket? I'm afraid all connections could be brocked in this case. 11. ``` +# Attempt to sync replication slots while standby is behind +$standby1->psql('postgres', "SELECT pg_sync_replication_slots();"); ``` When the command can be failed, can you set a error message from the command to a variable? Otherwise an ERROR is output to the result file, which is suprising. ``` psql:<stdin>:1: ERROR: skipping slot synchronization because the received slot sync LSN 0/03019E58 for slot "slot_sync"is ahead of the standby position 0/030000D8 [17:23:21.495](0.384s) ok 2 - slot sync skip when standby is behind ``` 12. ``` +# Restore connectivity between primary and standby +unlink($primary->data_dir . '/pg_hba.conf'); +$primary->append_conf( + 'pg_hba.conf', + qq{ +local all all trust +local replication all trust +}); ``` Same as 10. Also, no need to do unlink() here. 13. ``` +# Create a new logical slot for testing injection point +$primary->safe_psql('postgres', + "SELECT pg_create_logical_replication_slot('slot_sync', 'test_decoding', false, false, true)" +); ``` Before here can you add a description what you would test? 14. ``` # Create a physical replication slot on primary for standby $primary->psql('postgres', q{SELECT pg_create_physical_replication_slot('sb1_slot');}); ``` Use safe_psql instead of psql(). 15. ``` - if ((slot = SearchNamedReplicationSlot(remote_slot->name, true))) + if (slot || (slot = SearchNamedReplicationSlot(remote_slot->name, true))) ``` Is there a possibility that slot is not NULL? It is used when confirmed_flush exceeds the latestFlushPtr, but in this case the function cannot reach here. Best regards, Hayato Kuroda FUJITSU LIMITED
Dear Shlok, > Thanks for updating the patch. Here are my comments. I found one more comment. ``` + /* + * If found_consistent_snapshot is not NULL and a consistent snapshot is + * found set the slot sync skip reason to none. Else, if consistent + * snapshot is not found the stats will be updated in the function + * update_and_persist_local_synced_slot + */ + if (!found_consistent_snapshot || *found_consistent_snapshot) + update_slot_sync_skip_stats(slot, SS_SKIP_NONE); ``` I think the condition is confusing; in code level there is a path that found_consistent_snapshot is NULL but synchronization happened. (Not sure it is possible though). I think it is better to put update_slot_sync_skip_stats() near the sync part. If the snapshot exists from the beginning, it can be done unconditionally, otherwise we can check again. Attached .diffs file implements it. Best regards, Hayato Kuroda FUJITSU LIMITED