Thread: Add WALRCV_CONNECTING state to walreceiver

Add WALRCV_CONNECTING state to walreceiver

From: Xuneng Zhou
Date:
Hi Hackers,

Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
does not accurately reflect streaming health.  In that discussion,
Noah noted that even before the reported regression, status =
'streaming' was unreliable because walreceiver sets it during early
startup, before attempting a connection. He suggested:

"Long-term, in master only, perhaps we should introduce another status
like 'connecting'. Perhaps enact the connecting->streaming status
transition just before tendering the first byte of streamed WAL to the
startup process. Alternatively, enact that transition when the startup
process accepts the
first streamed byte."

Michael and I also thought this could be a useful addition. This patch
implements that suggestion by adding a new WALRCV_CONNECTING state.

== Background ==
Currently, walreceiver transitions directly from STARTING to STREAMING
early in WalReceiverMain(), before any WAL data has been received.
This means status = 'streaming' can be observed even when:

- The connection to the primary has not been established
- No WAL data has actually been received or flushed

This makes it difficult for monitoring tools to distinguish between a
healthy streaming replica and one that is merely attempting to stream.

== Proposal ==

Introduce WALRCV_CONNECTING as an intermediate state between STARTING
and STREAMING:

- When walreceiver starts, it enters CONNECTING (instead of going
directly to STREAMING).
- The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
existing spinlock-protected block that updates flushedUpto.
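
As a sketch, the two transitions above can be modeled standalone (the enum values mirror the real WalRcvState in walreceiver.h with the proposed addition; the helper functions are invented purely for illustration):

```c
/*
 * Standalone sketch of the proposed state machine.  The enum values mirror
 * WalRcvState (plus the proposed WALRCV_CONNECTING); the helper functions
 * are hypothetical and only mark where the transitions would happen.
 */
#include <assert.h>

typedef enum
{
    WALRCV_STOPPED,
    WALRCV_STARTING,
    WALRCV_CONNECTING,          /* proposed: started, no WAL flushed yet */
    WALRCV_STREAMING,
    WALRCV_WAITING,
    WALRCV_RESTARTING,
    WALRCV_STOPPING
} WalRcvState;

/* Early in WalReceiverMain(): STARTING now leads to CONNECTING */
static WalRcvState
on_main_entry(WalRcvState s)
{
    return (s == WALRCV_STARTING) ? WALRCV_CONNECTING : s;
}

/* In XLogWalRcvFlush(), under the spinlock that updates flushedUpto */
static WalRcvState
on_first_flush(WalRcvState s)
{
    return (s == WALRCV_CONNECTING) ? WALRCV_STREAMING : s;
}
```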

Feedback welcome.

[1] https://www.postgresql.org/message-id/flat/19093-c4fff49a608f82a0%40postgresql.org

--
Best,
Xuneng

Attachments

Re: Add WALRCV_CONNECTING state to walreceiver

From: Noah Misch
Date:
On Fri, Dec 12, 2025 at 12:51:00PM +0800, Xuneng Zhou wrote:
> Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
> does not accurately reflect streaming health.  In that discussion,
> Noah noted that even before the reported regression, status =
> 'streaming' was unreliable because walreceiver sets it during early
> startup, before attempting a connection. He suggested:
> 
> "Long-term, in master only, perhaps we should introduce another status
> like 'connecting'. Perhaps enact the connecting->streaming status
> transition just before tendering the first byte of streamed WAL to the
> startup process. Alternatively, enact that transition when the startup
> process accepts the
> first streamed byte."

> == Proposal ==
> 
> Introduce WALRCV_CONNECTING as an intermediate state between STARTING
> and STREAMING:
> 
> - When walreceiver starts, it enters CONNECTING (instead of going
> directly to STREAMING).
> - The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
> existing spinlock-protected block that updates flushedUpto.

I think this has the drawback that if the primary's WAL is incompatible,
e.g. unacceptable timeline, the walreceiver will still briefly enter
STREAMING.  That could trick monitoring.  Waiting for applyPtr to advance
would avoid the short-lived STREAMING.  What's the feasibility of that?



Re: Add WALRCV_CONNECTING state to walreceiver

From: Xuneng Zhou
Date:
Hi Noah,

On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Fri, Dec 12, 2025 at 12:51:00PM +0800, Xuneng Zhou wrote:
> > Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
> > does not accurately reflect streaming health.  In that discussion,
> > Noah noted that even before the reported regression, status =
> > 'streaming' was unreliable because walreceiver sets it during early
> > startup, before attempting a connection. He suggested:
> >
> > "Long-term, in master only, perhaps we should introduce another status
> > like 'connecting'. Perhaps enact the connecting->streaming status
> > transition just before tendering the first byte of streamed WAL to the
> > startup process. Alternatively, enact that transition when the startup
> > process accepts the
> > first streamed byte."
>
> > == Proposal ==
> >
> > Introduce WALRCV_CONNECTING as an intermediate state between STARTING
> > and STREAMING:
> >
> > - When walreceiver starts, it enters CONNECTING (instead of going
> > directly to STREAMING).
> > - The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
> > existing spinlock-protected block that updates flushedUpto.
>
> I think this has the drawback that if the primary's WAL is incompatible,
> e.g. unacceptable timeline, the walreceiver will still briefly enter
> STREAMING.  That could trick monitoring.

Thanks for pointing this out.

> Waiting for applyPtr to advance
> would avoid the short-lived STREAMING.  What's the feasibility of that?

I think this could work, but with complications. If replay latency is
high or replay is paused with pg_wal_replay_pause, the WalReceiver
would stay in the CONNECTING state longer than expected. Whether this
is ok depends on the definition of the 'connecting' state. For the
implementation, deciding where and when to check applyPtr against LSNs
like receiveStart is more difficult—the WalReceiver doesn't know when
applyPtr advances. While the WalReceiver can read applyPtr from shared
memory, it isn't automatically notified when that pointer advances.
This leads to latency between checking and replay if this is done in
the WalReceiver part unless we let the startup process set the state,
which would couple the two components. Am I missing something here?

--
Best,
Xuneng



Re: Add WALRCV_CONNECTING state to walreceiver

From: Xuneng Zhou
Date:
Hi,

On Fri, Dec 12, 2025 at 4:45 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi Noah,
>
> On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
> >
> > On Fri, Dec 12, 2025 at 12:51:00PM +0800, Xuneng Zhou wrote:
> > > Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
> > > does not accurately reflect streaming health.  In that discussion,
> > > Noah noted that even before the reported regression, status =
> > > 'streaming' was unreliable because walreceiver sets it during early
> > > startup, before attempting a connection. He suggested:
> > >
> > > "Long-term, in master only, perhaps we should introduce another status
> > > like 'connecting'. Perhaps enact the connecting->streaming status
> > > transition just before tendering the first byte of streamed WAL to the
> > > startup process. Alternatively, enact that transition when the startup
> > > process accepts the
> > > first streamed byte."
> >
> > > == Proposal ==
> > >
> > > Introduce WALRCV_CONNECTING as an intermediate state between STARTING
> > > and STREAMING:
> > >
> > > - When walreceiver starts, it enters CONNECTING (instead of going
> > > directly to STREAMING).
> > > - The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
> > > existing spinlock-protected block that updates flushedUpto.
> >
> > I think this has the drawback that if the primary's WAL is incompatible,
> > e.g. unacceptable timeline, the walreceiver will still briefly enter
> > STREAMING.  That could trick monitoring.
>
> Thanks for pointing this out.
>
> > Waiting for applyPtr to advance
> > would avoid the short-lived STREAMING.  What's the feasibility of that?
>
> I think this could work, but with complications. If replay latency is
> high or replay is paused with pg_wal_replay_pause, the WalReceiver
> would stay in the CONNECTING state longer than expected. Whether this
> is ok depends on the definition of the 'connecting' state. For the
> implementation, deciding where and when to check applyPtr against LSNs
> like receiveStart is more difficult—the WalReceiver doesn't know when
> applyPtr advances. While the WalReceiver can read applyPtr from shared
> memory, it isn't automatically notified when that pointer advances.
> This leads to latency between checking and replay if this is done in
> the WalReceiver part unless we let the startup process set the state,
> which would couple the two components. Am I missing something here?
>

After some thoughts, a potential approach could be to expose a new
function in the WAL receiver that transitions the state from
CONNECTING to STREAMING. This function can then be invoked directly
from WaitForWALToBecomeAvailable in the startup process, ensuring the
state change aligns with the actual acceptance of the WAL stream.
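
A minimal model of that idea, with invented names (the real version would read and write WalRcv->walRcvState under the walrcv spinlock; here shared state is a plain struct so the sketch stays self-contained):

```c
/*
 * Toy model of a startup-process-driven CONNECTING -> STREAMING transition.
 * Locking and the real shared-memory struct are omitted for brevity.
 */
#include <assert.h>
#include <stdbool.h>

typedef enum { CONNECTING, STREAMING } RcvState;

typedef struct { RcvState state; } SharedWalRcv;

/*
 * Idempotent: safe to call on every pass through the startup process's
 * WaitForWALToBecomeAvailable() loop; only the first call changes state.
 */
static bool
WalRcvSetStreamingModel(SharedWalRcv *rcv)
{
    if (rcv->state == CONNECTING)
    {
        rcv->state = STREAMING;
        return true;            /* transition happened */
    }
    return false;
}
```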

--
Best,
Xuneng



Re: Add WALRCV_CONNECTING state to walreceiver

From: Xuneng Zhou
Date:
Hi,

On Fri, Dec 12, 2025 at 9:52 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
>
> On Fri, Dec 12, 2025 at 4:45 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> >
> > Hi Noah,
> >
> > On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
> > >
> > > On Fri, Dec 12, 2025 at 12:51:00PM +0800, Xuneng Zhou wrote:
> > > > Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
> > > > does not accurately reflect streaming health.  In that discussion,
> > > > Noah noted that even before the reported regression, status =
> > > > 'streaming' was unreliable because walreceiver sets it during early
> > > > startup, before attempting a connection. He suggested:
> > > >
> > > > "Long-term, in master only, perhaps we should introduce another status
> > > > like 'connecting'. Perhaps enact the connecting->streaming status
> > > > transition just before tendering the first byte of streamed WAL to the
> > > > startup process. Alternatively, enact that transition when the startup
> > > > process accepts the
> > > > first streamed byte."
> > >
> > > > == Proposal ==
> > > >
> > > > Introduce WALRCV_CONNECTING as an intermediate state between STARTING
> > > > and STREAMING:
> > > >
> > > > - When walreceiver starts, it enters CONNECTING (instead of going
> > > > directly to STREAMING).
> > > > - The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
> > > > existing spinlock-protected block that updates flushedUpto.
> > >
> > > I think this has the drawback that if the primary's WAL is incompatible,
> > > e.g. unacceptable timeline, the walreceiver will still briefly enter
> > > STREAMING.  That could trick monitoring.
> >
> > Thanks for pointing this out.
> >
> > > Waiting for applyPtr to advance
> > > would avoid the short-lived STREAMING.  What's the feasibility of that?
> >
> > I think this could work, but with complications. If replay latency is
> > high or replay is paused with pg_wal_replay_pause, the WalReceiver
> > would stay in the CONNECTING state longer than expected. Whether this
> > is ok depends on the definition of the 'connecting' state. For the
> > implementation, deciding where and when to check applyPtr against LSNs
> > like receiveStart is more difficult—the WalReceiver doesn't know when
> > applyPtr advances. While the WalReceiver can read applyPtr from shared
> > memory, it isn't automatically notified when that pointer advances.
> > This leads to latency between checking and replay if this is done in
> > the WalReceiver part unless we let the startup process set the state,
> > which would couple the two components. Am I missing something here?
> >
>
> After some thoughts, a potential approach could be to expose a new
> function in the WAL receiver that transitions the state from
> CONNECTING to STREAMING. This function can then be invoked directly
> from WaitForWALToBecomeAvailable in the startup process, ensuring the
> state change aligns with the actual acceptance of the WAL stream.
>

V2 makes the transition from WALRCV_CONNECTING to STREAMING only when
the first valid WAL record is processed by the startup process. A new
function WalRcvSetStreaming is introduced to enable the transition.

--
Best,
Xuneng

Attachments

Re: Add WALRCV_CONNECTING state to walreceiver

From: Noah Misch
Date:
On Sun, Dec 14, 2025 at 12:45:46PM +0800, Xuneng Zhou wrote:
> On Fri, Dec 12, 2025 at 9:52 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > On Fri, Dec 12, 2025 at 4:45 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
> > > > Waiting for applyPtr to advance
> > > > would avoid the short-lived STREAMING.  What's the feasibility of that?
> > >
> > > I think this could work, but with complications. If replay latency is
> > > high or replay is paused with pg_wal_replay_pause, the WalReceiver
> > > would stay in the CONNECTING state longer than expected. Whether this
> > > is ok depends on the definition of the 'connecting' state. For the
> > > implementation, deciding where and when to check applyPtr against LSNs
> > > like receiveStart is more difficult—the WalReceiver doesn't know when
> > > applyPtr advances. While the WalReceiver can read applyPtr from shared
> > > memory, it isn't automatically notified when that pointer advances.
> > > This leads to latency between checking and replay if this is done in
> > > the WalReceiver part unless we let the startup process set the state,
> > > which would couple the two components. Am I missing something here?
> >
> > After some thoughts, a potential approach could be to expose a new
> > function in the WAL receiver that transitions the state from
> > CONNECTING to STREAMING. This function can then be invoked directly
> > from WaitForWALToBecomeAvailable in the startup process, ensuring the
> > state change aligns with the actual acceptance of the WAL stream.
> 
> V2 makes the transition from WALRCV_CONNECTING to STREAMING only when
> the first valid WAL record is processed by the startup process. A new
> function WalRcvSetStreaming is introduced to enable the transition.

The original patch set STREAMING in XLogWalRcvFlush().  XLogWalRcvFlush()
callee XLogWalRcvSendReply() already fetches applyPtr to send a status
message.  So I would try the following before involving the startup process
like v2 does:

1. store the applyPtr when we enter CONNECTING
2. force a status message as long as we remain in CONNECTING
3. become STREAMING when applyPtr differs from the one stored at (1)
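
The three steps can be sketched in self-contained C (struct and function names invented; XLogRecPtr reduced to an integer):

```c
/*
 * Sketch of the applyPtr-polling scheme: remember applyPtr on entering
 * CONNECTING (step 1), then promote to STREAMING once a later status
 * message sees it advance (step 3).  Step 2, forcing the status messages,
 * is left to the caller's wakeup logic.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

typedef struct
{
    XLogRecPtr  connectApplyPtr;    /* applyPtr stored at CONNECTING entry */
    bool        streaming;
} RcvModel;

static void
enter_connecting(RcvModel *m, XLogRecPtr applyPtr)
{
    m->connectApplyPtr = applyPtr;
    m->streaming = false;
}

/* Run wherever XLogWalRcvSendReply() fetches applyPtr */
static void
on_status_message(RcvModel *m, XLogRecPtr applyPtr)
{
    if (!m->streaming && applyPtr != m->connectApplyPtr)
        m->streaming = true;
}
```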

A possible issue with all patch versions: when the primary is writing no WAL
and the standby was caught up before this walreceiver started, CONNECTING
could persist for an unbounded amount of time.  Only actual primary WAL
generation would move the walreceiver to STREAMING.  This relates to your
above point about high latency.  If that's a concern, perhaps this change
deserves a total of two new states, CONNECTING and a state that represents
"connection exists, no WAL yet applied"?



Re: Add WALRCV_CONNECTING state to walreceiver

From: Xuneng Zhou
Date:
Hi,

On Sun, Dec 14, 2025 at 1:14 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Sun, Dec 14, 2025 at 12:45:46PM +0800, Xuneng Zhou wrote:
> > On Fri, Dec 12, 2025 at 9:52 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > On Fri, Dec 12, 2025 at 4:45 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > > On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > Waiting for applyPtr to advance
> > > > > would avoid the short-lived STREAMING.  What's the feasibility of that?
> > > >
> > > > I think this could work, but with complications. If replay latency is
> > > > high or replay is paused with pg_wal_replay_pause, the WalReceiver
> > > > would stay in the CONNECTING state longer than expected. Whether this
> > > > is ok depends on the definition of the 'connecting' state. For the
> > > > implementation, deciding where and when to check applyPtr against LSNs
> > > > like receiveStart is more difficult—the WalReceiver doesn't know when
> > > > applyPtr advances. While the WalReceiver can read applyPtr from shared
> > > > memory, it isn't automatically notified when that pointer advances.
> > > > This leads to latency between checking and replay if this is done in
> > > > the WalReceiver part unless we let the startup process set the state,
> > > > which would couple the two components. Am I missing something here?
> > >
> > > After some thoughts, a potential approach could be to expose a new
> > > function in the WAL receiver that transitions the state from
> > > CONNECTING to STREAMING. This function can then be invoked directly
> > > from WaitForWALToBecomeAvailable in the startup process, ensuring the
> > > state change aligns with the actual acceptance of the WAL stream.
> >
> > V2 makes the transition from WALRCV_CONNECTING to STREAMING only when
> > the first valid WAL record is processed by the startup process. A new
> > function WalRcvSetStreaming is introduced to enable the transition.
>
> The original patch set STREAMING in XLogWalRcvFlush().  XLogWalRcvFlush()
> callee XLogWalRcvSendReply() already fetches applyPtr to send a status
> message.  So I would try the following before involving the startup process
> like v2 does:
>
> 1. store the applyPtr when we enter CONNECTING
> 2. force a status message as long as we remain in CONNECTING
> 3. become STREAMING when applyPtr differs from the one stored at (1)

Thanks for the suggestion. Using XLogWalRcvSendReply() for the
transition could make sense. My concern before is about latency in a
rare case: if the first flush completes but applyPtr hasn't advanced
yet at the time of check and then the flush stalls after that, we
might wait up to wal_receiver_status_interval (default 10s) before the
next check or indefinitely if (wal_receiver_status_interval <= 0).
This could be mitigated by shortening the wakeup interval while in
CONNECTING (step 2), which reduces worst-case latency to ~1 second.
Given that monitoring typically doesn't require sub-second precision,
this approach could be feasible.

case WALRCV_WAKEUP_REPLY:
    if (WalRcv->walRcvState == WALRCV_CONNECTING)
    {
        /* Poll frequently while CONNECTING to avoid long latency */
        wakeup[reason] = TimestampTzPlusMilliseconds(now, 1000);
    }

> A possible issue with all patch versions: when the primary is writing no WAL
> and the standby was caught up before this walreceiver started, CONNECTING
> could persist for an unbounded amount of time.  Only actual primary WAL
> generation would move the walreceiver to STREAMING.  This relates to your
> above point about high latency.  If that's a concern, perhaps this change
> deserves a total of two new states, CONNECTING and a state that represents
> "connection exists, no WAL yet applied"?

Yes, this could be an issue. Using two states would help address it.
That said, when the primary is idle in this case, we might end up
repeatedly polling the apply status in the state before streaming if
we implement the 1s short-interval checking like above, which could be
costly. However, if we do not implement it,
wal_receiver_status_interval is set to <= 0, and the flush stalls, the
walreceiver could stay in the pre-streaming state indefinitely even if
streaming did occur, which violates the semantics. Do you think this
is a valid concern or just an artificial edge case?

--
Best,
Xuneng



Re: Add WALRCV_CONNECTING state to walreceiver

From: Xuneng Zhou
Date:
Hi,


On Sun, Dec 14, 2025 at 4:55 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
>
> On Sun, Dec 14, 2025 at 1:14 PM Noah Misch <noah@leadboat.com> wrote:
> >
> > On Sun, Dec 14, 2025 at 12:45:46PM +0800, Xuneng Zhou wrote:
> > > On Fri, Dec 12, 2025 at 9:52 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > > On Fri, Dec 12, 2025 at 4:45 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > > > On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > > Waiting for applyPtr to advance
> > > > > > would avoid the short-lived STREAMING.  What's the feasibility of that?
> > > > >
> > > > > I think this could work, but with complications. If replay latency is
> > > > > high or replay is paused with pg_wal_replay_pause, the WalReceiver
> > > > > would stay in the CONNECTING state longer than expected. Whether this
> > > > > is ok depends on the definition of the 'connecting' state. For the
> > > > > implementation, deciding where and when to check applyPtr against LSNs
> > > > > like receiveStart is more difficult—the WalReceiver doesn't know when
> > > > > applyPtr advances. While the WalReceiver can read applyPtr from shared
> > > > > memory, it isn't automatically notified when that pointer advances.
> > > > > This leads to latency between checking and replay if this is done in
> > > > > the WalReceiver part unless we let the startup process set the state,
> > > > > which would couple the two components. Am I missing something here?
> > > >
> > > > After some thoughts, a potential approach could be to expose a new
> > > > function in the WAL receiver that transitions the state from
> > > > CONNECTING to STREAMING. This function can then be invoked directly
> > > > from WaitForWALToBecomeAvailable in the startup process, ensuring the
> > > > state change aligns with the actual acceptance of the WAL stream.
> > >
> > > V2 makes the transition from WALRCV_CONNECTING to STREAMING only when
> > > the first valid WAL record is processed by the startup process. A new
> > > function WalRcvSetStreaming is introduced to enable the transition.
> >
> > The original patch set STREAMING in XLogWalRcvFlush().  XLogWalRcvFlush()
> > callee XLogWalRcvSendReply() already fetches applyPtr to send a status
> > message.  So I would try the following before involving the startup process
> > like v2 does:
> >
> > 1. store the applyPtr when we enter CONNECTING
> > 2. force a status message as long as we remain in CONNECTING
> > 3. become STREAMING when applyPtr differs from the one stored at (1)
>
> Thanks for the suggestion. Using XLogWalRcvSendReply() for the
> transition could make sense. My concern before is about latency in a
> rare case: if the first flush completes but applyPtr hasn't advanced
> yet at the time of check and then the flush stalls after that, we
> might wait up to wal_receiver_status_interval (default 10s) before the
> next check or indefinitely if (wal_receiver_status_interval <= 0).
> This could be mitigated by shortening the wakeup interval while in
> CONNECTING (step 2), which reduces worst-case latency to ~1 second.
> Given that monitoring typically doesn't require sub-second precision,
> this approach could be feasible.
>
> case WALRCV_WAKEUP_REPLY:
> if (WalRcv->walRcvState == WALRCV_CONNECTING)
> {
> /* Poll frequently while CONNECTING to avoid long latency */
> wakeup[reason] = TimestampTzPlusMilliseconds(now, 1000);
> }
>
> > A possible issue with all patch versions: when the primary is writing no WAL
> > and the standby was caught up before this walreceiver started, CONNECTING
> > could persist for an unbounded amount of time.  Only actual primary WAL
> > generation would move the walreceiver to STREAMING.  This relates to your
> > above point about high latency.  If that's a concern, perhaps this change
> > deserves a total of two new states, CONNECTING and a state that represents
> > "connection exists, no WAL yet applied"?
>
> Yes, this could be an issue. Using two states would help address it.
> That said, when the primary is idle in this case, we might end up
> repeatedly polling the apply status in the state before streaming if
> we implement the 1s short-interval checking like above, which could be
> costly. However, if we do not implement it,
> wal_receiver_status_interval is set to <= 0, and the flush stalls, the
> walreceiver could stay in the pre-streaming state indefinitely even if
> streaming did occur, which violates the semantics. Do you think this
> is a valid concern or just an artificial edge case?

After looking more closely, I found that true indefinite waiting
requires ALL of:

wal_receiver_status_interval <= 0 (disables status updates)
wal_receiver_timeout <= 0
Primary sends no keepalives
No more WAL arrives after the first failed-check flush
Startup never sets force_reply

which is quite unlikely and artificial; sorry for the noise here.
The worst-case state-transition latency in the scenario described
above would be max(primary keepalive interval, REPLY timeout, PING
timeout), which is probably acceptable without the short-interval
mitigation, given how rare this case is. I plan to implement the
following approach, with two new states as you suggested, as v3.

1. enter CONNECTING
2. transition to CONNECTED/IDLE when START_REPLICATION succeeds, and
store the applyPtr
3. force a status message in XLogWalRcvFlush as long as we remain in
CONNECTED/IDLE
4. become STREAMING when applyPtr differs from the one stored at (2)
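
The two-state plan can be sketched as a standalone model (state names follow the plan above; the transition helpers are invented, and XLogRecPtr is reduced to an integer):

```c
/*
 * Sketch of the v3 two-state plan: CONNECTING until START_REPLICATION
 * succeeds, then CONNECTED until applyPtr advances past the value stored
 * at connection time, then STREAMING.
 */
#include <assert.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

typedef enum
{
    ST_CONNECTING,      /* walreceiver started, replication not yet up */
    ST_CONNECTED,       /* START_REPLICATION succeeded, no WAL applied yet */
    ST_STREAMING        /* applyPtr has advanced past its stored value */
} RcvState;

typedef struct
{
    RcvState    state;
    XLogRecPtr  connectApplyPtr;
} Rcv;

static void
on_start_replication_ok(Rcv *r, XLogRecPtr applyPtr)
{
    r->state = ST_CONNECTED;
    r->connectApplyPtr = applyPtr;  /* stored for the later comparison */
}

/* Run from the forced status message in XLogWalRcvFlush() */
static void
on_status_message(Rcv *r, XLogRecPtr applyPtr)
{
    if (r->state == ST_CONNECTED && applyPtr != r->connectApplyPtr)
        r->state = ST_STREAMING;
}
```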

--
Best,
Xuneng



Re: Add WALRCV_CONNECTING state to walreceiver

From: Noah Misch
Date:
On Sun, Dec 14, 2025 at 06:17:34PM +0800, Xuneng Zhou wrote:
> On Sun, Dec 14, 2025 at 4:55 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > On Sun, Dec 14, 2025 at 1:14 PM Noah Misch <noah@leadboat.com> wrote:
> > > > V2 makes the transition from WALRCV_CONNECTING to STREAMING only when
> > > > the first valid WAL record is processed by the startup process. A new
> > > > function WalRcvSetStreaming is introduced to enable the transition.
> > >
> > > The original patch set STREAMING in XLogWalRcvFlush().  XLogWalRcvFlush()
> > > callee XLogWalRcvSendReply() already fetches applyPtr to send a status
> > > message.  So I would try the following before involving the startup process
> > > like v2 does:
> > >
> > > 1. store the applyPtr when we enter CONNECTING
> > > 2. force a status message as long as we remain in CONNECTING
> > > 3. become STREAMING when applyPtr differs from the one stored at (1)
> >
> > Thanks for the suggestion. Using XLogWalRcvSendReply() for the
> > transition could make sense. My concern before is about latency in a
> > rare case: if the first flush completes but applyPtr hasn't advanced
> > yet at the time of check and then the flush stalls after that, we
> > might wait up to wal_receiver_status_interval (default 10s) before the
> > next check or indefinitely if (wal_receiver_status_interval <= 0).
> > This could be mitigated by shortening the wakeup interval while in
> > CONNECTING (step 2), which reduces worst-case latency to ~1 second.
> > Given that monitoring typically doesn't require sub-second precision,
> > this approach could be feasible.
> >
> > case WALRCV_WAKEUP_REPLY:
> > if (WalRcv->walRcvState == WALRCV_CONNECTING)
> > {
> > /* Poll frequently while CONNECTING to avoid long latency */
> > wakeup[reason] = TimestampTzPlusMilliseconds(now, 1000);
> > }
> >
> > > A possible issue with all patch versions: when the primary is writing no WAL
> > > and the standby was caught up before this walreceiver started, CONNECTING
> > > could persist for an unbounded amount of time.  Only actual primary WAL
> > > generation would move the walreceiver to STREAMING.  This relates to your
> > > above point about high latency.  If that's a concern, perhaps this change
> > > deserves a total of two new states, CONNECTING and a state that represents
> > > "connection exists, no WAL yet applied"?
> >
> > Yes, this could be an issue. Using two states would help address it.
> > That said, when the primary is idle in this case, we might end up
> > repeatedly polling the apply status in the state before streaming if
> > we implement the 1s short-interval checking like above, which could be
> > costly. However, if we do not implement it,
> > wal_receiver_status_interval is set to <= 0, and the flush stalls, the
> > walreceiver could stay in the pre-streaming state indefinitely even if
> > streaming did occur, which violates the semantics. Do you think this
> > is a valid concern or just an artificial edge case?
> 
> After looking more closely, I found that true indefinite waiting
> requires ALL of:
> 
> wal_receiver_status_interval <= 0 (disables status updates)
> wal_receiver_timeout <= 0
> Primary sends no keepalives
> No more WAL arrives after the first failed-check flush
> Startup never sets force_reply
> 
> which is quite unlikely and artificial; sorry for the noise here.

Even if indefinite wait is a negligible concern, you identified a lot of
intricacy that I hadn't pictured.  That makes your startup-process-driven
version potentially more attractive.  Forcing status messages like I was
thinking may also yield an unwanted flurry of them if the startup process is
slow.  Let's see what the patch reviewer thinks.



Re: Add WALRCV_CONNECTING state to walreceiver

From: Xuneng Zhou
Date:
Hi,

On Mon, Dec 15, 2025 at 12:14 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Sun, Dec 14, 2025 at 06:17:34PM +0800, Xuneng Zhou wrote:
> > On Sun, Dec 14, 2025 at 4:55 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > On Sun, Dec 14, 2025 at 1:14 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > V2 makes the transition from WALRCV_CONNECTING to STREAMING only when
> > > > > the first valid WAL record is processed by the startup process. A new
> > > > > function WalRcvSetStreaming is introduced to enable the transition.
> > > >
> > > > The original patch set STREAMING in XLogWalRcvFlush().  XLogWalRcvFlush()
> > > > callee XLogWalRcvSendReply() already fetches applyPtr to send a status
> > > > message.  So I would try the following before involving the startup process
> > > > like v2 does:
> > > >
> > > > 1. store the applyPtr when we enter CONNECTING
> > > > 2. force a status message as long as we remain in CONNECTING
> > > > 3. become STREAMING when applyPtr differs from the one stored at (1)
> > >
> > > Thanks for the suggestion. Using XLogWalRcvSendReply() for the
> > > transition could make sense. My concern before is about latency in a
> > > rare case: if the first flush completes but applyPtr hasn't advanced
> > > yet at the time of check and then the flush stalls after that, we
> > > might wait up to wal_receiver_status_interval (default 10s) before the
> > > next check or indefinitely if (wal_receiver_status_interval <= 0).
> > > This could be mitigated by shortening the wakeup interval while in
> > > CONNECTING (step 2), which reduces worst-case latency to ~1 second.
> > > Given that monitoring typically doesn't require sub-second precision,
> > > this approach could be feasible.
> > >
> > > case WALRCV_WAKEUP_REPLY:
> > > if (WalRcv->walRcvState == WALRCV_CONNECTING)
> > > {
> > > /* Poll frequently while CONNECTING to avoid long latency */
> > > wakeup[reason] = TimestampTzPlusMilliseconds(now, 1000);
> > > }
> > >
> > > > A possible issue with all patch versions: when the primary is writing no WAL
> > > > and the standby was caught up before this walreceiver started, CONNECTING
> > > > could persist for an unbounded amount of time.  Only actual primary WAL
> > > > generation would move the walreceiver to STREAMING.  This relates to your
> > > > above point about high latency.  If that's a concern, perhaps this change
> > > > deserves a total of two new states, CONNECTING and a state that represents
> > > > "connection exists, no WAL yet applied"?
> > >
> > > Yes, this could be an issue. Using two states would help address it.
> > > That said, when the primary is idle in this case, we might end up
> > > repeatedly polling the apply status in the state before streaming if
> > > we implement the 1s short-interval checking like above, which could be
> > > costly. However, if we do not implement it,
> > > wal_receiver_status_interval is set to <= 0, and the flush stalls, the
> > > walreceiver could stay in the pre-streaming state indefinitely even if
> > > streaming did occur, which violates the semantics. Do you think this
> > > is a valid concern or just an artificial edge case?
> >
> > After looking more closely, I found that true indefinite waiting
> > requires ALL of:
> >
> > wal_receiver_status_interval <= 0 (disables status updates)
> > wal_receiver_timeout <= 0
> > Primary sends no keepalives
> > No more WAL arrives after the first failed-check flush
> > Startup never sets force_reply
> >
> > which is quite unlikely and artificial; sorry for the noise here.
>
> Even if indefinite wait is a negligible concern, you identified a lot of
> intricacy that I hadn't pictured.  That makes your startup-process-driven
> version potentially more attractive.  Forcing status messages like I was
> thinking may also yield an unwanted flurry of them if the startup process is
> slow.  Let's see what the patch reviewer thinks.

OK, both approaches are presented for review.  Adding two states to
avoid the status confusion caused by the stall you described earlier
seems reasonable to me, so I adopted that approach in v3.


--
Best,
Xuneng

Attachments

Re: Add WALRCV_CONNECTING state to walreceiver

From
Rahila Syed
Date:

Hi,

On Mon, Dec 15, 2025 at 9:44 AM Noah Misch <noah@leadboat.com> wrote:
On Sun, Dec 14, 2025 at 06:17:34PM +0800, Xuneng Zhou wrote:
> On Sun, Dec 14, 2025 at 4:55 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > On Sun, Dec 14, 2025 at 1:14 PM Noah Misch <noah@leadboat.com> wrote:
> > > > V2 makes the transition from WALRCV_CONNECTING to STREAMING only when
> > > > the first valid WAL record is processed by the startup process. A new
> > > > function WalRcvSetStreaming is introduced to enable the transition.
> > >
> > > The original patch set STREAMING in XLogWalRcvFlush().  XLogWalRcvFlush()
> > > callee XLogWalRcvSendReply() already fetches applyPtr to send a status
> > > message.  So I would try the following before involving the startup process
> > > like v2 does:
> > >
> > > 1. store the applyPtr when we enter CONNECTING
> > > 2. force a status message as long as we remain in CONNECTING
> > > 3. become STREAMING when applyPtr differs from the one stored at (1)
> >
> > Thanks for the suggestion. Using XLogWalRcvSendReply() for the
> > transition could make sense. My concern before is about latency in a
> > rare case: if the first flush completes but applyPtr hasn't advanced
> > yet at the time of check and then the flush stalls after that, we
> > might wait up to wal_receiver_status_interval (default 10s) before the
> > next check or indefinitely if (wal_receiver_status_interval <= 0).
> > This could be mitigated by shortening the wakeup interval while in
> > CONNECTING (step 2), which reduces worst-case latency to ~1 second.
> > Given that monitoring typically doesn't require sub-second precision,
> > this approach could be feasible.
> >
> > case WALRCV_WAKEUP_REPLY:
> > if (WalRcv->walRcvState == WALRCV_CONNECTING)
> > {
> > /* Poll frequently while CONNECTING to avoid long latency */
> > wakeup[reason] = TimestampTzPlusMilliseconds(now, 1000);
> > }
> >
> > > A possible issue with all patch versions: when the primary is writing no WAL
> > > and the standby was caught up before this walreceiver started, CONNECTING
> > > could persist for an unbounded amount of time.  Only actual primary WAL
> > > generation would move the walreceiver to STREAMING.  This relates to your
> > > above point about high latency.  If that's a concern, perhaps this change
> > > deserves a total of two new states, CONNECTING and a state that represents
> > > "connection exists, no WAL yet applied"?
> >
> > Yes, this could be an issue. Using two states would help address it.
> > That said, when the primary is idle in this case, we might end up
> > repeatedly polling the apply status in the state before streaming if
> > we implement the 1s short-interval checking like above, which could be
> > costly. However, if we do not implement it &&
> > wal_receiver_status_interval is set to < 0 && flush stalls, the
> > walreceiver could stay in the pre-streaming state indefinitely even if
> > streaming did occur, which violates the semantics. Do you think this
> > is a valid concern or just an artificial edge case?
>
> After looking more closely, I found that true indefinite waiting
> requires ALL of:
>
> wal_receiver_status_interval <= 0 (disables status updates)
> wal_receiver_timeout <= 0
> Primary sends no keepalives
> No more WAL arrives after the first failed-check flush
> Startup never sets force_reply
>
> which is quite improbable and artificial, sorry for the noise here.

Even if indefinite wait is a negligible concern, you identified a lot of
intricacy that I hadn't pictured.  That makes your startup-process-driven
version potentially more attractive.  Forcing status messages like I was
thinking may also yield an unwanted flurry of them if the startup process is
slow.  Let's see what the patch reviewer thinks.

FWIW, I think doing it in the startup process might be slightly better.
It seems more logical to make the state change near the point where the
status is updated, as this avoids reading the status back from shared
memory and reduces the related delays.

The current proposal is to advance the state to STREAMING after applyPtr
has been updated.  IIUC, the rationale is to avoid having a short-lived
streaming state if applying WAL fails.  However, this approach can be
confusing because the receiver may already be receiving WAL from the
primary, yet its state remains CONNECTING until the WAL is flushed.

Would it be better to advance the state to STREAMING after the connection
is successfully established and the following LOG message is emitted?
    
        if (walrcv_startstreaming(wrconn, &options))
        {
            if (first_stream)
                ereport(LOG,
                        errmsg("started streaming WAL from primary at %X/%08X on timeline %u",
                               LSN_FORMAT_ARGS(startpoint), startpointTLI));

Thank you,
Rahila Syed

Re: Add WALRCV_CONNECTING state to walreceiver

От
Xuneng Zhou
Дата:
Hi Rahila,

Thanks for looking into this.

On Mon, Dec 15, 2025 at 9:48 PM Rahila Syed <rahilasyed90@gmail.com> wrote:
>
>
> Hi,
>
>
> FWIW, I think doing it in startup might be slightly better.
> It seems more logical to make the state change near the point where the status
> is updated, as this helps prevent reading the status from shared memory and
> reduces related delays.
>
> The current proposal is to advance the state to STREAMING after applyPtr has
> been updated.
> IIUC, the rationale is to avoid having a short-lived streaming state if applying WAL fails.
> However, this approach can be confusing because the receiver may already be receiving
> WAL from the primary, yet its state remains CONNECTING until the WAL is flushed.
>
> Would it be better to advance the state to streaming after the connection
> is successfully established and the following LOG message is emitted?
>
>         if (walrcv_startstreaming(wrconn, &options))
>         {
>             if (first_stream)
>                 ereport(LOG,
>                         errmsg("started streaming WAL from primary at %X/%08X on timeline %u",
>                                LSN_FORMAT_ARGS(startpoint), startpointTLI));

AFAICS, this may depend on how we define the streaming status. If
streaming is defined simply as “the connection has been established
and walreceiver is ready to operate,” then this approach fits well and
keeps the model simple. However, if streaming is meant to indicate
that WAL has actually started flowing and replay is in progress, then
this approach could fall short, particularly for the short-lived
streaming cases you mentioned. Introducing finer-grained states can
handle these edge cases more accurately, but it also makes the state
transitions more complex. That said, I’m not well positioned to fully
evaluate the trade-offs here, as I’m not a day-to-day end user.

--
Best,
Xuneng