Обсуждение: Refactor replication origin state reset helpers

Поиск
Список
Период
Сортировка

Refactor replication origin state reset helpers

От
Chao Li
Дата:
Hi Hacker,

While reviewing patches [1] and [2], I noticed some duplicate code of clearing replication origin states, I am proposing a small patch that removes the duplicate code blocks by introducing a couple helper functions. No functional change at all.


Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/




Вложения

Re: Refactor replication origin state reset helpers

От
Ashutosh Bapat
Дата:
On Wed, Dec 24, 2025 at 6:58 AM Chao Li <li.evan.chao@gmail.com> wrote:
>
> Hi Hacker,
>
> While reviewing patches [1] and [2], I noticed some duplicate code of clearing replication origin states, I am
proposinga small patch that removes the duplicate code blocks by introducing a couple helper functions. No functional
changeat all. 
>

The new functions bring together the global variables that need to be
reset under certain conditions. The functions will help not to miss
resetting some variable. However, this can be a mild backpatching
pain. So, I am +.5 on this.

If we go this route, we at least need to declare the new functions as
static inline and move them to a header file instead of .c file.

Further, does it make sense to put together all the state variables
into a single structure?

It's also quite easy to confuse between these functions and
replorigin_session_reset(). It's not clear where the boundaries of the
latter end and where those of the new ones start. I think the latter
deals with the shared memory structures while the new ones deal with
the backend local state. And then there's replorigin_reset() which
adds to the confusion. That function doesn't call
replorigin_session_reset() which the other two callers of
replorigin_session_clear_state() call. Why? I think there is more to
clean here than what's in the patch. That doesn't mean that we cannot
accept this patch without larger cleanup, but it should not add to the
existing confusion.

--
Best Wishes,
Ashutosh Bapat



Re: Refactor replication origin state reset helpers

От
Chao Li
Дата:
Hi Ashutosh,

Thanks for your review.

On Wed, Dec 24, 2025 at 6:57 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Wed, Dec 24, 2025 at 6:58 AM Chao Li <li.evan.chao@gmail.com> wrote:
>
> Hi Hacker,
>
> While reviewing patches [1] and [2], I noticed some duplicate code of clearing replication origin states, I am proposing a small patch that removes the duplicate code blocks by introducing a couple helper functions. No functional change at all.
>

The new functions bring together the global variables that need to be
reset under certain conditions. The functions will help not to miss
resetting some variable. However, this can be a mild backpatching
pain. So, I am +.5 on this.

If we go this route, we at least need to declare the new functions as
static inline and move them to a header file instead of .c file.

I like the idea of making the helpers static inline and moving them to origin.h to stay close to the 3 global variables, which improves readability as well. See attached v2 for the change.


Further, does it make sense to put together all the state variables
into a single structure?

 I hesitate to go that far, because the 3 global variables are all exported. So, if we really want to do that, that should belong to a dedicated thread.
 
It's also quite easy to confuse between these functions and
replorigin_session_reset(). It's not clear where the boundaries of the
latter end and where those of the new ones start. I think the latter
deals with the shared memory structures while the new ones deal with
the backend local state. And then there's replorigin_reset() which
adds to the confusion. That function doesn't call
replorigin_session_reset() which the other two callers of
replorigin_session_clear_state() call. Why? I think there is more to
clean here than what's in the patch. That doesn't mean that we cannot
accept this patch without larger cleanup, but it should not add to the
existing confusion.

Good point. I am trying to resolve the confusions in v2.

For replorigin_reset(), it's easy to address. As this function is local static and the only purpose is to satisfy the pg_on_exit_callback contract, I just renamed it to on_exit_clear_state() in v2.

For replorigin_session_reset(), there are only 2 callers and both callers call replorigin_session_clear_state() immediately after replorigin_session_reset(). So in v2, I made replorigin_session_reset() to call replorigin_session_clear_state(), and added some comments in replorigin_session_reset(). Accordingly, replorigin_session_reset() is used to reset states set by replorigin_session_setup(), and every call of replorigin_session_setup() is immediately followed by "replorigin_session_origin = originid;", so I moved this assignment into replorigin_session_setup().

With these broader changes, this patch is no longer trivial, so I just added it to CF: 

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
Вложения

Re: Refactor replication origin state reset helpers

От
Chao Li
Дата:

> On Dec 25, 2025, at 13:05, Chao Li <li.evan.chao@gmail.com> wrote:
>
> With these broader changes, this patch is no longer trivial, so I just added it to CF:

https://commitfest.postgresql.org/patch/6345/

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/







Re: Refactor replication origin state reset helpers

От
Álvaro Herrera
Дата:
On 2025-Dec-24, Ashutosh Bapat wrote:

> If we go this route, we at least need to declare the new functions as
> static inline and move them to a header file instead of .c file.

Hmm, why would we make them static inline instead of standard (extern)
functions?  We use static inline functions when we want to avoid the
overhead of a function call in a hot code path, but I doubt that's the
case here.  Am I mistaken on this?

> Further, does it make sense to put together all the state variables
> into a single structure?

Yeah -- keeping the threaded-backend project in mind, moving them to a
single struct seems to make sense.  I think it's a separate patch though
because it'd be more invasive than Chao's initial patch, as those
variables are used in many places.

> It's also quite easy to confuse between these functions and
> replorigin_session_reset(). It's not clear where the boundaries of the
> latter end and where those of the new ones start. I think the latter
> deals with the shared memory structures while the new ones deal with
> the backend local state. And then there's replorigin_reset() which
> adds to the confusion. That function doesn't call
> replorigin_session_reset() which the other two callers of
> replorigin_session_clear_state() call. Why? I think there is more to
> clean here than what's in the patch. That doesn't mean that we cannot
> accept this patch without larger cleanup, but it should not add to the
> existing confusion.

Good points.  A decrease in the total quantity of cruft would be a good
outcome of a patch in this area.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Update: super-fast reaction on the Postgres bugs mailing list. The report
was acknowledged [...], and a fix is under discussion.
The wonders of open-source !"
             https://twitter.com/gunnarmorling/status/1596080409259003906



Re: Refactor replication origin state reset helpers

От
Álvaro Herrera
Дата:
On 2025-Dec-29, Álvaro Herrera wrote:

> On 2025-Dec-24, Ashutosh Bapat wrote:

> > It's also quite easy to confuse between these functions and
> > replorigin_session_reset(). It's not clear where the boundaries of the
> > latter end and where those of the new ones start. I think the latter
> > deals with the shared memory structures while the new ones deal with
> > the backend local state. And then there's replorigin_reset() which
> > adds to the confusion. That function doesn't call
> > replorigin_session_reset() which the other two callers of
> > replorigin_session_clear_state() call. Why? I think there is more to
> > clean here than what's in the patch. That doesn't mean that we cannot
> > accept this patch without larger cleanup, but it should not add to the
> > existing confusion.

I think we should just rename the worker.c function to have a less
generic-sounding name (so that it becomes more obvious that it's a
worker.c-specific function); have both replorigin_session_clear_state()
and replorigin_xact_clear_state() be a single function, with a boolean
flag to indicate whether to reset replorigin_session_origin or not; and
have replorigin_session_reset() call replorigin_session_clear_state()
internally rather than require every single of its callers do it
separately, which IMO makes no sense.  With these three changes I think
the code would become quite clear.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
Thou shalt check the array bounds of all strings (indeed, all arrays), for
surely where thou typest "foo" someone someday shall type
"supercalifragilisticexpialidocious" (5th Commandment for C programmers)



Re: Refactor replication origin state reset helpers

От
Ashutosh Bapat
Дата:
On Mon, Dec 29, 2025 at 8:14 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> On 2025-Dec-24, Ashutosh Bapat wrote:
>
> > If we go this route, we at least need to declare the new functions as
> > static inline and move them to a header file instead of .c file.
>
> Hmm, why would we make them static inline instead of standard (extern)
> functions?  We use static inline functions when we want to avoid the
> overhead of a function call in a hot code path, but I doubt that's the
> case here.  Am I mistaken on this?
>

I wasn't aware that we are using static inline only in hot code paths.
Looking around I see most of the static inline functions are from
modules which are used in hot code paths. So, yeah that seems to be
the convention. I also see some exceptions like those in
basebackup_sink.h - I don't think all of those are used in hot code
paths.

In this case, we are moving three assignments into their own
functions. CPU instructions to call extern functions will be
significant compared to CPU instructions for those assignments. static
inline functions, OTOH, would have similar performance as the existing
code while providing modularization. If you feel that's not a good
enough reason, I am ok keeping them extern.

> > Further, does it make sense to put together all the state variables
> > into a single structure?
>
> Yeah -- keeping the threaded-backend project in mind, moving them to a
> single struct seems to make sense.  I think it's a separate patch though
> because it'd be more invasive than Chao's initial patch, as those
> variables are used in many places.
>

Agreed.

--
Best Wishes,
Ashutosh Bapat



Re: Refactor replication origin state reset helpers

От
Chao Li
Дата:

On Tue, Dec 30, 2025 at 12:48 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Mon, Dec 29, 2025 at 8:14 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> On 2025-Dec-24, Ashutosh Bapat wrote:
>
> > If we go this route, we at least need to declare the new functions as
> > static inline and move them to a header file instead of .c file.
>
> Hmm, why would we make them static inline instead of standard (extern)
> functions?  We use static inline functions when we want to avoid the
> overhead of a function call in a hot code path, but I doubt that's the
> case here.  Am I mistaken on this?
>

I wasn't aware that we are using static inline only in hot code paths.
Looking around I see most of the static inline functions are from
modules which are used in hot code paths. So, yeah that seems to be
the convention. I also see some exceptions like those in
basebackup_sink.h - I don't think all of those are used in hot code
paths.

In this case, we are moving three assignments into their own
functions. CPU instructions to call extern functions will be
significant compared to CPU instructions for those assignments. static
inline functions, OTOH, would have similar performance as the existing
code while providing modularization. If you feel that's not a good
enough reason, I am ok keeping them extern.

> > Further, does it make sense to put together all the state variables
> > into a single structure?
>
> Yeah -- keeping the threaded-backend project in mind, moving them to a
> single struct seems to make sense.  I think it's a separate patch though
> because it'd be more invasive than Chao's initial patch, as those
> variables are used in many places.
>

Attached v3 patch set. Comparing to v2, the changes are:

0001:
* Combine the two cleanup functions into one and control them by a bool flag.
* Change the helper function to be extern.
* Move out cleanup from reset function.

0002: Consolidate replication origin session globals into a single state struct.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Вложения

Re: Refactor replication origin state reset helpers

От
Chao Li
Дата:
On Tue, Dec 30, 2025 at 1:07 PM Chao Li <li.evan.chao@gmail.com> wrote:

On Tue, Dec 30, 2025 at 12:48 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Mon, Dec 29, 2025 at 8:14 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> On 2025-Dec-24, Ashutosh Bapat wrote:
>
> > If we go this route, we at least need to declare the new functions as
> > static inline and move them to a header file instead of .c file.
>
> Hmm, why would we make them static inline instead of standard (extern)
> functions?  We use static inline functions when we want to avoid the
> overhead of a function call in a hot code path, but I doubt that's the
> case here.  Am I mistaken on this?
>

I wasn't aware that we are using static inline only in hot code paths.
Looking around I see most of the static inline functions are from
modules which are used in hot code paths. So, yeah that seems to be
the convention. I also see some exceptions like those in
basebackup_sink.h - I don't think all of those are used in hot code
paths.

In this case, we are moving three assignments into their own
functions. CPU instructions to call extern functions will be
significant compared to CPU instructions for those assignments. static
inline functions, OTOH, would have similar performance as the existing
code while providing modularization. If you feel that's not a good
enough reason, I am ok keeping them extern.

> > Further, does it make sense to put together all the state variables
> > into a single structure?
>
> Yeah -- keeping the threaded-backend project in mind, moving them to a
> single struct seems to make sense.  I think it's a separate patch though
> because it'd be more invasive than Chao's initial patch, as those
> variables are used in many places.
>

Attached v3 patch set. Comparing to v2, the changes are:

0001:
* Combine the two cleanup functions into one and control them by a bool flag.
* Change the helper function to be extern.
* Move out cleanup from reset function.

0002: Consolidate replication origin session globals into a single state struct.
 
Fixed a bug in v4.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
Вложения