Обсуждение: logicalrep_message_type throws an error

Поиск
Список
Период
Сортировка

logicalrep_message_type throws an error

От
Ashutosh Bapat
Дата:
Hi All,
logicalrep_message_type() is used to convert logical message type code
into name while prepared error context or details. Thus when this
function is called probably an ERROR is already raised. If
logicalrep_message_type() gets an unknown message type, it will throw
an error, which will suppress the error for which we are building
context or details. That's not useful. I think instead
logicalrep_message_type() should return "unknown" when it encounters
an unknown message type and let the original error message be thrown
as is.

-- 
Best Wishes,
Ashutosh Bapat



Re: logicalrep_message_type throws an error

От
"Euler Taveira"
Дата:
On Mon, Jul 3, 2023, at 7:30 AM, Ashutosh Bapat wrote:
logicalrep_message_type() is used to convert logical message type code
into name while prepared error context or details. Thus when this
function is called probably an ERROR is already raised. If
logicalrep_message_type() gets an unknown message type, it will throw
an error, which will suppress the error for which we are building
context or details. That's not useful. I think instead
logicalrep_message_type() should return "unknown" when it encounters
an unknown message type and let the original error message be thrown
as is.

Hmm. Good catch. The current behavior is:

ERROR:  invalid logical replication message type "X" 
LOG:  background worker "logical replication worker" (PID 71800) exited with exit code 1

... that hides the details. After providing a default message type:

ERROR:  invalid logical replication message type "X"
CONTEXT:  processing remote data for replication origin "pg_16638" during message type "???" in transaction 796, finished at 0/16266F8

Masahiko, since abc0910e2e0 is your patch maybe you want to take a look at it.


--
Euler Taveira

Вложения

Re: logicalrep_message_type throws an error

От
Ashutosh Bapat
Дата:
Thanks Euler for the patch.

On Mon, Jul 3, 2023 at 6:32 PM Euler Taveira <euler@eulerto.com> wrote:
>
> Masahiko, since abc0910e2e0 is your patch maybe you want to take a look at it.
>

A couple of comments.

-char *
+const char *

Nice improvement.

 logicalrep_message_type(LogicalRepMsgType action)
 {
     switch (action)
@@ -1256,9 +1256,7 @@ logicalrep_message_type(LogicalRepMsgType action)
             return "STREAM ABORT";
         case LOGICAL_REP_MSG_STREAM_PREPARE:
             return "STREAM PREPARE";
+        default:
+            return "???";
     }
-
-    elog(ERROR, "invalid logical replication message type \"%c\"", action);
-
-    return NULL;                /* keep compiler quiet */

The switch is on action which is an enum. So without default it will
provide a compilation warning for missing enums. Adding "default" case
defeats that purpose. I think we should just return "???" from outside
switch block.

--
Best Wishes,
Ashutosh Bapat



Re: logicalrep_message_type throws an error

От
Ashutosh Bapat
Дата:
On Mon, Jul 3, 2023 at 6:52 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> Thanks Euler for the patch.
>
> On Mon, Jul 3, 2023 at 6:32 PM Euler Taveira <euler@eulerto.com> wrote:
> >
> > Masahiko, since abc0910e2e0 is your patch maybe you want to take a look at it.
> >
>
> A couple of comments.
>
> -char *
> +const char *
>
> Nice improvement.
>
>  logicalrep_message_type(LogicalRepMsgType action)
>  {
>      switch (action)
> @@ -1256,9 +1256,7 @@ logicalrep_message_type(LogicalRepMsgType action)
>              return "STREAM ABORT";
>          case LOGICAL_REP_MSG_STREAM_PREPARE:
>              return "STREAM PREPARE";
> +        default:
> +            return "???";
>      }
> -
> -    elog(ERROR, "invalid logical replication message type \"%c\"", action);
> -
> -    return NULL;                /* keep compiler quiet */
>
> The switch is on action which is an enum. So without default it will
> provide a compilation warning for missing enums. Adding "default" case
> defeats that purpose. I think we should just return "???" from outside
> switch block.
>

PFA patch.

--
Best Wishes,
Ashutosh Bapat

Вложения

Re: logicalrep_message_type throws an error

От
"Euler Taveira"
Дата:
On Mon, Jul 3, 2023, at 10:57 AM, Ashutosh Bapat wrote:
On Mon, Jul 3, 2023 at 6:52 PM Ashutosh Bapat
>
> The switch is on action which is an enum. So without default it will
> provide a compilation warning for missing enums. Adding "default" case
> defeats that purpose. I think we should just return "???" from outside
> switch block.
>

Yeah. I don't think any gcc -Wswitch options have effect if a default is used
so your suggestion is good for wrong/missing message types in the future.

PFA patch.

WFM.


--
Euler Taveira

Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Mon, Jul 3, 2023 at 6:32 PM Euler Taveira <euler@eulerto.com> wrote:
>
> On Mon, Jul 3, 2023, at 7:30 AM, Ashutosh Bapat wrote:
>
> logicalrep_message_type() is used to convert logical message type code
> into name while prepared error context or details. Thus when this
> function is called probably an ERROR is already raised. If
> logicalrep_message_type() gets an unknown message type, it will throw
> an error, which will suppress the error for which we are building
> context or details. That's not useful. I think instead
> logicalrep_message_type() should return "unknown" when it encounters
> an unknown message type and let the original error message be thrown
> as is.
>
>
> Hmm. Good catch. The current behavior is:
>
> ERROR:  invalid logical replication message type "X"
> LOG:  background worker "logical replication worker" (PID 71800) exited with exit code 1
>
> ... that hides the details. After providing a default message type:
>
> ERROR:  invalid logical replication message type "X"
> CONTEXT:  processing remote data for replication origin "pg_16638" during message type "???" in transaction 796,
finishedat 0/16266F8 
>

I think after returning "???" from logicalrep_message_type(), the
above is possible when we get the error: "invalid logical replication
message type "X"" from apply_dispatch(), right? If so, then what about
the case when we forgot to handle some message in
logicalrep_message_type() but handled it in apply_dispatch()? Isn't it
better to return the 'action' from the function
logicalrep_message_type() for unknown type? That way the information
could be a bit better and we can easily catch the code bug as well.

--
With Regards,
Amit Kapila.



Re: logicalrep_message_type throws an error

От
Alvaro Herrera
Дата:
On 2023-Jul-05, Amit Kapila wrote:

> I think after returning "???" from logicalrep_message_type(), the
> above is possible when we get the error: "invalid logical replication
> message type "X"" from apply_dispatch(), right? If so, then what about
> the case when we forgot to handle some message in
> logicalrep_message_type() but handled it in apply_dispatch()? Isn't it
> better to return the 'action' from the function
> logicalrep_message_type() for unknown type? That way the information
> could be a bit better and we can easily catch the code bug as well.

Are you suggesting that logicalrep_message_type should include the
numerical value of 'action' in the ??? message? Something like this:

 ERROR:  invalid logical replication message type "X"
 CONTEXT:  processing remote data for replication origin "pg_16638" during message type "??? (123)" in transaction 796,
finishedat 0/16266F8
 

I don't see why not -- seems easy enough, and might help somebody.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
Tom: There seems to be something broken here.
Teodor: I'm in sackcloth and ashes...  Fixed.
                               http://postgr.es/m/482D1632.8010507@sigaev.ru



Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Wed, Jul 5, 2023 at 4:26 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Jul-05, Amit Kapila wrote:
>
> > I think after returning "???" from logicalrep_message_type(), the
> > above is possible when we get the error: "invalid logical replication
> > message type "X"" from apply_dispatch(), right? If so, then what about
> > the case when we forgot to handle some message in
> > logicalrep_message_type() but handled it in apply_dispatch()? Isn't it
> > better to return the 'action' from the function
> > logicalrep_message_type() for unknown type? That way the information
> > could be a bit better and we can easily catch the code bug as well.
>
> Are you suggesting that logicalrep_message_type should include the
> numerical value of 'action' in the ??? message? Something like this:
>
>  ERROR:  invalid logical replication message type "X"
>  CONTEXT:  processing remote data for replication origin "pg_16638" during message type "??? (123)" in transaction
796,finished at 0/16266F8 
>

Yes, something like that would be better.

--
With Regards,
Amit Kapila.



Re: logicalrep_message_type throws an error

От
"Euler Taveira"
Дата:
On Wed, Jul 5, 2023, at 7:56 AM, Alvaro Herrera wrote:
On 2023-Jul-05, Amit Kapila wrote:

> I think after returning "???" from logicalrep_message_type(), the
> above is possible when we get the error: "invalid logical replication
> message type "X"" from apply_dispatch(), right? If so, then what about
> the case when we forgot to handle some message in
> logicalrep_message_type() but handled it in apply_dispatch()? Isn't it
> better to return the 'action' from the function
> logicalrep_message_type() for unknown type? That way the information
> could be a bit better and we can easily catch the code bug as well.

Are you suggesting that logicalrep_message_type should include the
numerical value of 'action' in the ??? message? Something like this:

ERROR:  invalid logical replication message type "X"
CONTEXT:  processing remote data for replication origin "pg_16638" during message type "??? (123)" in transaction 796, finished at 0/16266F8

Isn't this numerical value already exposed in the error message (X = 88)?
In this example, it is:

ERROR:  invalid logical replication message type "X"
CONTEXT:  processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction 796, finished at 0/1626698

IMO it could be confusing if we provide two representations of the same data (X
and 88). Since we already provide "X" I don't think we need "88". Another
option, is to remove "X" from apply_dispatch() and add "??? (88)" to
logicalrep_message_type().

Opinions?


--
Euler Taveira

Re: logicalrep_message_type throws an error

От
Ashutosh Bapat
Дата:
On Wed, Jul 5, 2023 at 7:07 PM Euler Taveira <euler@eulerto.com> wrote:
>
> On Wed, Jul 5, 2023, at 7:56 AM, Alvaro Herrera wrote:
>
> On 2023-Jul-05, Amit Kapila wrote:
>
> > I think after returning "???" from logicalrep_message_type(), the
> > above is possible when we get the error: "invalid logical replication
> > message type "X"" from apply_dispatch(), right? If so, then what about
> > the case when we forgot to handle some message in
> > logicalrep_message_type() but handled it in apply_dispatch()?

apply_dispatch() has a default case in switch() so it can
theoretically forget to handle some message type. I think we should
avoid default case in that function to catch missing message type in
that function. But if logicalrep_message_type() doesn't use default
case, it won't forget to handle a known message type. So what you are
suggesting is not possible.

It might happen that the upstream may send an unknown message type
that both apply_dispatch() and logicalrep_message_type() can not
handle.

> ERROR:  invalid logical replication message type "X"
> CONTEXT:  processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction 796,
finishedat 0/1626698 
>
> IMO it could be confusing if we provide two representations of the same data (X
> and 88). Since we already provide "X" I don't think we need "88". Another
> option, is to remove "X" from apply_dispatch() and add "??? (88)" to
> logicalrep_message_type().

I think we don't need message type to be mentioned in the context for
an error about invalid message type. I think what needs to be done is
to set
apply_error_callback_arg.command = 0 before calling ereport in the
default case of apply_dispatch().
apply_error_callback() will just return without providing a context.
If we need a context and have all the other necessary fields, we can
improve apply_error_callback() to provide context when
apply_error_callback_arg.command = 0 .


--
Best Wishes,
Ashutosh Bapat



Re: logicalrep_message_type throws an error

От
Alvaro Herrera
Дата:
On 2023-Jul-05, Euler Taveira wrote:

> Isn't this numerical value already exposed in the error message (X = 88)?
> In this example, it is:
> 
> ERROR:  invalid logical replication message type "X"
> CONTEXT:  processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction 796,
finishedat 0/1626698
 
> 
> IMO it could be confusing if we provide two representations of the same data (X
> and 88). Since we already provide "X" I don't think we need "88". Another
> option, is to remove "X" from apply_dispatch() and add "??? (88)" to 
> logicalrep_message_type().
> 
> Opinions?

The CONTEXT message could theoretically appear in other error throws,
not just in "invalid logical replication message".  So the duplicity is
not really a problem.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Now I have my system running, not a byte was off the shelf;
It rarely breaks and when it does I fix the code myself.
It's stable, clean and elegant, and lightning fast as well,
And it doesn't cost a nickel, so Bill Gates can go to hell."



Re: logicalrep_message_type throws an error

От
Alvaro Herrera
Дата:
On 2023-Jul-05, Alvaro Herrera wrote:

> On 2023-Jul-05, Euler Taveira wrote:
> 
> > Isn't this numerical value already exposed in the error message (X = 88)?
> > In this example, it is:
> > 
> > ERROR:  invalid logical replication message type "X"
> > CONTEXT:  processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction
796,finished at 0/1626698
 
> > 
> > IMO it could be confusing if we provide two representations of the same data (X
> > and 88). Since we already provide "X" I don't think we need "88".
> 
> The CONTEXT message could theoretically appear in other error throws,
> not just in "invalid logical replication message".  So the duplicity is
> not really a problem.

Ah, but you're thinking that if the message type is invalid, then it
will have been rejected in the "invalid logical replication message"
stage, so no invalid message type will be reported.  I guess there's a
point to that argument as well.

However, I don't see that having the numerical ASCII value there causes
any harm, even if the char value is already exposed in the other
message.  (I'm sure you'll agree that this is quite a minor issue.)

I doubt that each of these two prints of the enum value showing
different formats is confusing.  Yes, the enum is defined in terms of
char literals, but if an actually invalid message shows up with an
uint32 value outside the printable ASCII range, the printout might
be ugly or useless.

> > Another option, is to remove "X" from apply_dispatch() and add "???
> > (88)" to logicalrep_message_type().

Now *this* would be an actively bad idea IMO.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Wed, Jul 5, 2023 at 10:45 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Jul-05, Alvaro Herrera wrote:
>
> > On 2023-Jul-05, Euler Taveira wrote:
> >
> > > Isn't this numerical value already exposed in the error message (X = 88)?
> > > In this example, it is:
> > >
> > > ERROR:  invalid logical replication message type "X"
> > > CONTEXT:  processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction
796,finished at 0/1626698 
> > >
> > > IMO it could be confusing if we provide two representations of the same data (X
> > > and 88). Since we already provide "X" I don't think we need "88".
> >
> > The CONTEXT message could theoretically appear in other error throws,
> > not just in "invalid logical replication message".  So the duplicity is
> > not really a problem.
>
> Ah, but you're thinking that if the message type is invalid, then it
> will have been rejected in the "invalid logical replication message"
> stage, so no invalid message type will be reported.
>

Yeah, but it would still be displayed both in context and the actual message.

>  I guess there's a
> point to that argument as well.
>

One point to note is that the user may also get confused if the actual
ERROR says message type as 'X' and the context says '???'. I feel in
this case duplicate information is better than different information.

--
With Regards,
Amit Kapila.



Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Wed, Jul 5, 2023 at 7:54 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Wed, Jul 5, 2023 at 7:07 PM Euler Taveira <euler@eulerto.com> wrote:
> >
> > On Wed, Jul 5, 2023, at 7:56 AM, Alvaro Herrera wrote:
> >
> > On 2023-Jul-05, Amit Kapila wrote:
> >
> > > I think after returning "???" from logicalrep_message_type(), the
> > > above is possible when we get the error: "invalid logical replication
> > > message type "X"" from apply_dispatch(), right? If so, then what about
> > > the case when we forgot to handle some message in
> > > logicalrep_message_type() but handled it in apply_dispatch()?
>
> apply_dispatch() has a default case in switch() so it can
> theoretically forget to handle some message type. I think we should
> avoid default case in that function to catch missing message type in
> that function. But if logicalrep_message_type() doesn't use default
> case, it won't forget to handle a known message type. So what you are
> suggesting is not possible.
>

Right, but still I feel it would be better to return actual action.

> It might happen that the upstream may send an unknown message type
> that both apply_dispatch() and logicalrep_message_type() can not
> handle.
>
> > ERROR:  invalid logical replication message type "X"
> > CONTEXT:  processing remote data for replication origin "pg_16638" during message type "??? (88)" in transaction
796,finished at 0/1626698 
> >
> > IMO it could be confusing if we provide two representations of the same data (X
> > and 88). Since we already provide "X" I don't think we need "88". Another
> > option, is to remove "X" from apply_dispatch() and add "??? (88)" to
> > logicalrep_message_type().
>
> I think we don't need message type to be mentioned in the context for
> an error about invalid message type. I think what needs to be done is
> to set
> apply_error_callback_arg.command = 0 before calling ereport in the
> default case of apply_dispatch().
> apply_error_callback() will just return without providing a context.
>

Hmm, this looks a bit hacky to me.

--
With Regards,
Amit Kapila.



Re: logicalrep_message_type throws an error

От
Masahiko Sawada
Дата:
On Thu, Jul 6, 2023 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> One point to note is that the user may also get confused if the actual
> ERROR says message type as 'X' and the context says '???'. I feel in
> this case duplicate information is better than different information.

I agree. I think it would be better to show the same string like:

ERROR:  invalid logical replication message type "??? (88)"
CONTEXT:  processing remote data for replication origin "pg_16638"
during message type "??? (88)" in transaction 796, finished at
0/1626698

Since the numerical value is important only in invalid message type
cases, how about using a format like "??? (88)" in unknown message
type cases, in both error and context messages?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: logicalrep_message_type throws an error

От
Alvaro Herrera
Дата:
On 2023-Jul-11, Masahiko Sawada wrote:

> Since the numerical value is important only in invalid message type
> cases, how about using a format like "??? (88)" in unknown message
> type cases, in both error and context messages?

+1

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Las navajas y los monos deben estar siempre distantes"   (Germán Poo)



Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Tue, Jul 11, 2023 at 1:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jul 6, 2023 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > One point to note is that the user may also get confused if the actual
> > ERROR says message type as 'X' and the context says '???'. I feel in
> > this case duplicate information is better than different information.
>
> I agree. I think it would be better to show the same string like:
>
> ERROR:  invalid logical replication message type "??? (88)"
> CONTEXT:  processing remote data for replication origin "pg_16638"
> during message type "??? (88)" in transaction 796, finished at
> 0/1626698
>
> Since the numerical value is important only in invalid message type
> cases, how about using a format like "??? (88)" in unknown message
> type cases, in both error and context messages?
>

Do you have something like attached in mind?

--
With Regards,
Amit Kapila.

Вложения

Re: logicalrep_message_type throws an error

От
"Euler Taveira"
Дата:
On Sat, Jul 15, 2023, at 4:27 AM, Amit Kapila wrote:
Do you have something like attached in mind?

WFM. I would change the comment that says

This function is called to provide context in the error ...

to

This message provides context in the error ...

because this comment is not at the beginning of the function but *before* the
message.


--
Euler Taveira

Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Sat, Jul 15, 2023 at 7:16 PM Euler Taveira <euler@eulerto.com> wrote:
>
> On Sat, Jul 15, 2023, at 4:27 AM, Amit Kapila wrote:
>
> Do you have something like attached in mind?
>
>
> WFM. I would change the comment that says
>
> This function is called to provide context in the error ...
>
> to
>
> This message provides context in the error ...
>
> because this comment is not at the beginning of the function but *before* the
> message.
>

Sounds reasonable to me. I'll make this modification before pushing
unless there are more comments/suggestions.

--
With Regards,
Amit Kapila.



Re: logicalrep_message_type throws an error

От
Ashutosh Bapat
Дата:
On Sat, Jul 15, 2023 at 12:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Since the numerical value is important only in invalid message type
> > cases, how about using a format like "??? (88)" in unknown message
> > type cases, in both error and context messages?
> >
>
> Do you have something like attached in mind?

Prologue of psprintf() says

* Errors are not returned to the caller, but are reported via elog(ERROR)
* in the backend, or printf-to-stderr-and-exit() in frontend builds.
* One should therefore think twice about using this in libpq.

If an error occurs in psprintf(), it will throw an error which will
override the original error. I think we should avoid any stuff that
throws further errors.

--
Best Wishes,
Ashutosh Bapat



Re: logicalrep_message_type throws an error

От
Alvaro Herrera
Дата:
On 2023-Jul-17, Ashutosh Bapat wrote:

> Prologue of psprintf() says
> 
> * Errors are not returned to the caller, but are reported via elog(ERROR)
> * in the backend, or printf-to-stderr-and-exit() in frontend builds.
> * One should therefore think twice about using this in libpq.
> 
> If an error occurs in psprintf(), it will throw an error which will
> override the original error. I think we should avoid any stuff that
> throws further errors.

Ooh, yeah, this is an excellent point.  I agree it would be better to
avoid psprintf() here and anything that adds more failure modes.  Let's
just do the thing in the original patch you submitted, to ensure all
these strings are compile-time constants; that's likely the most robust.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Pensar que el espectro que vemos es ilusorio no lo despoja de espanto,
sólo le suma el nuevo terror de la locura" (Perelandra, C.S. Lewis)



Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Mon, Jul 17, 2023 at 7:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Jul-17, Ashutosh Bapat wrote:
>
> > Prologue of psprintf() says
> >
> > * Errors are not returned to the caller, but are reported via elog(ERROR)
> > * in the backend, or printf-to-stderr-and-exit() in frontend builds.
> > * One should therefore think twice about using this in libpq.
> >
> > If an error occurs in psprintf(), it will throw an error which will
> > override the original error. I think we should avoid any stuff that
> > throws further errors.
>
> Ooh, yeah, this is an excellent point.  I agree it would be better to
> avoid psprintf() here and anything that adds more failure modes.
>

I have tried to check whether we have such usage in any other error
callbacks. Though I haven't scrutinized each and every error callback,
I found a few of them where an error can be raised. For example,

rm_redo_error_callback()->initStringInfo()
CopyFromErrorCallback()->limit_printout_length()
shared_buffer_write_error_callback()->relpathperm()->relpathbackend()->GetRelationPath()->psprintf()

>  Let's
> just do the thing in the original patch you submitted, to ensure all
> these strings are compile-time constants; that's likely the most robust.
>

So will we be okay with something like the below:

ERROR:  invalid logical replication message type "??? (88)"
CONTEXT:  processing remote data for replication origin "pg_16638"
during message type "???" in transaction 796, finished at
0/1626698

--
With Regards,
Amit Kapila.



Re: logicalrep_message_type throws an error

От
Masahiko Sawada
Дата:
On Tue, Jul 18, 2023 at 12:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jul 17, 2023 at 7:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >
> > On 2023-Jul-17, Ashutosh Bapat wrote:
> >
> > > Prologue of psprintf() says
> > >
> > > * Errors are not returned to the caller, but are reported via elog(ERROR)
> > > * in the backend, or printf-to-stderr-and-exit() in frontend builds.
> > > * One should therefore think twice about using this in libpq.
> > >
> > > If an error occurs in psprintf(), it will throw an error which will
> > > override the original error. I think we should avoid any stuff that
> > > throws further errors.
> >
> > Ooh, yeah, this is an excellent point.  I agree it would be better to
> > avoid psprintf() here and anything that adds more failure modes.
> >
>
> I have tried to check whether we have such usage in any other error
> callbacks. Though I haven't scrutinized each and every error callback,
> I found a few of them where an error can be raised. For example,
>
> rm_redo_error_callback()->initStringInfo()
> CopyFromErrorCallback()->limit_printout_length()
> shared_buffer_write_error_callback()->relpathperm()->relpathbackend()->GetRelationPath()->psprintf()
>
> >  Let's
> > just do the thing in the original patch you submitted, to ensure all
> > these strings are compile-time constants; that's likely the most robust.
> >

Or can we use snprintf() writing "??? (%d)" to a fixed length char[8 +
11] allocated on the stack instead?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Tue, Jul 18, 2023 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jul 18, 2023 at 12:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Jul 17, 2023 at 7:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > >
> >
> > I have tried to check whether we have such usage in any other error
> > callbacks. Though I haven't scrutinized each and every error callback,
> > I found a few of them where an error can be raised. For example,
> >
> > rm_redo_error_callback()->initStringInfo()
> > CopyFromErrorCallback()->limit_printout_length()
> > shared_buffer_write_error_callback()->relpathperm()->relpathbackend()->GetRelationPath()->psprintf()
> >
> > >  Let's
> > > just do the thing in the original patch you submitted, to ensure all
> > > these strings are compile-time constants; that's likely the most robust.
> > >
>
> Or can we use snprintf() writing "??? (%d)" to a fixed length char[8 +
> 11] allocated on the stack instead?
>

In the above size calculation, shouldn't it be 7 + 11 where 7 is for
(3 (???) + 1 for space + 2 for () + 1 for terminating null char) and
11 is for %d? BTW, this avoids dynamic allocation of the err string in
logicalrep_message_type() but we can't return a locally allocated
string, so do you think we should change the prototype of the function
to get this as an argument and then use it both for valid and invalid
cases? I think if there is some simpler way to achieve this then fine,
otherwise, let's return a constant string like "???" from
logicalrep_message_type() for the invalid action.

--
With Regards,
Amit Kapila.



Re: logicalrep_message_type throws an error

От
Ashutosh Bapat
Дата:
On Wed, Jul 19, 2023 at 9:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jul 18, 2023 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jul 18, 2023 at 12:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Jul 17, 2023 at 7:54 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> > > >
> > >
> > > I have tried to check whether we have such usage in any other error
> > > callbacks. Though I haven't scrutinized each and every error callback,
> > > I found a few of them where an error can be raised. For example,
> > >
> > > rm_redo_error_callback()->initStringInfo()
> > > CopyFromErrorCallback()->limit_printout_length()
> > > shared_buffer_write_error_callback()->relpathperm()->relpathbackend()->GetRelationPath()->psprintf()
> > >
> > > >  Let's
> > > > just do the thing in the original patch you submitted, to ensure all
> > > > these strings are compile-time constants; that's likely the most robust.
> > > >
> >
> > Or can we use snprintf() writing "??? (%d)" to a fixed length char[8 +
> > 11] allocated on the stack instead?
> >
>
> In the above size calculation, shouldn't it be 7 + 11 where 7 is for
> (3 (???) + 1 for space + 2 for () + 1 for terminating null char) and
> 11 is for %d? BTW, this avoids dynamic allocation of the err string in
> logicalrep_message_type() but we can't return a locally allocated
> string, so do you think we should change the prototype of the function
> to get this as an argument and then use it both for valid and invalid
> cases?

There are other places in the code which do something similar by using
statically allocated buffers like static char xya[SIZE]. We could do
that here. The caller may decide whether to pstrdup this buffer
further or just use it one time e.g. as an elog or printf argument.

As I said before, we should not even print message type in the error
context because it's unknown. Repeating that twice is useless. That
will need some changes to apply_error_callback() though.
But I am fine with "???" as well.

--
Best Wishes,
Ashutosh Bapat



Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Wed, Jul 19, 2023 at 10:08 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Wed, Jul 19, 2023 at 9:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jul 18, 2023 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Or can we use snprintf() writing "??? (%d)" to a fixed length char[8 +
> > > 11] allocated on the stack instead?
> > >
> >
> > In the above size calculation, shouldn't it be 7 + 11 where 7 is for
> > (3 (???) + 1 for space + 2 for () + 1 for terminating null char) and
> > 11 is for %d? BTW, this avoids dynamic allocation of the err string in
> > logicalrep_message_type() but we can't return a locally allocated
> > string, so do you think we should change the prototype of the function
> > to get this as an argument and then use it both for valid and invalid
> > cases?
>
> There are other places in the code which do something similar by using
> statically allocated buffers like static char xya[SIZE]. We could do
> that here. The caller may decide whether to pstrdup this buffer
> further or just use it one time e.g. as an elog or printf argument.
>

Okay, changed it accordingly. Currently, we call it only from
errcontext, so it looks reasonable to me to use static here.

> As I said before, we should not even print message type in the error
> context because it's unknown. Repeating that twice is useless. That
> will need some changes to apply_error_callback() though.
> But I am fine with "???" as well.
>

I think in the end it won't make a big difference. So, I would like to
go with Sawada-San's suggestion to keep the message type consistent in
actual error and error context unless that requires bigger changes.

--
With Regards,
Amit Kapila.



Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Thu, Jul 20, 2023 at 9:10 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jul 19, 2023 at 10:08 AM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Wed, Jul 19, 2023 at 9:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Jul 18, 2023 at 10:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Or can we use snprintf() writing "??? (%d)" to a fixed length char[8 +
> > > > 11] allocated on the stack instead?
> > > >
> > >
> > > In the above size calculation, shouldn't it be 7 + 11 where 7 is for
> > > (3 (???) + 1 for space + 2 for () + 1 for terminating null char) and
> > > 11 is for %d? BTW, this avoids dynamic allocation of the err string in
> > > logicalrep_message_type() but we can't return a locally allocated
> > > string, so do you think we should change the prototype of the function
> > > to get this as an argument and then use it both for valid and invalid
> > > cases?
> >
> > There are other places in the code which do something similar by using
> > statically allocated buffers like static char xya[SIZE]. We could do
> > that here. The caller may decide whether to pstrdup this buffer
> > further or just use it one time e.g. as an elog or printf argument.
> >
>
> Okay, changed it accordingly.
>

oops, forgot to attach the patch.

--
With Regards,
Amit Kapila.

Вложения

Re: logicalrep_message_type throws an error

От
Ashutosh Bapat
Дата:
On Thu, Jul 20, 2023 at 9:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Okay, changed it accordingly.
> >
>
> oops, forgot to attach the patch.
>

WFM

@@ -3367,7 +3367,7 @@ apply_dispatch(StringInfo s)
         default:
             ereport(ERROR,
                     (errcode(ERRCODE_PROTOCOL_VIOLATION),
-                     errmsg("invalid logical replication message type
\"%c\"", action)));
+                     errmsg("invalid logical replication message type
\"??? (%d)\"", action)));

I think we should report character here since that's what is visible
in the code and also the message types are communicated as characters
not integers. Makes debugging easier. Context may report integer as an
additional data point.

--
Best Wishes,
Ashutosh Bapat



Re: logicalrep_message_type throws an error

От
Masahiko Sawada
Дата:
On Fri, Jul 21, 2023 at 1:39 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Jul 20, 2023 at 9:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > Okay, changed it accordingly.
> > >
> >
> > oops, forgot to attach the patch.
> >
>
> WFM
>
> @@ -3367,7 +3367,7 @@ apply_dispatch(StringInfo s)
>          default:
>              ereport(ERROR,
>                      (errcode(ERRCODE_PROTOCOL_VIOLATION),
> -                     errmsg("invalid logical replication message type
> \"%c\"", action)));
> +                     errmsg("invalid logical replication message type
> \"??? (%d)\"", action)));
>
> I think we should report character here since that's what is visible
> in the code and also the message types are communicated as characters
> not integers. Makes debugging easier. Context may report integer as an
> additional data point.

I think it could confuse us if an invalid message is not a printable character.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Fri, Jul 21, 2023 at 6:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Jul 21, 2023 at 1:39 AM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Thu, Jul 20, 2023 at 9:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > Okay, changed it accordingly.
> > > >
> > >
> > > oops, forgot to attach the patch.
> > >
> >
> > WFM
> >
> > @@ -3367,7 +3367,7 @@ apply_dispatch(StringInfo s)
> >          default:
> >              ereport(ERROR,
> >                      (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > -                     errmsg("invalid logical replication message type
> > \"%c\"", action)));
> > +                     errmsg("invalid logical replication message type
> > \"??? (%d)\"", action)));
> >
> > I think we should report character here since that's what is visible
> > in the code and also the message types are communicated as characters
> > not integers. Makes debugging easier. Context may report integer as an
> > additional data point.
>
> I think it could confuse us if an invalid message is not a printable character.
>

Right. I'll push and backpatch this till 15 by Tuesday unless you guys
think otherwise.


--
With Regards,
Amit Kapila.



Re: logicalrep_message_type throws an error

От
Ashutosh Bapat
Дата:
On Sat, Jul 22, 2023 at 10:18 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

> >
> > I think it could confuse us if an invalid message is not a printable character.
> >
>

That's a good point.

> Right. I'll push and backpatch this till 15 by Tuesday unless you guys
> think otherwise.

WFM.

--
Best Wishes,
Ashutosh Bapat



Re: logicalrep_message_type throws an error

От
Amit Kapila
Дата:
On Mon, Jul 24, 2023 at 6:14 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Sat, Jul 22, 2023 at 10:18 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > Right. I'll push and backpatch this till 15 by Tuesday unless you guys
> > think otherwise.
>
> WFM.
>

Pushed.

--
With Regards,
Amit Kapila.