Обсуждение: Remove useless casting to the same type

Поиск
Список
Период
Сортировка

Remove useless casting to the same type

От
Bertrand Drouvot
Дата:
Hi hackers,

Attached is a patch to $SUBJECT.

This is the same kind of idea as 7f798aca1d5 and ef8fe693606, as their presence
could cause risks of hiding actual type mismatches in the future or silently
discarding qualifiers. I think that it also improves readability.

Those have been found with a coccinelle script as:

"
@@
type T;
T E;
@@

- (T)
  E
"

which removes casts when it's casting to the same type.

Note that it generated more that what is in the attached. I chose not to remove
some of them (mainly arithmetic ones) to keep the patch focused on what matters
here.

Thoughts?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Вложения

Re: Remove useless casting to the same type

От
Nathan Bossart
Дата:
On Mon, Nov 24, 2025 at 10:26:32AM +0000, Bertrand Drouvot wrote:
> This is the same kind of idea as 7f798aca1d5 and ef8fe693606, as their presence
> could cause risks of hiding actual type mismatches in the future or silently
> discarding qualifiers. I think that it also improves readability.

Seems reasonable to me.

> Note that it generated more that what is in the attached. I chose not to remove
> some of them (mainly arithmetic ones) to keep the patch focused on what matters
> here.

Can you give an example of what you are talking about here?

-- 
nathan



Re: Remove useless casting to the same type

От
Jacob Champion
Дата:
On Mon, Nov 24, 2025 at 2:26 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
> Attached is a patch to $SUBJECT.

> --- a/src/backend/access/gin/gindatapage.c
> +++ b/src/backend/access/gin/gindatapage.c
> @@ -607,11 +607,11 @@ dataBeginPlaceToPageLeaf(GinBtree btree, Buffer buf, GinBtreeStack *stack,
>
>   if (append)
>   elog(DEBUG2, "appended %d new items to block %u; %d bytes (%d to go)",
> - maxitems, BufferGetBlockNumber(buf), (int) leaf->lsize,
> + maxitems, BufferGetBlockNumber(buf), leaf->lsize,
>   items->nitem - items->curitem - maxitems);

Hm. How do we feel, as a group, about superstitious casts in variadic
calls? I don't feel like they're in the same class as the other fixes.

Argument for: it's nice to know at a glance that a printf() invocation
won't corrupt something horribly, especially if I'm quickly scanning
code during a CVE analysis, and especially if the variable is named as
if it could maybe be a size_t. Do our compilers warn us well enough
now, in practice?

Argument against: it takes up precious columns and focuses attention
away from other things. Like the fact that (items->nitem -
items->curitem - maxitems) is unsigned and printed as signed here. :D
Maybe we should make the code compile cleanly under
-Wformat-signedness at some point... (not in this patch, not signing
you up for that)

> @@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record)
>
>   /* extract low and high masks. */
>   memcpy(&lowmask, data, sizeof(uint32));
> - highmask = (uint32 *) ((char *) data + sizeof(uint32));
> + highmask = (uint32 *) (data + sizeof(uint32));

I wonder about these, too. I like knowing what the code does without
having to check the type of `data`. But then later on we do a `data +=
sizeof(uint32) * 2`, so you have to check the type anyway, so... I
don't know.

> +++ b/contrib/fuzzystrmatch/dmetaphone.c
> @@ -327,7 +327,7 @@ GetAt(metastring *s, int pos)
>   if ((pos < 0) || (pos >= s->length))
>   return '\0';
>
> - return ((char) *(s->str + pos));
> + return *(s->str + pos);

Isn't this just s->str[pos]? Ditto for SetAt(), right afterwards.

--Jacob



Re: Remove useless casting to the same type

От
Bertrand Drouvot
Дата:
Hi,

On Mon, Nov 24, 2025 at 12:43:28PM -0600, Nathan Bossart wrote:
> On Mon, Nov 24, 2025 at 10:26:32AM +0000, Bertrand Drouvot wrote:
> > This is the same kind of idea as 7f798aca1d5 and ef8fe693606, as their presence
> > could cause risks of hiding actual type mismatches in the future or silently
> > discarding qualifiers. I think that it also improves readability.
> 
> Seems reasonable to me.

Thanks for looking at it!

> > Note that it generated more that what is in the attached. I chose not to remove
> > some of them (mainly arithmetic ones) to keep the patch focused on what matters
> > here.
> 
> Can you give an example of what you are talking about here?

Things like:

A)

- int   k = (int) (targrows * sampler_random_fract(&rstate.randstate));
+ int   k = (targrows * sampler_random_fract(&rstate.randstate));

That's a valid cast removal but I'm not sure about the removal added value here.

B)
- sign = (BITVECP) (((char *) sign) + 1);
+ sign = ((sign) + 1);

BITVECP is "typedef char *BITVECP; So that's a valid cast removal but I decided
to keep it for semantic reason.

Same as:

@@ -2277,7 +2277,7 @@ printTrgmColor(StringInfo buf, TrgmColor co)
        else if (co == COLOR_BLANK)
                appendStringInfoChar(buf, 'b');
        else
-               appendStringInfo(buf, "%d", (int) co);
+               appendStringInfo(buf, "%d", co);

where TrgmColor is "typedef int TrgmColor;"

C)

-  if (((unsigned char *) base)[i] != 0xff)
+  if ((base)[i] != 0xff)

because not safe.

See attached the full list that I decided not to include.
Do you think we should add some of them in the patch? (maybe the ones in 
nbtcompare.c for example)

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Вложения

Re: Remove useless casting to the same type

От
Bertrand Drouvot
Дата:
Hi,

On Mon, Nov 24, 2025 at 03:18:00PM -0800, Jacob Champion wrote:
> On Mon, Nov 24, 2025 at 2:26 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> > Attached is a patch to $SUBJECT.
> 
> > --- a/src/backend/access/gin/gindatapage.c
> > +++ b/src/backend/access/gin/gindatapage.c
> > @@ -607,11 +607,11 @@ dataBeginPlaceToPageLeaf(GinBtree btree, Buffer buf, GinBtreeStack *stack,
> >
> >   if (append)
> >   elog(DEBUG2, "appended %d new items to block %u; %d bytes (%d to go)",
> > - maxitems, BufferGetBlockNumber(buf), (int) leaf->lsize,
> > + maxitems, BufferGetBlockNumber(buf), leaf->lsize,
> >   items->nitem - items->curitem - maxitems);
> 
> Hm. How do we feel, as a group, about superstitious casts in variadic
> calls? I don't feel like they're in the same class as the other fixes.
> 
> Argument for: it's nice to know at a glance that a printf() invocation
> won't corrupt something horribly, especially if I'm quickly scanning
> code during a CVE analysis, and especially if the variable is named as
> if it could maybe be a size_t. Do our compilers warn us well enough
> now, in practice?
> 
> Argument against: it takes up precious columns and focuses attention
> away from other things.

Thanks for looking at it!

I think that the variadic calls in the patch are related to functions that
can benefits from -Wformat. Let's focus on those: with the cast one would need
to verify 3 things: variable type, cast and format specifier.
Without the cast then only 2 things and the compiler can verify these match via
-Wformat warnings.

With the cast, the compiler only checks that the cast result matches the format,
not whether the cast itself is correct, so I'm in favor of removing the cast,
thoughts?

> Like the fact that (items->nitem -
> items->curitem - maxitems) is unsigned and printed as signed here. :D

Nice catch! ;-) 

> Maybe we should make the code compile cleanly under
> -Wformat-signedness at some point...

good idea, will give it a try later on outside the context of this patch.

> > @@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record)
> >
> >   /* extract low and high masks. */
> >   memcpy(&lowmask, data, sizeof(uint32));
> > - highmask = (uint32 *) ((char *) data + sizeof(uint32));
> > + highmask = (uint32 *) (data + sizeof(uint32));
> 
> I wonder about these, too. I like knowing what the code does without
> having to check the type of `data`. But then later on we do a `data +=
> sizeof(uint32) * 2`, so you have to check the type anyway, so... I
> don't know.

I think that even with the cast in place, it's good to check the type of data.
Not for the line that follows (i.e: "data += sizeof(uint32) * 2") but to check
that the cast makes sense and does not hide "wrong" pointer manipulation.

So I think that with or without the cast one would need to check. But that feels
more natural to check when there is no cast (as we don't assume that someone
said "I know what I'm doing"). So I'm in favor of removing the cast, thoughts?

> > +++ b/contrib/fuzzystrmatch/dmetaphone.c
> > @@ -327,7 +327,7 @@ GetAt(metastring *s, int pos)
> >   if ((pos < 0) || (pos >= s->length))
> >   return '\0';
> >
> > - return ((char) *(s->str + pos));
> > + return *(s->str + pos);
> 
> Isn't this just s->str[pos]? Ditto for SetAt(), right afterwards.

Yeah, "*(s->str + pos)" is already used in SetAt() and also in IsVowel(). Instead
of changing those 3, I'd prefer to keep the current change and keep the patch
focus on its intend. We could change those in a dedicated patch afterward if we
feel the need.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Remove useless casting to the same type

От
David Geier
Дата:
On 25.11.2025 06:46, Bertrand Drouvot wrote:

> I think that the variadic calls in the patch are related to functions that
> can benefits from -Wformat. Let's focus on those: with the cast one would need
> to verify 3 things: variable type, cast and format specifier.
> Without the cast then only 2 things and the compiler can verify these match via
> -Wformat warnings.
> 
> With the cast, the compiler only checks that the cast result matches the format,
> not whether the cast itself is correct, so I'm in favor of removing the cast,
> thoughts?
> 
I agree. It's better if we only have the casts in places where we
actually want to change the type before printing.

Another benefit is that one can directly deduct the type from the log /
print statement by looking at the format specifier.

--
David Geier



Re: Remove useless casting to the same type

От
Peter Eisentraut
Дата:
On 25.11.25 06:46, Bertrand Drouvot wrote:
>>> @@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record)
>>>
>>>    /* extract low and high masks. */
>>>    memcpy(&lowmask, data, sizeof(uint32));
>>> - highmask = (uint32 *) ((char *) data + sizeof(uint32));
>>> + highmask = (uint32 *) (data + sizeof(uint32));
>> I wonder about these, too. I like knowing what the code does without
>> having to check the type of `data`. But then later on we do a `data +=
>> sizeof(uint32) * 2`, so you have to check the type anyway, so... I
>> don't know.
> I think that even with the cast in place, it's good to check the type of data.
> Not for the line that follows (i.e: "data += sizeof(uint32) * 2") but to check
> that the cast makes sense and does not hide "wrong" pointer manipulation.
> 
> So I think that with or without the cast one would need to check. But that feels
> more natural to check when there is no cast (as we don't assume that someone
> said "I know what I'm doing"). So I'm in favor of removing the cast, thoughts?

I think this whole thing could be simplified by overlaying a uint32 over 
"data" and just accessing the array fields normally.  See attached patch.

Вложения

Re: Remove useless casting to the same type

От
Peter Eisentraut
Дата:
On 25.11.25 06:46, Bertrand Drouvot wrote:
>> Hm. How do we feel, as a group, about superstitious casts in variadic
>> calls? I don't feel like they're in the same class as the other fixes.
>>
>> Argument for: it's nice to know at a glance that a printf() invocation
>> won't corrupt something horribly, especially if I'm quickly scanning
>> code during a CVE analysis, and especially if the variable is named as
>> if it could maybe be a size_t. Do our compilers warn us well enough
>> now, in practice?
>>
>> Argument against: it takes up precious columns and focuses attention
>> away from other things.
> Thanks for looking at it!
> 
> I think that the variadic calls in the patch are related to functions that
> can benefits from -Wformat. Let's focus on those: with the cast one would need
> to verify 3 things: variable type, cast and format specifier.
> Without the cast then only 2 things and the compiler can verify these match via
> -Wformat warnings.
> 
> With the cast, the compiler only checks that the cast result matches the format,
> not whether the cast itself is correct, so I'm in favor of removing the cast,
> thoughts?

Yes, I think with -Wformat you get approximately the same level of 
protection as with a prototype.

>> Like the fact that (items->nitem -
>> items->curitem - maxitems) is unsigned and printed as signed here. :D
> Nice catch! ;-)
> 
>> Maybe we should make the code compile cleanly under
>> -Wformat-signedness at some point...

That would be similar to using -Wsign-conversion with function 
prototypes.  Maybe a good idea, but we don't use it, and so we shouldn't 
expect it for non-prototype invocations either.




Re: Remove useless casting to the same type

От
Peter Eisentraut
Дата:
On 25.11.25 06:46, Bertrand Drouvot wrote:
>> Maybe we should make the code compile cleanly under
>> -Wformat-signedness at some point...
> good idea, will give it a try later on outside the context of this patch.

I test this once in a while and fix the issues that I find.  But it's 
very picky and you will find difficult to fix problems like the fact 
that the signedness of enums is implementation-defined, and so the only 
portable fix there would be to add more casts.

I think it could be useful to tighten the source code with respect to 
implicit integer conversions, using warnings such as -Wsign-conversion 
and -Wconversion as well as -Wformat-signedness.  There are surely 
hidden overflow or truncation issues similar to CVE-2025-12818 hidden 
somewhere.  But explicit casts defeat those warnings, so removing 
unnecessary casts is a good step on the way there.




Re: Remove useless casting to the same type

От
Bertrand Drouvot
Дата:
Hi,

On Fri, Nov 28, 2025 at 09:11:16AM +0100, Peter Eisentraut wrote:
> On 25.11.25 06:46, Bertrand Drouvot wrote:
> > > > @@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record)
> > > > 
> > > >    /* extract low and high masks. */
> > > >    memcpy(&lowmask, data, sizeof(uint32));
> > > > - highmask = (uint32 *) ((char *) data + sizeof(uint32));
> > > > + highmask = (uint32 *) (data + sizeof(uint32));
> > > I wonder about these, too. I like knowing what the code does without
> > > having to check the type of `data`. But then later on we do a `data +=
> > > sizeof(uint32) * 2`, so you have to check the type anyway, so... I
> > > don't know.
> > I think that even with the cast in place, it's good to check the type of data.
> > Not for the line that follows (i.e: "data += sizeof(uint32) * 2") but to check
> > that the cast makes sense and does not hide "wrong" pointer manipulation.
> > 
> > So I think that with or without the cast one would need to check. But that feels
> > more natural to check when there is no cast (as we don't assume that someone
> > said "I know what I'm doing"). So I'm in favor of removing the cast, thoughts?
> 
> I think this whole thing could be simplified by overlaying a uint32 over
> "data" and just accessing the array fields normally.  See attached patch.

Indeed, that's a nice simplification.

-                       data += sizeof(uint32) * 2;

Is it safe? I mean could XLH_SPLIT_META_UPDATE_MASKS and XLH_SPLIT_META_UPDATE_SPLITPOINT
be set simultaneously?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Remove useless casting to the same type

От
Peter Eisentraut
Дата:
On 28.11.25 10:06, Bertrand Drouvot wrote:
> Hi,
> 
> On Fri, Nov 28, 2025 at 09:11:16AM +0100, Peter Eisentraut wrote:
>> On 25.11.25 06:46, Bertrand Drouvot wrote:
>>>>> @@ -389,7 +389,7 @@ hash_xlog_split_allocate_page(XLogReaderState *record)
>>>>>
>>>>>     /* extract low and high masks. */
>>>>>     memcpy(&lowmask, data, sizeof(uint32));
>>>>> - highmask = (uint32 *) ((char *) data + sizeof(uint32));
>>>>> + highmask = (uint32 *) (data + sizeof(uint32));
>>>> I wonder about these, too. I like knowing what the code does without
>>>> having to check the type of `data`. But then later on we do a `data +=
>>>> sizeof(uint32) * 2`, so you have to check the type anyway, so... I
>>>> don't know.
>>> I think that even with the cast in place, it's good to check the type of data.
>>> Not for the line that follows (i.e: "data += sizeof(uint32) * 2") but to check
>>> that the cast makes sense and does not hide "wrong" pointer manipulation.
>>>
>>> So I think that with or without the cast one would need to check. But that feels
>>> more natural to check when there is no cast (as we don't assume that someone
>>> said "I know what I'm doing"). So I'm in favor of removing the cast, thoughts?
>>
>> I think this whole thing could be simplified by overlaying a uint32 over
>> "data" and just accessing the array fields normally.  See attached patch.
> 
> Indeed, that's a nice simplification.
> 
> -                       data += sizeof(uint32) * 2;
> 
> Is it safe? I mean could XLH_SPLIT_META_UPDATE_MASKS and XLH_SPLIT_META_UPDATE_SPLITPOINT
> be set simultaneously?

Yes, that's what was probably intended.  But apparently not exercised in 
the tests.

So maybe more like this patch.

Вложения

Re: Remove useless casting to the same type

От
Bertrand Drouvot
Дата:
Hi,

On Fri, Nov 28, 2025 at 02:20:25PM +0100, Peter Eisentraut wrote:
> On 28.11.25 10:06, Bertrand Drouvot wrote:
> > On Fri, Nov 28, 2025 at 09:11:16AM +0100, Peter Eisentraut wrote:
> > > I think this whole thing could be simplified by overlaying a uint32 over
> > > "data" and just accessing the array fields normally.  See attached patch.
> > 
> > Indeed, that's a nice simplification.
> > 
> > -                       data += sizeof(uint32) * 2;
> > 
> > Is it safe? I mean could XLH_SPLIT_META_UPDATE_MASKS and XLH_SPLIT_META_UPDATE_SPLITPOINT
> > be set simultaneously?
> 
> Yes, that's what was probably intended.  But apparently not exercised in the
> tests.
> 
> So maybe more like this patch.

+   uint32          lowmask = uidata[uidatacount++];
+   uint32          highmask = uidata[uidatacount++];

good idea! That way that's easier to add more branches/flags later if we need
to.

Also, I think that's safe thanks to XLogRecGetBlockData() returning a
MAXALIGNed buffer. Not sure if it's worth to add a comment. I think that a
comment was not needed with the original code as it was using memcpy() instead.

Except for the nit comment remark above, LGTM.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Remove useless casting to the same type

От
Jacob Champion
Дата:
On Mon, Nov 24, 2025 at 9:46 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
> I think that even with the cast in place, it's good to check the type of data.
> Not for the line that follows (i.e: "data += sizeof(uint32) * 2") but to check
> that the cast makes sense and does not hide "wrong" pointer manipulation.

From a specifier standpoint ("accidentally got rid of const"), yes.
From a pointer math standpoint, `(char *) ptr + offset` has a specific
meaning to me, and I was arguing that hiding that meaning isn't
necessarily useful in all cases.

But! Peter's followup makes this moot. Which is good, and it might
serve as its own blueprint for when we next see a situation like this.

> Yeah, "*(s->str + pos)" is already used in SetAt() and also in IsVowel(). Instead
> of changing those 3, I'd prefer to keep the current change and keep the patch
> focus on its intend. We could change those in a dedicated patch afterward if we
> feel the need.

Fine by me. Thanks for this cleanup work!

--Jacob



Re: Remove useless casting to the same type

От
Jacob Champion
Дата:
On Fri, Nov 28, 2025 at 12:41 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> I test this once in a while and fix the issues that I find.  But it's
> very picky and you will find difficult to fix problems like the fact
> that the signedness of enums is implementation-defined, and so the only
> portable fix there would be to add more casts.

Once you've gotten into that situation, don't you have potential
sign-extension issues during int promotion, which require casts
anyway? It might be nice to fix those up regardless (he said, from his
armchair). But that doesn't seem as pressing as the other potential
problems.

> I think it could be useful to tighten the source code with respect to
> implicit integer conversions, using warnings such as -Wsign-conversion
> and -Wconversion as well as -Wformat-signedness.  There are surely
> hidden overflow or truncation issues similar to CVE-2025-12818 hidden
> somewhere.  But explicit casts defeat those warnings, so removing
> unnecessary casts is a good step on the way there.

+1. (And your v2 looks good to me.)

--Jacob



Re: Remove useless casting to the same type

От
Peter Eisentraut
Дата:
On 24.11.25 11:26, Bertrand Drouvot wrote:
> Hi hackers,
> 
> Attached is a patch to $SUBJECT.
> 
> This is the same kind of idea as 7f798aca1d5 and ef8fe693606, as their presence
> could cause risks of hiding actual type mismatches in the future or silently
> discarding qualifiers. I think that it also improves readability.
> 
> Those have been found with a coccinelle script as:
> 
> "
> @@
> type T;
> T E;
> @@
> 
> - (T)
>    E
> "
> 
> which removes casts when it's casting to the same type.
> 
> Note that it generated more that what is in the attached. I chose not to remove
> some of them (mainly arithmetic ones) to keep the patch focused on what matters
> here.

Committed.  I made some cosmetic adjustments, to remove line breaks that 
are no longer necessary with the shortened lines, and in gbt_num_union() 
I kept &out[0] (instead of just out) for consistency with the following 
line.




Re: Remove useless casting to the same type

От
Bertrand Drouvot
Дата:
Hi,

On Mon, Dec 01, 2025 at 09:50:43AM -0800, Jacob Champion wrote:
> Fine by me.

Thanks for the review!

> Thanks for this cleanup work!

That's fun and useful to do!

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com