Re: Avoid overflow (src/backend/utils/adt/formatting.c)
| От | Bryan Green | 
|---|---|
| Тема | Re: Avoid overflow (src/backend/utils/adt/formatting.c) | 
| Дата | |
| Msg-id | 808de7da-e9cb-4f79-a8f6-3f0cc3d031a9@gmail.com обсуждение исходный текст  | 
		
| Ответ на | Re: Avoid overflow (src/backend/utils/adt/formatting.c) (Bryan Green <dbryan.green@gmail.com>) | 
| Ответы | 
                	
            		Re: Avoid overflow (src/backend/utils/adt/formatting.c)
            		
            		 | 
		
| Список | pgsql-hackers | 
On 11/2/2025 10:09 AM, Bryan Green wrote:
> On 11/2/2025 8:59 AM, Ranier Vilela wrote:
>> Hi.
>>
>> Per Coverity.
>>
>> Coverity raised the follow report:
>>
>> CID 1642824: (#1 of 1): Overflowed constant (INTEGER_OVERFLOW)
>> 37. overflow_const: Expression pattern_len - 1ULL, where pattern_len is 
>> known to be equal to 0, underflows the type of pattern_len - 1ULL, which 
>> is type unsigned long long.
>>
>> This is because the function *pg_mbstrlen* can return zero.
>> And the comment is clear, *just in case there are MB chars*.
>>
>> patch attached.
>>
>> best regards,
>> Ranier Vilela
> 
> Thanks for finding and patching this.  I think you're absolutely
> right that there's a latent bug here, and your fix is appropriate.
> However, I wanted to share some thoughts on why this has probably
> never been reported in the wild, and why we should apply the patch
> anyway.
> 
> The crash requires a specific and rather unlikely combination of
> circumstances:
> 
> 1. You need a locale where lconv->thousands_sep is present but
>    empty.  Most locales either provide a real separator (",", ".",
>    or " ") or return NULL (in which case NUM_prepare_locale's
>    fallback logic provides a default).  The comments mention
>    "broken glibc pt_BR" as an example, but even that's been fixed
>    in most glibc versions for years now.
> 
> 2. You need a format string containing 'G', which requests
>    insertion of the locale's thousands separator.
> 
> Here's the thing: if your locale has no thousands separator, why
> would you write a format string asking for one?  Consider:
> 
>     -- In a locale with thousands_sep = ","
>     SELECT to_char(1234567.89, '9G999G999.99');
>     -- Result: "1,234,567.89"
> 
>     -- In a locale with empty thousands_sep
>     SELECT to_char(1234567.89, '9G999G999.99');
>     -- Result: "1234567.89"  (the G's insert nothing)
> 
> If you're working in a locale that doesn't have a thousands
> separator, you're most likely culturally aware of that fact,
> and you simply wouldn't include 'G' in your format strings.
> It would be like writing code to format something your locale
> doesn't have a concept of.
> 
> So I think the bug has survived this long because:
>   a) Very few locales have truly empty thousands_sep strings
>   b) Users of those locales naturally avoid the 'G' format code
>   c) Even accidental use would require the right fill-mode
>      settings
> 
> That said, in my opinion we should definitely apply your patch,
> for several reasons:
> 
> First, it's undefined behavior.  Backing up the pointer via
> "ptr += -1" when ptr is at the start of allocated memory is a
> serious bug, regardless of how unlikely the trigger condition is.
> We don't want that lurking in the codebase.
> 
> Second, even if this hasn't been triggered in 20+ years, it's
> exactly the kind of thing that AFL or similar tools might find.
> Better to fix it before we get a CVE filing.
> 
> Third, it's a trivial fix.  The performance impact is nil, and it
> makes the code more obviously correct.  There's really no
> downside.
> 
> I'd suggest one minor adjustment to your patch: add a comment
> explaining why we're checking for empty pattern, since it's
> non-obvious:
> 
>     /*
>      * Guard against empty separator (could happen with some
>      * locales)
>      */
>     if (pattern_len > 0)
>     {
>         ...
>     }
> 
> That'll help future maintainers understand this isn't just
> defensive programming paranoia.
> 
> Good catch on finding this. We'll have to see if others agree...
> 
> Bryan Green
I did a bit more analysis to double check myself and realized I missed
an important check:
NUM_prepare_locale() has a safety check:
    if (lconv->thousands_sep && *lconv->thousands_sep)
        Np->L_thousands_sep = lconv->thousands_sep;
    else if (strcmp(Np->decimal, ",") != 0)
        Np->L_thousands_sep = ",";
    else
        Np->L_thousands_sep = ".";
The "*lconv->thousands_sep" test specifically prevents empty
strings from being used, so Np->L_thousands_sep can never be empty in
production.
The patch is still trivial and it probably should still be applied.
Bryan Green
		
	В списке pgsql-hackers по дате отправления: