Re: [PATCH] json_lex_string: don't overread on bad UTF8
From: Michael Paquier
Subject: Re: [PATCH] json_lex_string: don't overread on bad UTF8
Date:
Msg-id: ZjL5Ed6LDZGDGILj@paquier.xyz
In reply to: Re: [PATCH] json_lex_string: don't overread on bad UTF8 (Jacob Champion <jacob.champion@enterprisedb.com>)
Replies: Re: [PATCH] json_lex_string: don't overread on bad UTF8
List: pgsql-hackers
On Wed, May 01, 2024 at 04:22:24PM -0700, Jacob Champion wrote:
> On Tue, Apr 30, 2024 at 11:09 PM Michael Paquier <michael@paquier.xyz> wrote:
>> Not sure to like much the fact that this advances token_terminator
>> first.  Wouldn't it be better to calculate pg_encoding_mblen() first,
>> then save token_terminator?  I feel a bit uneasy about saving a value
>> in token_terminator past the end of the string.  That's a nit in this
>> context, still..
>
> v2 tries it that way; see what you think. Is the concern that someone
> might add code later that escapes that macro early?

Yeah, I am not sure that this is something that would really happen, but it looks like good practice to follow anyway, so that the lexer state stays clean at all times.

>> Ah, that makes sense.  That looks OK here.  A comment around the test
>> would be adapted to document that, I guess.
>
> Done.

That seems OK at a quick glance.

I don't have much room to do something about this patch this week because of Golden Week and the buildfarm, but I should be able to get to it next week once the next round of minor releases is tagged.

About the fact that we may end up printing unfinished UTF-8 sequences, I'd be curious to hear your thoughts.  Now, the information provided about the partial byte sequences can also be useful for debugging, on top of having the error code, no?
--
Michael
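[Editor's note: for readers following the ordering question above, here is a minimal standalone sketch of the idea under discussion: compute the multibyte length first, clamp it to the end of the input, and only then store the token terminator, so it can never point past the buffer. The mblen_stub() helper, the lex_state struct, and fail_at_char_end() are placeholders standing in for pg_encoding_mblen(), JsonLexContext, and the error-path macro; they are assumptions for illustration, not the actual jsonapi.c code.]

#include <stddef.h>
#include <stdio.h>

/*
 * Placeholder for pg_encoding_mblen(): returns the byte length of the
 * multibyte character starting at *s, judged from its first byte only
 * (UTF-8 style).  Illustration stub, not PostgreSQL code.
 */
static int
mblen_stub(const char *s)
{
	unsigned char c = (unsigned char) *s;

	if (c < 0x80)
		return 1;
	else if ((c & 0xE0) == 0xC0)
		return 2;
	else if ((c & 0xF0) == 0xE0)
		return 3;
	else
		return 4;
}

/* Hypothetical stand-in for the lexer state: only the fields needed here. */
struct lex_state
{
	const char *input;
	size_t		input_length;
	const char *token_terminator;
};

/*
 * On an invalid byte sequence, record where the offending token ends for
 * the error report.  The length is computed and clamped *before*
 * token_terminator is assigned, so the field never holds a pointer past
 * the end of the input, even when the claimed multibyte character is
 * truncated at the end of the buffer.
 */
static void
fail_at_char_end(struct lex_state *lex, const char *s)
{
	const char *end = lex->input + lex->input_length;
	const char *term = s + mblen_stub(s);

	lex->token_terminator = (term <= end) ? term : end;
}

int
main(void)
{
	/* 0xC3 claims to start a 2-byte sequence, but the input ends there. */
	static const char bad[] = {'a', 'b', (char) 0xC3};
	struct lex_state lex = {bad, sizeof(bad), NULL};

	fail_at_char_end(&lex, bad + 2);
	printf("terminator offset: %zu (input length: %zu)\n",
		   (size_t) (lex.token_terminator - lex.input), lex.input_length);
	return 0;
}

[End of editor's note.]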