Re: pg should ignore u+200b zero width space
From | Heikki Linnakangas |
---|---|
Subject | Re: pg should ignore u+200b zero width space |
Date | |
Msg-id | c6037756-6446-79be-8c3d-cb7d55c30cb0@iki.fi |
In response to | Re: pg should ignore u+200b zero width space (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: pg should ignore u+200b zero width space |
List | pgsql-bugs |
On 03/11/2020 16:52, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> On 03/11/2020 15:41, James Cloos wrote:
>>> pg should treat a no break space after whitespace as just more
>>> whitespace.
>
>> Hmm. I'm not sure if changing the behavior is a good idea, but a hint in
>> the error message would be nice. Something like:
>
> The difficulty with doing anything in this space --- whether it be
> ignoring, throwing an error, or whatever --- is that it makes the
> lexer's behavior encoding-sensitive and potentially locale-sensitive.
> That's problematic for all sorts of reasons. One of the worst is
> that frontend programs such as psql and ecpg also have SQL lexers,
> and there'd be no way to keep their behavior in precise sync with
> the backend's. (They might not even be running in the same encoding,
> never mind locale.) It might even be possible to build security
> holes around that; recall the fun we've had with trying to lock
> down quoting rules in encodings where backslash can be part of a
> multibyte character :-(.
>
> Perhaps it'd be all right to confine the change in behavior to
> just modifying the error text in cases where we were going to
> throw an error anyway. But I think this is much harder than
> it sounds to do in a valid, safe way.

Yeah, my thinking was to just add a hint when you're throwing a syntax
error anyway. Something simple like check if client_encoding is utf8 and
there is a U+200b in the query string, and add the hint if so. It
doesn't need to catch all cases, and rare false positives are OK too.

- Heikki
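
To make the proposed check concrete, here is a minimal standalone sketch in C. It is not PostgreSQL code: the function names `ascii_equal_ci` and `zwsp_syntax_error_hint`, the hint wording, and the choice to pass the encoding name in as a plain string are all assumptions made for illustration. It only models the decision described above: when a syntax error is about to be reported anyway, the client encoding is UTF8, and the UTF-8 byte sequence for U+200B (`E2 80 8B`) appears somewhere in the query string, return a hint.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* UTF-8 encoding of U+200B ZERO WIDTH SPACE. */
static const char ZWSP_UTF8[] = "\xE2\x80\x8B";

/* Hypothetical helper: case-insensitive equality for ASCII strings. */
int
ascii_equal_ci(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0')
    {
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/*
 * Hypothetical function, meant to be called only on a path that is
 * already going to report a syntax error.  Returns a hint string if
 * the encoding name is "UTF8" and the query contains U+200B anywhere,
 * otherwise NULL.  The hint text is illustrative, not an actual
 * PostgreSQL message.
 */
const char *
zwsp_syntax_error_hint(const char *client_encoding, const char *query)
{
    if (client_encoding == NULL || query == NULL)
        return NULL;
    if (!ascii_equal_ci(client_encoding, "UTF8"))
        return NULL;
    /* Plain byte search: no lexing, so the position is not checked. */
    if (strstr(query, ZWSP_UTF8) == NULL)
        return NULL;
    return "The query contains a zero-width space (U+200B).";
}
```

Because this is a plain byte search over the whole query, it also fires when the U+200B sits inside a string literal and the syntax error is unrelated. That is the kind of rare false positive the proposal accepts, and since the check only affects the error text, it leaves the lexer's behavior unchanged.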