Re: jsonb, unicode escapes and escaped backslashes
From | Andrew Dunstan |
---|---|
Subject | Re: jsonb, unicode escapes and escaped backslashes |
Date | |
Msg-id | 54C919D2.6040700@dunslane.net |
In response to | Re: jsonb, unicode escapes and escaped backslashes (Noah Misch <noah@leadboat.com>) |
Responses | Re: jsonb, unicode escapes and escaped backslashes |
List | pgsql-hackers |

On 01/28/2015 12:50 AM, Noah Misch wrote:
> On Tue, Jan 27, 2015 at 03:56:22PM -0500, Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> On 01/27/2015 02:28 PM, Tom Lane wrote:
>>>> Well, we can either fix it now or suffer with a broken representation
>>>> forever. I'm not wedded to the exact solution I described, but I think
>>>> we'll regret it if we don't change the representation.
>> So at this point I propose that we reject \u0000 when de-escaping JSON.
> I would have agreed on 2014-12-09, and this release is the last chance to make
> such a change. It is a bold wager that could pay off, but -1 from me anyway.
> I can already envision the blog post from the DBA staying on 9.4.0 because
> 9.4.1 pulled his ability to store U+0000 in jsonb. jsonb was *the* top-billed
> 9.4 feature, and this thread started with Andrew conveying a field report of a
> scenario more obscure than storing U+0000. Therefore, we have to assume many
> users will notice the change. This move would also add to the growing
> evidence that our .0 releases are really beta(N+1) releases in disguise.
>
>> Anybody who's seriously unhappy with that can propose a patch to fix it
>> properly in 9.5 or later.
> Someone can still do that by introducing a V2 of the jsonb binary format and
> preserving the ability to read both formats. (Too bad Andres's proposal to
> include a format version didn't inform the final format, but we can wing it.)
> I agree that storing U+0000 as 0x00 is the best end state.
>
>

We need to make up our minds about this pretty quickly. The more radical
move is likely to involve quite a bit of work, ISTM.

It's not clear to me how we should represent a Unicode null. That is, given a
json of '["foo\u0000bar"]', I get that we'd store the element as
'foo\x00bar', but what is the result of

    (jsonb '["foo\u0000bar"]')->>0

It's defined to be text, so we can't just shove a binary null in the
middle of it. Do we throw an error?

And I still want to hear more voices on the whole direction we want to
take this.

cheers

andrew
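
To make the question above concrete, here is a minimal Python sketch of the problem. It is not PostgreSQL code: `to_c_text` is a hypothetical stand-in for any conversion to a NUL-terminated text value, and raising an error is only one of the options floated in the message, not a decided behaviour.

```python
import json


def to_c_text(value: str) -> str:
    """Hypothetical stand-in for producing a NUL-terminated text value.

    A C-style string ends at the first 0x00 byte, so a value containing
    U+0000 cannot be represented faithfully. Here we choose to raise,
    which is an assumption made for illustration only.
    """
    if "\x00" in value:
        raise ValueError("U+0000 cannot be represented in a text value")
    return value


# De-escaping the JSON yields a string with an embedded null character.
element = json.loads('["foo\\u0000bar"]')[0]

try:
    result = to_c_text(element)
except ValueError as exc:
    result = None
    error = str(exc)
```

The decoded element is seven characters long with the null in the middle, so the only choices at the text boundary are to reject it, truncate it, or substitute something else for the null.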