Re: WIP Incremental JSON Parser
From | Andrew Dunstan |
---|---|
Subject | Re: WIP Incremental JSON Parser |
Date | |
Msg-id | 98787c69-7581-456a-b726-fc93416f0b15@dunslane.net |
In reply to | Re: WIP Incremental JSON Parser (Andrew Dunstan <andrew@dunslane.net>) |
Responses |
Re: WIP Incremental JSON Parser
(Michael Paquier <michael@paquier.xyz>)
|
List | pgsql-hackers |
On 2024-04-09 Tu 15:42, Andrew Dunstan wrote:
>
> On 2024-04-09 Tu 07:54, Andrew Dunstan wrote:
>>
>> On 2024-04-09 Tu 01:23, Michael Paquier wrote:
>>> On Tue, Apr 09, 2024 at 09:48:18AM +0900, Michael Paquier wrote:
>>>> There is no direct check on test_json_parser_perf.c, either, only a
>>>> custom rule in the Makefile without specifying something for meson.
>>>> So it looks like you could do short execution check in a TAP test, at
>>>> least.
>>
>> Not adding a test for that was deliberate - any sane test takes a
>> while, and I didn't want to spend that much time on it every time
>> someone runs "make check-world" or equivalent. However, adding a test
>> to run it with a trivial number of iterations seems reasonable, so
>> I'll add that. I'll also add a meson target for the binary.
>>
>>> While reading the code, I've noticed a few things, as well:
>>>
>>> +    /* max delicious line length is less than this */
>>> +    char        buff[6001];
>>>
>>> Delicious applies to the contents, nothing to do with this code even
>>> if I'd like to think that these lines of code are edible and good.
>>> Using a hardcoded limit for some JSON input sounds like a bad idea to
>>> me even if this is a test module. Comment that applies to both the
>>> perf and the incremental tools. You could use a #define'd buffer size
>>> for readability rather than assuming this value in many places.
>>
>> The comment is a remnant of when I hadn't yet added support for
>> incomplete tokens, and so I had to parse the input line by line. I
>> agree it can go, and we can use a manifest constant for the buffer size.
>>
>>> +++ b/src/test/modules/test_json_parser/test_json_parser_incremental.c
>>> + * This progam tests incremental parsing of json. The input is fed into
>>> + * full range of incement handling, especially in the lexer, is exercised.
>>> +++ b/src/test/modules/test_json_parser/test_json_parser_perf.c
>>> + *    Performancet est program for both flavors of the JSON parser
>>> + * This progam tests either the standard (recursive descent) JSON parser
>>> +++ b/src/test/modules/test_json_parser/README
>>> +   reads in a file and pases it in very small chunks (60 bytes at a time) to
>>>
>>> Collection of typos across various files.
>>
>> Will fix. (The older I get the more typos I seem to make and the
>> harder it is to notice them. It's very annoying.)
>>
>>> +    appendStringInfoString(&json, "1+23 trailing junk");
>>> What's the purpose here? Perhaps the intention should be documented
>>> in a comment?
>>
>> The purpose is to ensure that if there is not a trailing '\0' on the
>> json chunk the parser will still do the right thing. I'll add a
>> comment to that effect.
>>
>>> At the end, having a way to generate JSON blobs randomly to test this
>>> stuff would be more appealing than what you have currently, with
>>> perhaps a few factors like:
>>> - Array and simple object density.
>>> - Max Level of nesting.
>>> - Limitation to ASCII characters that can be escaped.
>>> - Perhaps more things I cannot think about?
>>
>> No, I disagree. Maybe we need those things as well, but we do need a
>> static test where we can test the output against known results. I
>> have no objection to changing the input and output files.
>>
>> It's worth noting that t/002_inline.pl does generate some input and
>> test e.g., the maximum nesting levels among other errors. Perhaps you
>> missed that. If you think we need more tests there, adding them would
>> be extremely simple.
>>
>>> So the current state of things is kind of disappointing, and the size
>>> of the data set added to the tree is not that portable either if you
>>> want to test various scenarios depending on the data set. It seems to
>>> me that this has been committed too hastily and that this is not ready
>>> for integration, even if that's just a test module. Tom also has
>>> shared some concerns upthread, as far as I can see.
>>
>> I have posted a patch already that addresses the issue Tom raised.
>
>
> Here's a consolidated set of cleanup patches, including the memory
> leak patch and Jacob's shrink-tiny patch.


Here's v2 of the cleanup patch 4, that fixes some more typos kindly
pointed out to me by Alexander Lakhin.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Attachments