Re: WIP Incremental JSON Parser
| From | Andrew Dunstan |
|---|---|
| Subject | Re: WIP Incremental JSON Parser |
| Date | |
| Msg-id | 1a4d8a4c-ca76-4d58-8d8e-608ad8c6e263@dunslane.net |
| In reply to | Re: WIP Incremental JSON Parser (Andrew Dunstan <andrew@dunslane.net>) |
| Responses | Re: WIP Incremental JSON Parser |
| List | pgsql-hackers |
On 2024-04-09 Tu 07:54, Andrew Dunstan wrote:
>
> On 2024-04-09 Tu 01:23, Michael Paquier wrote:
>> On Tue, Apr 09, 2024 at 09:48:18AM +0900, Michael Paquier wrote:
>>> There is no direct check on test_json_parser_perf.c, either, only a
>>> custom rule in the Makefile without specifying something for meson.
>>> So it looks like you could do short execution check in a TAP test, at
>>> least.
>
> Not adding a test for that was deliberate - any sane test takes a
> while, and I didn't want to spend that much time on it every time
> someone runs "make check-world" or equivalent. However, adding a test
> to run it with a trivial number of iterations seems reasonable, so
> I'll add that. I'll also add a meson target for the binary.
>
>> While reading the code, I've noticed a few things, as well:
>>
>> +    /* max delicious line length is less than this */
>> +    char        buff[6001];
>>
>> Delicious applies to the contents, nothing to do with this code even
>> if I'd like to think that these lines of code are edible and good.
>> Using a hardcoded limit for some JSON input sounds like a bad idea to
>> me even if this is a test module. Comment that applies to both the
>> perf and the incremental tools. You could use a #define'd buffer size
>> for readability rather than assuming this value in many places.
>
> The comment is a remnant of when I hadn't yet added support for
> incomplete tokens, and so I had to parse the input line by line. I
> agree it can go, and we can use a manifest constant for the buffer size.
>
>> +++ b/src/test/modules/test_json_parser/test_json_parser_incremental.c
>> + * This progam tests incremental parsing of json. The input is fed into
>> + * full range of incement handling, especially in the lexer, is exercised.
>> +++ b/src/test/modules/test_json_parser/test_json_parser_perf.c
>> + * Performancet est program for both flavors of the JSON parser
>> + * This progam tests either the standard (recursive descent) JSON parser
>> +++ b/src/test/modules/test_json_parser/README
>> + reads in a file and pases it in very small chunks (60 bytes at a time) to
>>
>> Collection of typos across various files.
>
> Will fix. (The older I get the more typos I seem to make and the
> harder it is to notice them. It's very annoying.)
>
>> +    appendStringInfoString(&json, "1+23 trailing junk");
>> What's the purpose here? Perhaps the intention should be documented
>> in a comment?
>
> The purpose is to ensure that if there is not a trailing '\0' on the
> json chunk the parser will still do the right thing. I'll add a
> comment to that effect.
>
>> At the end, having a way to generate JSON blobs randomly to test this
>> stuff would be more appealing than what you have currently, with
>> perhaps a few factors like:
>> - Array and simple object density.
>> - Max level of nesting.
>> - Limitation to ASCII characters that can be escaped.
>> - Perhaps more things I cannot think about?
>
> No, I disagree. Maybe we need those things as well, but we do need a
> static test where we can test the output against known results. I have
> no objection to changing the input and output files.
>
> It's worth noting that t/002_inline.pl does generate some input and
> test e.g. the maximum nesting levels, among other errors. Perhaps you
> missed that. If you think we need more tests there, adding them would
> be extremely simple.
>
>> So the current state of things is kind of disappointing, and the size
>> of the data set added to the tree is not that portable either if you
>> want to test various scenarios depending on the data set. It seems to
>> me that this has been committed too hastily and that this is not ready
>> for integration, even if that's just a test module.
>> Tom also has shared some concerns upthread, as far as I can see.

I have posted a patch already that addresses the issue Tom raised.

Here's a consolidated set of cleanup patches, including the memory leak
patch and Jacob's shrink-tiny patch.

I think the biggest open issue is whether or not we remove the
performance test program.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Attachments