Обсуждение: test_json_parser/002_inline is kind of slow
I've noticed that when I run 'meson test', the test mentioned in the subject line is usually the last one to finish. The test runs for 22 seconds on my machine, which is fairly high considering that 'meson test' in total (and with MESON_TESTTHREADS=8) runs for 3 minutes and 13 seconds. I think the reason for this relatively high runtime is that it fires off a separate shell command for each separate test, and there are 3400 of those. I'm not exactly sure what change to propose, but I wonder if we could come up with a way of making this a bit more efficient? Basically anything that would allow us to do multiple tests without having to fork a new process for every single one seems like it would probably save quite a bit. -- Robert Haas EDB: http://www.enterprisedb.com
On Fri, Sep 26, 2025 at 11:11:52AM -0400, Robert Haas wrote: > I've noticed that when I run 'meson test', the test mentioned in the > subject line is usually the last one to finish. The test runs for 22 > seconds on my machine, which is fairly high considering that 'meson > test' in total (and with MESON_TESTTHREADS=8) runs for 3 minutes and > 13 seconds. I think the reason for this relatively high runtime is > that it fires off a separate shell command for each separate test, and > there are 3400 of those. I'm not exactly sure what change to propose, > but I wonder if we could come up with a way of making this a bit more > efficient? Basically anything that would allow us to do multiple tests > without having to fork a new process for every single one seems like > it would probably save quite a bit. For some slow tests, we just have meson start it earlier with something like this: + 'test_kwargs': {'priority': 40}, At least, that's enough to prevent it from completing last on my machine. But actually improving the efficiency of the test seems like a better long-term fix. -- nathan
On Fri, Sep 26, 2025 at 8:12 AM Robert Haas <robertmhaas@gmail.com> wrote: > Basically anything that would allow us to do multiple tests > without having to fork a new process for every single one seems like > it would probably save quite a bit. Agreed, it's pretty inefficient. (For comparison, it runs in about two seconds on my machine, so I hadn't noticed how bad it had gotten. Thanks for pointing it out.) The hammer I had available at the time was Perl-plus-compiled-executable. But I've just recently added a C-based TAP suite in 4e1e417330d4, so I could try to port this suite over in that direction. (Or I could find a way to port it to pytest... but C seems like a better fit for testing the parser, in part because it'll be more easily fuzzable that way.) --Jacob
Hi, On 2025-09-26 10:25:19 -0500, Nathan Bossart wrote: > On Fri, Sep 26, 2025 at 11:11:52AM -0400, Robert Haas wrote: > > I've noticed that when I run 'meson test', the test mentioned in the > > subject line is usually the last one to finish. The test runs for 22 > > seconds on my machine, which is fairly high considering that 'meson > > test' in total (and with MESON_TESTTHREADS=8) runs for 3 minutes and > > 13 seconds. I think the reason for this relatively high runtime is > > that it fires off a separate shell command for each separate test, and > > there are 3400 of those. I'm not exactly sure what change to propose, > > but I wonder if we could come up with a way of making this a bit more > > efficient? Basically anything that would allow us to do multiple tests > > without having to fork a new process for every single one seems like > > it would probably save quite a bit. > > For some slow tests, we just have meson start it earlier with something > like this: > > + 'test_kwargs': {'priority': 40}, > > At least, that's enough to prevent it from completing last on my machine. > But actually improving the efficiency of the test seems like a better > long-term fix. It never finishes last here (linux, 20 cores) - and the overall runtime isn't that high, compared to the ones that finish last here [1] or are the slowest ones [2]. I'm not saying we shouldn't work on the efficiency of the test - forking thousands of times is pretty obviously absurdly inefficient - just that re-prioritizing doesn't seem that promising. Greetings, Andres Freund [1] 343/352 postgresql:test_json_parser / test_json_parser/002_inline OK 9.38s 3400 subtests passed 344/352 postgresql:ssl / ssl/001_ssltests OK 7.96s 247 subtests passed 345/352 postgresql:initdb / initdb/001_initdb OK 34.42s 57 subtests passed 346/352 postgresql:test_aio / test_aio/001_aio OK 11.71s 583 subtests passed 347/352 postgresql:pg_upgrade / pg_upgrade/006_transfer_modes OK 46.64s 60 subtests passed 348/352 postgresql:regress / regress/regress OK 57.33s 229 subtests passed 349/352 postgresql:pg_dump / pg_dump/002_pg_dump OK 45.00s 14236 subtests passed 350/352 postgresql:isolation / isolation/isolation OK 64.19s 122 subtests passed 351/352 postgresql:recovery / recovery/027_stream_regress OK 79.96s 11 subtests passed 352/352 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade OK 82.04s 19 subtests passed [2] 262/352 postgresql:pg_amcheck / pg_amcheck/003_check OK 19.46s 75 subtests passed 326/352 postgresql:pgbench / pgbench/001_pgbench_with_server OK 19.54s 467 subtests passed 297/352 postgresql:pg_basebackup / pg_basebackup/010_pg_basebackup OK 24.56s 200 subtests passed 345/352 postgresql:initdb / initdb/001_initdb OK 34.42s 57 subtests passed 349/352 postgresql:pg_dump / pg_dump/002_pg_dump OK 45.00s 14236 subtests passed 347/352 postgresql:pg_upgrade / pg_upgrade/006_transfer_modes OK 46.64s 60 subtests passed 348/352 postgresql:regress / regress/regress OK 57.33s 229 subtests passed 350/352 postgresql:isolation / isolation/isolation OK 64.19s 122 subtests passed 351/352 postgresql:recovery / recovery/027_stream_regress OK 79.96s 11 subtests passed 352/352 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade OK 82.04s 19 subtests passed
Hi, On 2025-09-26 08:33:08 -0700, Jacob Champion wrote: > On Fri, Sep 26, 2025 at 8:12 AM Robert Haas <robertmhaas@gmail.com> wrote: > > Basically anything that would allow us to do multiple tests > > without having to fork a new process for every single one seems like > > it would probably save quite a bit. > > Agreed, it's pretty inefficient. (For comparison, it runs in about two > seconds on my machine, so I hadn't noticed how bad it had gotten. > Thanks for pointing it out.) > > The hammer I had available at the time was > Perl-plus-compiled-executable. You can just support running multiple tests with one run of the executable, e.g. by splitting the input / output on null bytes. Greetings, Andres Freund
On Fri, Sep 26, 2025 at 11:33 AM Jacob Champion <jacob.champion@enterprisedb.com> wrote: > Agreed, it's pretty inefficient. (For comparison, it runs in about two > seconds on my machine, so I hadn't noticed how bad it had gotten. > Thanks for pointing it out.) Interesting. Are you on Linux? I'm testing directly on macOS, and I wonder now whether macOS is comparatively slower at starting new processes... -- Robert Haas EDB: http://www.enterprisedb.com
On Fri, Sep 26, 2025 at 8:38 AM Andres Freund <andres@anarazel.de> wrote: > You can just support running multiple tests with one run of the executable, > e.g. by splitting the input / output on null bytes. Yeah, but that doubles down on the bad unit test architecture... and I don't think anyone would want to touch that code after I wrote it. But it could be a quicker fix, especially if people are nervous about moving to C+TAP. --Jacob
On Fri, Sep 26, 2025 at 8:48 AM Robert Haas <robertmhaas@gmail.com> wrote: > Interesting. Are you on Linux? Yeah, this is aarch64 Ubuntu on an M3 Pro. --Jacob
On Fri, Sep 26, 2025 at 11:50 AM Jacob Champion <jacob.champion@enterprisedb.com> wrote: > On Fri, Sep 26, 2025 at 8:38 AM Andres Freund <andres@anarazel.de> wrote: > > You can just support running multiple tests with one run of the executable, > > e.g. by splitting the input / output on null bytes. > > Yeah, but that doubles down on the bad unit test architecture... and I > don't think anyone would want to touch that code after I wrote it. But > it could be a quicker fix, especially if people are nervous about > moving to C+TAP. I'm not bothered by C+TAP. The thing that I'm not crazy about with Python+TAP is that I fear that eventually everyone will have to know both enough Perl and enough Python to be able to write and maintain TAP tests, and I'm not that keen about that. I realize there may be no real alternative given the fading of Perl, but the requirements for people to hack on PostgreSQL are already quite high and I'd like to do whatever we reasonably can to avoid further elevating them. Replacing Perl with Python would probably lower the requirement overall even though it would be a painful adjustment for me personally, but the long-to-indefinite period of coexistence is a bummer. But none of that argument applies to having C produce TAP output. Everyone who wants to hack on PostgreSQL needs to know C anyway. -- Robert Haas EDB: http://www.enterprisedb.com
Hi, On 2025-09-26 08:49:51 -0700, Jacob Champion wrote: > On Fri, Sep 26, 2025 at 8:38 AM Andres Freund <andres@anarazel.de> wrote: > > You can just support running multiple tests with one run of the executable, > > e.g. by splitting the input / output on null bytes. > > Yeah, but that doubles down on the bad unit test architecture... and I > don't think anyone would want to touch that code after I wrote it. I don't really understand - what I'm proposing should just be a few lines? Splitting on some separator is hardly hard to understand code. > But it could be a quicker fix, especially if people are nervous about moving > to C+TAP. I don't have a problem with that in general, although I suspect we'd want some fe_utils (or such) helper for handling the tap bits centrally. I'm not convinced that it's the right thing to pursue here, and I'm wholly unconvinced by the reasoning. ISTM that the test vectors and the patttern matching on them in 002_inline.pl makes much more sense in a scripting language than in a C file. Greetings, Andres Freund
Robert Haas <robertmhaas@gmail.com> writes: > On Fri, Sep 26, 2025 at 11:33 AM Jacob Champion > <jacob.champion@enterprisedb.com> wrote: >> Agreed, it's pretty inefficient. (For comparison, it runs in about two >> seconds on my machine, so I hadn't noticed how bad it had gotten. >> Thanks for pointing it out.) > Interesting. Are you on Linux? I'm testing directly on macOS, and I > wonder now whether macOS is comparatively slower at starting new > processes... Something else going on, perhaps? With PROVE_FLAGS=--timer, I see this on my RHEL8 workstation: [11:55:32] t/001_test_json_parser_incremental.pl .. ok 475 ms ( 0.02 usr 0.00 sys + 0.26 cusr 0.25 csys = 0.53 CPU) [11:55:32] t/002_inline.pl ........................ ok 2883 ms ( 0.11 usr 0.01 sys + 1.40 cusr 1.70 csys = 3.22 CPU) [11:55:35] t/003_test_semantic.pl ................. ok 70 ms ( 0.00 usr 0.00 sys + 0.06 cusr 0.02 csys = 0.08 CPU) [11:55:35] t/004_test_parser_perf.pl .............. ok 57 ms ( 0.00 usr 0.00 sys + 0.05 cusr 0.00 csys = 0.05 CPU) [11:55:35] All tests successful. Files=4, Tests=3930, 3 wallclock secs ( 0.14 usr 0.01 sys + 1.77 cusr 1.97 csys = 3.89 CPU) Result: PASS and this on my MacBook Pro M4 (Sequoia 15.6.1): [11:56:06] t/001_test_json_parser_incremental.pl .. ok 571 ms ( 0.01 usr 0.00 sys + 0.24 cusr 0.28 csys = 0.53 CPU) [11:56:06] t/002_inline.pl ........................ ok 3544 ms ( 0.06 usr 0.02 sys + 1.43 cusr 1.86 csys = 3.37 CPU) [11:56:10] t/003_test_semantic.pl ................. ok 54 ms ( 0.00 usr 0.00 sys + 0.03 cusr 0.02 csys = 0.05 CPU) [11:56:10] t/004_test_parser_perf.pl .............. ok 39 ms ( 0.00 usr 0.00 sys + 0.03 cusr 0.01 csys = 0.04 CPU) [11:56:10] All tests successful. Files=4, Tests=3930, 4 wallclock secs ( 0.08 usr 0.02 sys + 1.73 cusr 2.17 csys = 4.00 CPU) Result: PASS The M4 is about 2.3x the speed of the workstation according to geekbench, so macOS is not covering itself with glory here, but it's sure not 10x worse than Linux. Maybe you've got a Perl installation with debugging turned on, or something? Having said that, even three-ish seconds is an annoying contribution to check-world, so let's see if there's not some low-hanging fruit here. regards, tom lane
On Fri, Sep 26, 2025 at 9:01 AM Andres Freund <andres@anarazel.de> wrote: > On 2025-09-26 08:49:51 -0700, Jacob Champion wrote: > > Yeah, but that doubles down on the bad unit test architecture... and I > > don't think anyone would want to touch that code after I wrote it. > > I don't really understand - what I'm proposing should just be a few lines? > Splitting on some separator is hardly hard to understand code. If it's truly just a few lines, then I've misunderstood what you're suggesting (patches welcome). It's not the separator splitting that I'm worried about, but the restructuring of the test. > ISTM that the test vectors and the patttern matching on them in 002_inline.pl > makes much more sense in a scripting language than in a C file. It would be on me to prove it, but I don't recall the patterns really being used all that much. And the inputs are all constant strings. Thanks, --Jacob
On Fri, Sep 26, 2025 at 9:26 AM Jacob Champion <jacob.champion@enterprisedb.com> wrote: > If it's truly just a few lines, then I've misunderstood what you're > suggesting (patches welcome). It's not the separator splitting that > I'm worried about, but the restructuring of the test. Here is a very slapdash attempt at pushing the "chunk size iteration" part of the tests down into the test executable, using null separators. I think it's ugly, but maybe not quite as bad as I feared. Unfortunately it only gives about a 4x speedup on my machine, and I was hoping for much more. (We should really expect this entire thing to run in a fraction of a second.) Let me know if you think the tradeoff is worth it for now; I can polish it up if so. Thanks, --Jacob
Вложения
On 2025-09-26 Fr 8:06 PM, Jacob Champion wrote: > On Fri, Sep 26, 2025 at 9:26 AM Jacob Champion > <jacob.champion@enterprisedb.com> wrote: >> If it's truly just a few lines, then I've misunderstood what you're >> suggesting (patches welcome). It's not the separator splitting that >> I'm worried about, but the restructuring of the test. > Here is a very slapdash attempt at pushing the "chunk size iteration" > part of the tests down into the test executable, using null > separators. I think it's ugly, but maybe not quite as bad as I feared. > Unfortunately it only gives about a 4x speedup on my machine, and I > was hoping for much more. (We should really expect this entire thing > to run in a fraction of a second.) > > Let me know if you think the tradeoff is worth it for now; I can > polish it up if so. > On my (Linux) test it went from 3.75s to 0.78s, nearly 80% reduction. I think the reduction on Windows is likely to be more. So I think this is worth doing. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > On 2025-09-26 Fr 8:06 PM, Jacob Champion wrote: >> Unfortunately it only gives about a 4x speedup on my machine, and I >> was hoping for much more. (We should really expect this entire thing >> to run in a fraction of a second.) >> >> Let me know if you think the tradeoff is worth it for now; I can >> polish it up if so. > On my (Linux) test it went from 3.75s to 0.78s, nearly 80% reduction. I > think the reduction on Windows is likely to be more. So I think this is > worth doing. I tried it on my Mac M4 laptop, and saw the 002 test dropping from 3592 ms to 540 ms, about an 85% savings. So I agree this is worth doing. It's still unclear why Robert is seeing times so much worse than mine, though. I didn't read the patch too carefully, but the changes in the 002_inline.pl script itself seem reasonable enough. regards, tom lane
On Sat, Sep 27, 2025 at 12:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I tried it on my Mac M4 laptop, and saw the 002 test dropping from > 3592 ms to 540 ms, about an 85% savings. So I agree this is worth > doing. It's still unclear why Robert is seeing times so much worse > than mine, though. I don't have a clear explanation for that slowness either, but the patch cuts down the runtime by 89% on my machine. (I have not looked at the code.) -- Robert Haas EDB: http://www.enterprisedb.com
On Mon, Sep 29, 2025 at 8:11 AM Robert Haas <robertmhaas@gmail.com> wrote: > I don't have a clear explanation for that slowness either, but the > patch cuts down the runtime by 89% on my machine. Nice! I'll clean this up, and put a rewrite on the back burner for now. Thanks everyone for the testing, and thanks Andres for the idea! --Jacob
Robert Haas <robertmhaas@gmail.com> writes: > On Sat, Sep 27, 2025 at 12:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I tried it on my Mac M4 laptop, and saw the 002 test dropping from >> 3592 ms to 540 ms, about an 85% savings. So I agree this is worth >> doing. It's still unclear why Robert is seeing times so much worse >> than mine, though. > I don't have a clear explanation for that slowness either, but the > patch cuts down the runtime by 89% on my machine. Over the weekend I updated my macOS buildfarm animals to more recent OS versions, and noticed this: sifaka (Mac Mini M4 Pro), on macOS 15 (Sequoia) [1] [06:50:39] t/001_test_json_parser_incremental.pl .. ok 868 ms ( 0.01 usr 0.00 sys + 0.33 cusr 0.48 csys = 0.82 CPU) [06:50:44] t/002_inline.pl ........................ ok 5485 ms ( 0.06 usr 0.02 sys + 2.04 cusr 3.20 csys = 5.32 CPU) [06:50:44] t/003_test_semantic.pl ................. ok 64 ms ( 0.00 usr 0.00 sys + 0.04 cusr 0.02 csys = 0.06 CPU) [06:50:44] t/004_test_parser_perf.pl .............. ok 41 ms ( 0.00 usr 0.00 sys + 0.03 cusr 0.01 csys = 0.04 CPU) sifaka (Mac Mini M4 Pro), on macOS 26 (Tahoe) [2] [02:42:28] t/001_test_json_parser_incremental.pl .. ok 882 ms ( 0.01 usr 0.00 sys + 0.33 cusr 0.48 csys = 0.82 CPU) [02:42:33] t/002_inline.pl ........................ ok 5573 ms ( 0.06 usr 0.02 sys + 2.06 cusr 3.18 csys = 5.32 CPU) [02:42:34] t/003_test_semantic.pl ................. ok 66 ms ( 0.00 usr 0.00 sys + 0.04 cusr 0.02 csys = 0.06 CPU) [02:42:34] t/004_test_parser_perf.pl .............. ok 42 ms ( 0.00 usr 0.00 sys + 0.03 cusr 0.01 csys = 0.04 CPU) longfin (2018 Mac Mini, Intel-based), on macOS 13 (Ventura) [3] [07:02:37] t/001_test_json_parser_incremental.pl .. ok 2355 ms ( 0.05 usr 0.01 sys + 1.21 cusr 1.05 csys = 2.32 CPU) [07:02:52] t/002_inline.pl ........................ ok 14844 ms ( 0.29 usr 0.04 sys + 7.38 cusr 7.02 csys = 14.73 CPU) [07:02:52] t/003_test_semantic.pl ................. ok 187 ms ( 0.00 usr 0.00 sys + 0.13 cusr 0.06 csys = 0.19 CPU) [07:02:52] t/004_test_parser_perf.pl .............. ok 144 ms ( 0.01 usr 0.00 sys + 0.11 cusr 0.03 csys = 0.15 CPU) longfin (2018 Mac Mini, Intel-based), on macOS 15 (Sequoia) [4] [02:39:51] t/001_test_json_parser_incremental.pl .. ok 4005 ms ( 0.06 usr 0.00 sys + 1.86 cusr 1.92 csys = 3.84 CPU) [02:40:16] t/002_inline.pl ........................ ok 25139 ms ( 0.33 usr 0.04 sys + 11.47 cusr 12.23 csys = 24.07 CPU) [02:40:17] t/003_test_semantic.pl ................. ok 244 ms ( 0.00 usr 0.01 sys + 0.15 cusr 0.09 csys = 0.25 CPU) [02:40:17] t/004_test_parser_perf.pl .............. ok 153 ms ( 0.00 usr 0.00 sys + 0.11 cusr 0.03 csys = 0.14 CPU) longfin is intentionally an OS release behind sifaka, so these numbers aren't directly comparable. However, the previous speed ratio of about 2.75x is in line with the two machines' single-core performance difference according to geekbench; the new ratio of 4.5x is not. I conclude that Apple broke something about the speed of process forking in the last OS release or two, but only on their Intel machines. I doubt that they care a whole lot, given they're about to stop supporting Intel hardware altogether :-( Anyway, these numbers prompt me to wonder exactly which macOS release Robert is running, and on what kind of hardware. regards, tom lane [1] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=sifaka&dt=2025-09-27%2010%3A42%3A05&stg=module-test_json_parser-check [2] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=sifaka&dt=2025-09-29%2006%3A34%3A17&stg=module-test_json_parser-check [3] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=longfin&dt=2025-09-27%2010%3A42%3A01&stg=module-test_json_parser-check [4] https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=longfin&dt=2025-09-29%2006%3A20%3A01&stg=module-test_json_parser-check
On Mon, Sep 29, 2025 at 8:24 AM Jacob Champion <jacob.champion@enterprisedb.com> wrote: > Nice! I'll clean this up, and put a rewrite on the back burner for > now. Barring objections, I'll plan to backpatch the attached in a day or so. Thanks, --Jacob
Вложения
On Mon, Sep 29, 2025 at 12:37 PM Jacob Champion <jacob.champion@enterprisedb.com> wrote: > Barring objections, I'll plan to backpatch the attached in a day or so. Committed. --Jacob