Обсуждение: [PATCH] Silence Valgrind about SelectConfigFiles()
Hi, I experimented with Valgrind after recent changes committed by Tom [1] and catched this: ``` 25 bytes in 1 blocks are definitely lost in loss record 18 of 49 at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so) by 0x57DE35E: strdup (strdup.c:42) by 0x9D9D1D: make_absolute_path (path.c:877) by 0x98AD5D: SelectConfigFiles (guc.c:1795) by 0x65F104: PostmasterMain (postmaster.c:785) by 0x52F2B9: main (main.c:231) ``` I propose to correct this as attached. [1]: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us
Вложения
Aleksander Alekseev <aleksander@tigerdata.com> writes: > I experimented with Valgrind after recent changes committed by Tom [1] > and catched this: > ... > I propose to correct this as attached. You didn't say what the test condition was, but from the patch I suppose it's a case where SelectConfigFiles is erroring out? Kind of hard to get excited about preventing a leak in that path, since we're going to exit immediately. regards, tom lane
Hi Tom, > You didn't say what the test condition was, but from the patch > I suppose it's a case where SelectConfigFiles is erroring out? Steps to reproduce: 1. Uncomment USE_VALGRIND in src/include/pg_config_manual.h, then rebuild and setup Postgres as usual 2. In the first terminal: ``` $ valgrind \ --leak-check=full \ --track-origins=yes \ --gen-suppressions=all \ --read-var-info=yes \ --log-file=/tmp/valgrind/%p.log \ --suppressions=src/tools/valgrind.supp \ --time-stamp=yes \ --trace-children=yes \ postgres -D /home/eax/pginstall/data \ 2>&1 | tee /tmp/valgrind/postmaster.log ``` 3. In the second terminal: ``` meson test -C build \ --setup running \ --suite regress-running \ --timeout-multiplier 0 pg_ctl -w -D /home/eax/pginstall/data stop grep -r 'ERROR SUMMARY' /tmp/valgrind/ | grep -v 'SUMMARY: 0 errors' ``` > Kind of hard to get excited about preventing a leak in that path, > since we're going to exit immediately. It's not exciting at all but given the fact that this is the only "definitely lost" error I got it's worth fixing IMO ("broken windows" and all that). I should have also pointed out that SelectConfigFiles() sometimes calls free(configdir) and sometimes doesn't, which is inconsistent. Sorry for little details in the first email - it's 2:30 AM in my time zone :)
Aleksander Alekseev <aleksander@tigerdata.com> writes: >> You didn't say what the test condition was, but from the patch >> I suppose it's a case where SelectConfigFiles is erroring out? > Steps to reproduce: > ... Huh ... don't quite see where in that recipe we'd reach a SelectConfigFiles error exit. > It's not exciting at all but given the fact that this is the only > "definitely lost" error I got it's worth fixing IMO ("broken windows" > and all that). I should have also pointed out that SelectConfigFiles() > sometimes calls free(configdir) and sometimes doesn't, which is > inconsistent. Yeah, I noticed that too, and it does offend my inner neatnik. Instead of what you did, I'd be inclined to add free(configdir); return true; + +fail: + free(configdir); + + return false; } and then s/return false/goto fail/ throughout, so as to avoid duplicating the free() calls. It's a minor point as things stand, but if more cleanup gets added to the function I think it'd be easier to maintain this way. > Sorry for little details in the first email - it's 2:30 AM in my time zone :) No trouble. regards, tom lane
Hi Tom, > Yeah, I noticed that too, and it does offend my inner neatnik. > > Instead of what you did, I'd be inclined to add > > free(configdir); > > return true; > + > +fail: > + free(configdir); > + > + return false; > } > > and then s/return false/goto fail/ throughout, so as to avoid > duplicating the free() calls. It's a minor point as things stand, > but if more cleanup gets added to the function I think it'd be > easier to maintain this way. Makes sense. Here is the corrected patch v2. > Huh ... don't quite see where in that recipe we'd reach a > SelectConfigFiles error exit. How exactly we reach this code patch is a good question. I tried to understand the exact conditions by using my steps to reproduce and an ancient debugging technique with sleep(), elog() and `watch` - see trick.txt. Unfortunately I was not able to reproduce it again nor under Valgrind nor without it. I guess it means that either I did something differently before or the right conditions are met under rare circumstances.
Вложения
Aleksander Alekseev <aleksander@tigerdata.com> writes: > Makes sense. Here is the corrected patch v2. LGTM, pushed. >> Huh ... don't quite see where in that recipe we'd reach a >> SelectConfigFiles error exit. > How exactly we reach this code patch is a good question. I tried to > understand the exact conditions by using my steps to reproduce and an > ancient debugging technique with sleep(), elog() and `watch` - see > trick.txt. Unfortunately I was not able to reproduce it again nor > under Valgrind nor without it. I guess it means that either I did > something differently before or the right conditions are met under > rare circumstances. I think you must have done something involving passing bad arguments to the postmaster, and then forgotten that (not hard to do late at night). I see from the code coverage report that none of the failure exits from SelectConfigFiles are exercised in our normal regression tests, so only something very odd could have resulted in getting here without a manual error being involved. But I'm content to apply this on the grounds that it's inconsistent to free the string in only one of the error exit paths. regards, tom lane