Обсуждение: Add support for automatically updating Unicode derived files
Continuing the discussion from [0] and [1], here is a patch that automates the process of updating Unicode derived files. Summary: - Edit UNICODE_VERSION and/or CLDR_VERSION in src/Makefile.global.in - Run make update-unicode - Commit I have added that to the release checklist in RELEASE_NOTES. This also includes the script used in [0] that was not committed at that time. Other than that, this just refactors existing build code. Open questions that are currently not handled consistently: - Should the downloaded files be listed in .gitignore? - Should the downloaded files be cleaned by make clean (or distclean or maintainer-clean or none)? - Should the generated files be excluded from pgindent? Currently, the generated files will not pass pgindent unchanged, so that could cause annoying whitespace battles when these files are updated and re-indented around release time. [0]: https://www.postgresql.org/message-id/flat/bbb19114-af1e-513b-08a9-61272794bd5c%402ndquadrant.com [1]: https://www.postgresql.org/message-id/flat/77f69366-ca31-6437-079f-47fce69bae1b%402ndquadrant.com -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On Tue, Oct 29, 2019 at 6:06 AM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > Continuing the discussion from [0] and [1], here is a patch that > automates the process of updating Unicode derived files. Summary: > > - Edit UNICODE_VERSION and/or CLDR_VERSION in src/Makefile.global.in > - Run make update-unicode > - Commit Hi Peter, I gave "make update-unicode" a try. It's unclear to me what the state of the build tree should be when a maintainer runs this, so I'll just report what happens when running naively (on MacOS). After only running configure, "make update-unicode" gives this error at normalization-check: ld: library not found for -lpgcommon clang: error: linker command failed with exit code 1 (use -v to see invocation) After commenting that out, the next command "$(MAKE) -C contrib/unaccent $@" failed, seemingly because $(PYTHON) is empty unless --with-python was specified at configure time. > Open questions that are currently not handled consistently: > > - Should the downloaded files be listed in .gitignore? These files are transient byproducts of a build, and we don't want them committed, so they seem like a normal candidate for .gitignore. > - Should the downloaded files be cleaned by make clean (or distclean or > maintainer-clean or none)? It seems one would want to make clean without removing these files, and maintainer clean is for removing things that are preserved in distribution tarballs. So I would go with distclean. > - Should the generated files be excluded from pgindent? Currently, the > generated files will not pass pgindent unchanged, so that could cause > annoying whitespace battles when these files are updated and re-indented > around release time. I see what you mean in the norm table header. I think generated files should not be pgindent'd, since creating them is already a consistent, mechanical process, and their presentation is not as important as other code. Other comments: +print "/* generated by src/common/unicode/generate-unicode_combining_table.pl, do not edit */\n\n"; I would print out the full boilerplate like for other generated headers. Lastly, src/common/unicode/README is outdated (and possibly no longer useful at all?). -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-12-19 23:48, John Naylor wrote: > I gave "make update-unicode" a try. It's unclear to me what the state > of the build tree should be when a maintainer runs this, so I'll just > report what happens when running naively (on MacOS). Yeah, that wasn't fully thought through, it appears. > After only running configure, "make update-unicode" gives this error > at normalization-check: > > ld: library not found for -lpgcommon > clang: error: linker command failed with exit code 1 (use -v to see invocation) Fixed by adding more make dependencies. > After commenting that out, the next command "$(MAKE) -C > contrib/unaccent $@" failed, seemingly because $(PYTHON) is empty > unless --with-python was specified at configure time. I'm not sure whether that's worth addressing. >> Open questions that are currently not handled consistently: >> >> - Should the downloaded files be listed in .gitignore? > > These files are transient byproducts of a build, and we don't want > them committed, so they seem like a normal candidate for .gitignore. OK done >> - Should the downloaded files be cleaned by make clean (or distclean or >> maintainer-clean or none)? > > It seems one would want to make clean without removing these files, > and maintainer clean is for removing things that are preserved in > distribution tarballs. So I would go with distclean. also done >> - Should the generated files be excluded from pgindent? Currently, the >> generated files will not pass pgindent unchanged, so that could cause >> annoying whitespace battles when these files are updated and re-indented >> around release time. > > I see what you mean in the norm table header. I think generated files > should not be pgindent'd, since creating them is already a consistent, > mechanical process, and their presentation is not as important as > other code. I've left it alone for now because the little indentation problem currently present might actually go away with my Unicode normalization support patch. > Other comments: > > +print "/* generated by > src/common/unicode/generate-unicode_combining_table.pl, do not edit > */\n\n"; > > I would print out the full boilerplate like for other generated headers. Hmm, you are probably comparing with src/common/unicode/generate-unicode_norm_table.pl, but other file generating scripts around the tree print out a small header in the style that I have. I'd rather adjust the output of generate-unicode_norm_table.pl to match those. (It's also not quite correct to make copyright claims about automatically generated output.) > Lastly, src/common/unicode/README is outdated (and possibly no longer > useful at all?). updated new patch attached -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
On Thu, Dec 26, 2019 at 12:39 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > On 2019-12-19 23:48, John Naylor wrote: > > I would print out the full boilerplate like for other generated headers. > > Hmm, you are probably comparing with > src/common/unicode/generate-unicode_norm_table.pl, but other file > generating scripts around the tree print out a small header in the style > that I have. I'd rather adjust the output of > generate-unicode_norm_table.pl to match those. (It's also not quite > correct to make copyright claims about automatically generated output.) Hmm, the scripts I'm most familiar with have full headers. Your point about copyright makes sense, and using smaller file headers would aid readability of the scripts, but I also see how others may feel differently. v2 looks good to me, marked ready for committer. -- John Naylor https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2020-01-03 15:13, John Naylor wrote: > On Thu, Dec 26, 2019 at 12:39 PM Peter Eisentraut > <peter.eisentraut@2ndquadrant.com> wrote: >> >> On 2019-12-19 23:48, John Naylor wrote: >>> I would print out the full boilerplate like for other generated headers. >> >> Hmm, you are probably comparing with >> src/common/unicode/generate-unicode_norm_table.pl, but other file >> generating scripts around the tree print out a small header in the style >> that I have. I'd rather adjust the output of >> generate-unicode_norm_table.pl to match those. (It's also not quite >> correct to make copyright claims about automatically generated output.) > > Hmm, the scripts I'm most familiar with have full headers. Your point > about copyright makes sense, and using smaller file headers would aid > readability of the scripts, but I also see how others may feel > differently. > > v2 looks good to me, marked ready for committer. Committed, thanks. I have added a little tweak so that it works also without --with-python, to avoid gratuitous annoyances. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > Committed, thanks. This patch is making src/tools/pginclude/headerscheck unhappy: ./src/include/common/unicode_combining_table.h:3: error: array type has incomplete element type I guess that header needs another #include, or else you need to move some declarations around. regards, tom lane
On 2020-01-15 01:37, Tom Lane wrote: > Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: >> Committed, thanks. > > This patch is making src/tools/pginclude/headerscheck unhappy: > > ./src/include/common/unicode_combining_table.h:3: error: array type has incomplete element type > > I guess that header needs another #include, or else you need to > move some declarations around. Hmm, this file is only meant to be included inside one particular function. Making it standalone includable would seem to be unnecessary. What should we do? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 2020-01-15 01:37, Tom Lane wrote: >> This patch is making src/tools/pginclude/headerscheck unhappy: >> ./src/include/common/unicode_combining_table.h:3: error: array type has incomplete element type > Hmm, this file is only meant to be included inside one particular > function. Making it standalone includable would seem to be unnecessary. > What should we do? Well, we could make it a documented exception in headerscheck and cpluspluscheck. regards, tom lane
On 2020-01-20 16:43, Tom Lane wrote: > Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: >> On 2020-01-15 01:37, Tom Lane wrote: >>> This patch is making src/tools/pginclude/headerscheck unhappy: >>> ./src/include/common/unicode_combining_table.h:3: error: array type has incomplete element type > >> Hmm, this file is only meant to be included inside one particular >> function. Making it standalone includable would seem to be unnecessary. >> What should we do? > > Well, we could make it a documented exception in headerscheck and > cpluspluscheck. OK, done. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I have committed the first Unicode data update using this new "make update-unicode" facility. CLDR is released regularly every 6 months, so around this time every year would be the appropriate time to pull in the latest updates in preparation for our own release. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services