Обсуждение: Improve docs syntax checking and enable it in the meson build
Hi,
The Meson build did not include tab and non-breaking space checks for
the docs. The attached patch adds these checks and includes a few
related improvements.
This topic was previously discussed towards end of the another thread
[1], but it was decided that it would be better to have a separate
thread for it, so I am continuing the discussion here.
These checks were previously done in the Makefile:
```
# tabs are harmless, but it is best to avoid them in SGML files
check-tabs:
@( ! grep ' ' $(wildcard $(srcdir)/*.sgml $(srcdir)/func/*.sgml
$(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || \
(echo "Tabs appear in SGML/XML files" 1>&2; exit 1)
# Non-breaking spaces are harmless, but it is best to avoid them in SGML files.
# Use perl command because non-GNU grep or sed could not have hex
escape sequence.
check-nbsp:
@ ( $(PERL) -ne '/\xC2\xA0/ and print("$$ARGV:$$_"),$$n++; END
{exit($$n>0)}' \
$(wildcard $(srcdir)/*.sgml $(srcdir)/func/*.sgml
$(srcdir)/ref/*.sgml $(srcdir)/*.xsl $(srcdir)/images/*.xsl) ) || \
(echo "Non-breaking spaces appear in SGML/XML files" 1>&2; exit 1)
```
I moved these checks to a new Perl script called sgml_syntax_check.pl.
This script can also perform xmllint validation (when possible).
Here is a summary of the changes:
1 - A new sgml_syntax_check.pl script was added to handle tab, nbsp,
and xmllint validation checks.
1.1 - It is registered as the sgml_syntax_check test in the Meson build.
1.2 - These checks are run when executing 'make check' or 'meson test
sgml_syntax_check' commands.
1.3 - During the creation of postgres-full.xml, the script performs
tab and nbsp checks. The xmllint check is skipped there, since
validation is already handled by the --valid option. So, we do not run
the same check twice.
2 - The sgml_syntax_check test runs by default in the Meson build.
2.1 - Tab and nbsp checks always run.
2.2 - The xmllint validation and the test are skipped if the DocBook
can not be found. I was not able to achieve the same behavior in the
autoconf build, so the test is not run by default there. The Make
build continues to work as before, you can run the checks manually via
make check in doc/src/sgml.
[1]
https://www.postgresql.org/message-id/flat/CACJufxFgAh1--EMwOjMuANe%3DVTmjkNaZjH%2BAzSe04-8ZCGiESA%40mail.gmail.com
--
Regards,
Nazir Bilal Yavuz
Microsoft
Вложения
Hi, On Tue, 7 Oct 2025 at 16:12, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote: > > The Meson build did not include tab and non-breaking space checks for > the docs. The attached patch adds these checks and includes a few > related improvements. I have updated v6 to use a stamp file, as Andres suggested [1], to ensure a dependency between the syntax check and the postgres-full.xml file in the meson build. [1] https://postgr.es/m/tcjetkmnm4vtuyxakqvkqokvow6csjokdwwtplc5nl4zbpyjoo%40jjfhsuqa6fno -- Regards, Nazir Bilal Yavuz Microsoft
Вложения
On 07.10.25 15:12, Nazir Bilal Yavuz wrote: > 1 - A new sgml_syntax_check.pl script was added to handle tab, nbsp, > and xmllint validation checks. > 1.1 - It is registered as the sgml_syntax_check test in the Meson build. > 1.2 - These checks are run when executing 'make check' or 'meson test > sgml_syntax_check' commands. > 1.3 - During the creation of postgres-full.xml, the script performs > tab and nbsp checks. The xmllint check is skipped there, since > validation is already handled by the --valid option. So, we do not run > the same check twice. I think including the xmllint support in the new sgml_syntax_check is overkill, since the normal build already runs xmllint, or you could alternatively just write it into the build description file (makefile or meson.build). The build commands should be visible in the build description file, not layered into some other script. I suggest the following approach: - Change sgml_syntax_check.pl into a smaller script that just checks for tabs and nbsp. (Maybe a different name then.) - Add a call of that script to the build of postgres-full.xml. - Change the "check" target to just depend on postgres-full.xml, without its own commands. And then replicate that logic in meson.
Hi, Thank you for looking into this! On Wed, 15 Oct 2025 at 21:32, Peter Eisentraut <peter@eisentraut.org> wrote: > > On 07.10.25 15:12, Nazir Bilal Yavuz wrote: > > 1 - A new sgml_syntax_check.pl script was added to handle tab, nbsp, > > and xmllint validation checks. > > 1.1 - It is registered as the sgml_syntax_check test in the Meson build. > > 1.2 - These checks are run when executing 'make check' or 'meson test > > sgml_syntax_check' commands. > > 1.3 - During the creation of postgres-full.xml, the script performs > > tab and nbsp checks. The xmllint check is skipped there, since > > validation is already handled by the --valid option. So, we do not run > > the same check twice. > > I think including the xmllint support in the new sgml_syntax_check is > overkill, since the normal build already runs xmllint, or you could > alternatively just write it into the build description file (makefile or > meson.build). The build commands should be visible in the build > description file, not layered into some other script. > > I suggest the following approach: > > - Change sgml_syntax_check.pl into a smaller script that just checks for > tabs and nbsp. (Maybe a different name then.) > > - Add a call of that script to the build of postgres-full.xml. > > - Change the "check" target to just depend on postgres-full.xml, without > its own commands. > > And then replicate that logic in meson. All of these are addressed in v7. I think sgml_syntax_check is still a suitable name but I am open to suggestions. -- Regards, Nazir Bilal Yavuz Microsoft
Вложения
On 22.10.25 10:04, Nazir Bilal Yavuz wrote:
> diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
> index eca9d62fc22..1c937247a9a 100644
> --- a/.cirrus.tasks.yml
> +++ b/.cirrus.tasks.yml
> @@ -627,6 +627,8 @@ task:
> TEST_JOBS: 8
> IMAGE: ghcr.io/cirruslabs/macos-runner:sonoma
>
> + XML_CATALOG_FILES: /opt/local/share/xml/docbook/4.5/catalog.xml
> +
> CIRRUS_WORKING_DIR: ${HOME}/pgsql/
> CCACHE_DIR: ${HOME}/ccache
> MACPORTS_CACHE: ${HOME}/macports-cache
> @@ -641,6 +643,7 @@ task:
>
> MACOS_PACKAGE_LIST: >-
> ccache
> + docbook-xml-4.5
> icu
> kerberos5
> lz4
What is the reason for this change in this patch? AFAICT, your patch
doesn't perform any XML-related operations anymore.
Hi,
On Wed, 29 Oct 2025 at 20:24, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 22.10.25 10:04, Nazir Bilal Yavuz wrote:
> > diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
> > index eca9d62fc22..1c937247a9a 100644
> > --- a/.cirrus.tasks.yml
> > +++ b/.cirrus.tasks.yml
> > @@ -627,6 +627,8 @@ task:
> > TEST_JOBS: 8
> > IMAGE: ghcr.io/cirruslabs/macos-runner:sonoma
> >
> > + XML_CATALOG_FILES: /opt/local/share/xml/docbook/4.5/catalog.xml
> > +
> > CIRRUS_WORKING_DIR: ${HOME}/pgsql/
> > CCACHE_DIR: ${HOME}/ccache
> > MACPORTS_CACHE: ${HOME}/macports-cache
> > @@ -641,6 +643,7 @@ task:
> >
> > MACOS_PACKAGE_LIST: >-
> > ccache
> > + docbook-xml-4.5
> > icu
> > kerberos5
> > lz4
>
> What is the reason for this change in this patch? AFAICT, your patch
> doesn't perform any XML-related operations anymore.
It is because of the "sgml_syntax_check" test in the meson build. This
test builds "postgres-full.xml" (same logic with make check) and the
build of "postgres-full.xml" has xmllint check. I could not find a way
to make this test optional like check target in the doc/src/sgml, so
the "sgml_syntax_check" test always runs.
Apart from this, I realized that there is a better way to create
"sgml_syntax_check". It was depending on "postgres-full.xml" before,
but now it directly builds the "postgres-full.xml" like "tmp_install"
test. This is addressed in the v8.
--
Regards,
Nazir Bilal Yavuz
Microsoft
Вложения
RE: Improve docs syntax checking and enable it in the meson build
От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Nazir, I created a similar patch with yours because I did not recognize this thread. Not sure you are still interested in, but here are my comments based on my experience. 00. Needs rebase. 01. I found that postgres core was built when I ran `meson test --suite doc` command, but ideally it's not needed. Per my experiment, we must clarify dependencies for the test, something like below. ``` --- a/doc/src/sgml/meson.build +++ b/doc/src/sgml/meson.build @@ -337,7 +337,8 @@ test( 'doc' / 'sgml_syntax_check', meson_bin, args: ['compile', postgres_full_xml_target], - suite: 'doc' + suite: 'doc', + depends: doc_generated, ) ``` 02. Missing update of copyright. 03. I found that for the autoconf build system, syntax checking cannot work again if nothing is changed. This behavior is different from the HEAD. Per my experiment, below can fix the issue. ``` --- a/doc/src/sgml/Makefile +++ b/doc/src/sgml/Makefile @@ -201,6 +201,7 @@ MAKEINFO = makeinfo ## # Quick syntax check without style processing +.PHONY: postgres-full.xml check: postgres-full.xml ``` 04. `make check` is slightly slower than HEAD. I think it is OK but others can also confirm and can test on their env. HEAD: ``` $ time make check ... real 0m0.592s user 0m0.507s sys 0m0.084s ``` patched: ``` $ time make check ... real 0m1.217s user 0m0.973s sys 0m0.237s ``` Best regards, Hayato Kuroda FUJITSU LIMITED
Hi, On Mon, 16 Feb 2026 at 03:47, Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Nazir, > > I created a similar patch with yours because I did not recognize this thread. > Not sure you are still interested in, but here are my comments based on my > experience. Thank you for looking into this! > 00. > Needs rebase. Done. > 01. > I found that postgres core was built when I ran `meson test --suite doc` > command, but ideally it's not needed. Per my experiment, we must clarify > dependencies for the test, something like below. > > ``` > --- a/doc/src/sgml/meson.build > +++ b/doc/src/sgml/meson.build > @@ -337,7 +337,8 @@ test( > 'doc' / 'sgml_syntax_check', > meson_bin, > args: ['compile', postgres_full_xml_target], > - suite: 'doc' > + suite: 'doc', > + depends: doc_generated, > ) > ``` Sorry, I didn't understand what you meant, could you please elaborate? 'postgres_full_xml_target' target already depends on 'doc_generated' and 'sgml_check_stamp' targets. > 02. > Missing update of copyright. Done. > 03. > I found that for the autoconf build system, syntax checking cannot work again > if nothing is changed. This behavior is different from the HEAD. Per my > experiment, below can fix the issue. > > ``` > --- a/doc/src/sgml/Makefile > +++ b/doc/src/sgml/Makefile > @@ -201,6 +201,7 @@ MAKEINFO = makeinfo > ## > > # Quick syntax check without style processing > +.PHONY: postgres-full.xml > check: postgres-full.xml > ``` Yes, there is a behavior change but I don't think this is an issue. If nothing is changed, syntax check just reports it as: ``` $ make check make: Nothing to be done for 'check'. ``` > 04. > `make check` is slightly slower than HEAD. I think it is OK but others can also > confirm and can test on their env. > > HEAD: > ``` > $ time make check > ... > > real 0m0.592s > user 0m0.507s > sys 0m0.084s > ``` > > patched: > ``` > $ time make check > ... > > real 0m1.217s > user 0m0.973s > sys 0m0.237s > ``` It is slower on my end too: HEAD: real 0m0.344s user 0m0.301s sys 0m0.041s Patched: real 0m0.451s user 0m0.373s sys 0m0.076s The difference is not that big on my end, though. I think this is expected and acceptable but I would like to hear other people's thoughts too. -- Regards, Nazir Bilal Yavuz Microsoft
Вложения
RE: Improve docs syntax checking and enable it in the meson build
От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Nazir,
> > 01.
> > I found that postgres core was built when I ran `meson test --suite doc`
> > command, but ideally it's not needed. Per my experiment, we must clarify
> > dependencies for the test, something like below.
> >
> > ```
> > --- a/doc/src/sgml/meson.build
> > +++ b/doc/src/sgml/meson.build
> > @@ -337,7 +337,8 @@ test(
> > 'doc' / 'sgml_syntax_check',
> > meson_bin,
> > args: ['compile', postgres_full_xml_target],
> > - suite: 'doc'
> > + suite: 'doc',
> > + depends: doc_generated,
> > )
> > ```
>
> Sorry, I didn't understand what you meant, could you please elaborate?
Sure. Let's consider the below scenario, that I wanted to just verify the syntax
of documentations. For the autoconf/make build system, below commands can avoid
building C sources and generate needed .sgml files.
```
$ ./configure
...
$ cd doc/src/sgml/
$ make check
{ \
echo "<!ENTITY version \"19devel\">"; \
echo "<!ENTITY majorversion \"19\">"; \
} > version.sgml
'/usr/bin/perl' ./mk_feature_tables.pl YES ../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt> features-supported.sgml
'/usr/bin/perl' ./mk_feature_tables.pl NO ../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt> features-unsupported.sgml
'/usr/bin/perl' ./generate-errcodes-table.pl ../../../src/backend/utils/errcodes.txt > errcodes-table.sgml
'/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml
'/usr/bin/perl' ./generate-targets-meson.pl targets-meson.txt generate-targets-meson.pl > targets-meson.sgml
'/usr/bin/perl' ../../../src/backend/utils/activity/generate-wait_event_types.pl --docs
../../../src/backend/utils/activity/wait_event_names.txt
'/usr/bin/perl' ./sgml_syntax_check.pl --srcdir .
/usr/bin/xmllint --nonet --path . --path . --output postgres-full.xml --noent --valid postgres.sgml
```
In case of meson build system, however, added test was run after all C files
were built. Ideally it is not needed.
```
$ meson setup ../postgres/
...
$ meson test --suite doc
ninja: Entering directory `/home/hayato/builddir'
[884/2468] Compiling C object ... // C sources were built here
...
```
Also, I have been working on a project which translates the documentation [1],
and the syntax check is used for the github CI. I'm very sad if building
everything is needed to just check the doc.
> 'postgres_full_xml_target' target already depends on 'doc_generated'
> and 'sgml_check_stamp' targets.
I'm not faimilar with Meson, but my theory is that Meson cannot understand that
the test only depends on postgres_full_xml. That's why all sources are built.
Based on the discussion, the fix should be like;
```
--- a/doc/src/sgml/meson.build
+++ b/doc/src/sgml/meson.build
@@ -337,7 +337,8 @@ test(
'doc' / 'sgml_syntax_check',
meson_bin,
args: ['compile', postgres_full_xml_target],
- suite: 'doc'
+ suite: 'doc',
+ depends: postgres_full_xml
)
```
[1]: https://github.com/pgsql-jp/jpug-doc
Best regards,
Hayato Kuroda
FUJITSU LIMITED
Hi,
Thanks for the explanation!
On Mon, 16 Feb 2026 at 11:33, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Nazir,
>
> > > 01.
> > > I found that postgres core was built when I ran `meson test --suite doc`
> > > command, but ideally it's not needed. Per my experiment, we must clarify
> > > dependencies for the test, something like below.
> > >
> > > ```
> > > --- a/doc/src/sgml/meson.build
> > > +++ b/doc/src/sgml/meson.build
> > > @@ -337,7 +337,8 @@ test(
> > > 'doc' / 'sgml_syntax_check',
> > > meson_bin,
> > > args: ['compile', postgres_full_xml_target],
> > > - suite: 'doc'
> > > + suite: 'doc',
> > > + depends: doc_generated,
> > > )
> > > ```
> >
> > Sorry, I didn't understand what you meant, could you please elaborate?
>
> Sure. Let's consider the below scenario, that I wanted to just verify the syntax
> of documentations. For the autoconf/make build system, below commands can avoid
> building C sources and generate needed .sgml files.
>
> ```
> $ ./configure
> ...
> $ cd doc/src/sgml/
> $ make check
> { \
> echo "<!ENTITY version \"19devel\">"; \
> echo "<!ENTITY majorversion \"19\">"; \
> } > version.sgml
> '/usr/bin/perl' ./mk_feature_tables.pl YES ../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt> features-supported.sgml
> '/usr/bin/perl' ./mk_feature_tables.pl NO ../../../src/backend/catalog/sql_feature_packages.txt
../../../src/backend/catalog/sql_features.txt> features-unsupported.sgml
> '/usr/bin/perl' ./generate-errcodes-table.pl ../../../src/backend/utils/errcodes.txt > errcodes-table.sgml
> '/usr/bin/perl' ./generate-keywords-table.pl . > keywords-table.sgml
> '/usr/bin/perl' ./generate-targets-meson.pl targets-meson.txt generate-targets-meson.pl > targets-meson.sgml
> '/usr/bin/perl' ../../../src/backend/utils/activity/generate-wait_event_types.pl --docs
../../../src/backend/utils/activity/wait_event_names.txt
> '/usr/bin/perl' ./sgml_syntax_check.pl --srcdir .
> /usr/bin/xmllint --nonet --path . --path . --output postgres-full.xml --noent --valid postgres.sgml
> ```
>
> In case of meson build system, however, added test was run after all C files
> were built. Ideally it is not needed.
>
> ```
> $ meson setup ../postgres/
> ...
> $ meson test --suite doc
> ninja: Entering directory `/home/hayato/builddir'
> [884/2468] Compiling C object ... // C sources were built here
> ...
> ```
>
> Also, I have been working on a project which translates the documentation [1],
> and the syntax check is used for the github CI. I'm very sad if building
> everything is needed to just check the doc.
>
> > 'postgres_full_xml_target' target already depends on 'doc_generated'
> > and 'sgml_check_stamp' targets.
>
> I'm not faimilar with Meson, but my theory is that Meson cannot understand that
> the test only depends on postgres_full_xml. That's why all sources are built.
>
> Based on the discussion, the fix should be like;
>
> ```
> --- a/doc/src/sgml/meson.build
> +++ b/doc/src/sgml/meson.build
> @@ -337,7 +337,8 @@ test(
> 'doc' / 'sgml_syntax_check',
> meson_bin,
> args: ['compile', postgres_full_xml_target],
> - suite: 'doc'
> + suite: 'doc',
> + depends: postgres_full_xml
> )
> ```
I see your point but the suggested change actually doesn't affect C
files' compilation. They are still built although you make this
change, AFAIK this is how meson build works.
Meson has '--no-rebuild' option for these types of situations, could
you try running "meson test --suite doc --no-rebuild" and see if that
helps?
--
Regards,
Nazir Bilal Yavuz
Microsoft
RE: Improve docs syntax checking and enable it in the meson build
От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Nazir, > I see your point but the suggested change actually doesn't affect C > files' compilation. They are still built although you make this > change, AFAIK this is how meson build works. Did you say that C files would be anyway built even after applying my change? I don't think so - I have verified that only needed documents were built with the `meson test --suite doc` command after I modified meson.build. ``` $ ninja clean [2/2] Cleaning Cleaning... 8 files. $ meson test --suite doc ninja: Entering directory `/home/hayato/builddir' [8/8] Generating doc/src/sgml/postgres-full.xml with a custom command // only 8 files are generated 1/1 postgresql:doc / doc/sgml_syntax_check OK 0.38s Ok: 1 Expected Fail: 0 Fail: 0 Unexpected Pass: 0 Skipped: 0 Timeout: 0 Full log written to /home/hayato/builddir/meson-logs/testlog.txt ``` > Meson has '--no-rebuild' option for these types of situations, could > you try running "meson test --suite doc --no-rebuild" and see if that > helps? Confirmed that `--no-rebuild` option can avoid building C files, but I still think it should be handled on the meson.build side. Best regards, Hayato Kuroda FUJITSU LIMITED
Hi, On Mon, 16 Feb 2026 at 13:49, Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Nazir, > > > I see your point but the suggested change actually doesn't affect C > > files' compilation. They are still built although you make this > > change, AFAIK this is how meson build works. > > Did you say that C files would be anyway built even after applying my change? > I don't think so - I have verified that only needed documents were built with > the `meson test --suite doc` command after I modified meson.build. > > ``` > $ ninja clean > [2/2] Cleaning > Cleaning... 8 files. > $ meson test --suite doc > ninja: Entering directory `/home/hayato/builddir' > [8/8] Generating doc/src/sgml/postgres-full.xml with a custom command // only 8 files are generated > 1/1 postgresql:doc / doc/sgml_syntax_check OK 0.38s > > Ok: 1 > Expected Fail: 0 > Fail: 0 > Unexpected Pass: 0 > Skipped: 0 > Timeout: 0 > > Full log written to /home/hayato/builddir/meson-logs/testlog.txt > ``` I think we might be using different meson versions, which could explain the difference. I am currently using meson v1.7.0. 1) With meson v1.7.0, it builds all files regardless of whether "depends: postgres_full_xml" is added. 2) I also tested this manually and noticed a different behavior starting from meson v1.9.2. In these versions, meson does not build unnecessary files, whether "depends: postgres_full_xml" is added: meson v1.9.1 without "depends: postgres_full_xml": ``` $ ninja clean && meson test --suite doc [2/2] Cleaning Cleaning... 0 files. ninja: Entering directory `/home/nbyavuz/Desktop/projects/postgres/build' [2551/2551] Linking target src/backend/postgres 1/1 postgresql:doc / doc/sgml_syntax_check OK 0.61s Ok: 1 Fail: 0 ``` ------------------------------ meson v1.9.1 with "depends: postgres_full_xml", please note that 8 extra files are compiled because of postgres_full_xml [1]: ``` $ ninja clean && meson test --suite doc [2/2] Cleaning Cleaning... 0 files. ninja: Entering directory `/home/nbyavuz/Desktop/projects/postgres/build' [2559/2559] Linking target src/backend/postgres 1/1 postgresql:doc / doc/sgml_syntax_check OK 0.18s Ok: 1 Fail: 0 ``` ------------------------------ meson v1.9.2 without "depends: postgres_full_xml": ``` $ ninja clean && meson test --suite doc [2/2] Cleaning Cleaning... 8 files. 1/1 doc - postgresql:doc/sgml_syntax_check OK 0.61s Ok: 1 Fail: 0 ``` ------------------------------ $ meson v1.9.2 with "depends: postgres_full_xml", please note that 8 extra files are compiled because of postgres_full_xml [1]: `` ninja clean && meson test --suite doc [2/2] Cleaning Cleaning... 8 files. ninja: Entering directory `/home/nbyavuz/Desktop/projects/postgres/build' [8/8] Generating doc/src/sgml/postgres-full.xml with a custom command <- The difference [1] 1/1 doc - postgresql:doc/sgml_syntax_check OK 0.18s Ok: 1 Fail: 0 ``` ------------------------------ If my observation is correct, it seems there may be no need to add "depends: postgres_full_xml", since the test itself is about compiling 'postgres_full_xml', and it doesn’t need to be built separately beforehand [1]. Please let me know if you are seeing something different on your side. -- Regards, Nazir Bilal Yavuz Microsoft
On 2026-Feb-16, Nazir Bilal Yavuz wrote: > I see your point but the suggested change actually doesn't affect C > files' compilation. They are still built although you make this > change, AFAIK this is how meson build works. If that's the case, then this whole thing is almost useless and Meson is being ridiculous. Hopefully you're wrong on this, and there exists a way to check the XML docs without having to have the whole tree already built. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Hi, On Mon, 16 Feb 2026 at 15:36, Álvaro Herrera <alvherre@kurilemu.de> wrote: > > On 2026-Feb-16, Nazir Bilal Yavuz wrote: > > > I see your point but the suggested change actually doesn't affect C > > files' compilation. They are still built although you make this > > change, AFAIK this is how meson build works. > > If that's the case, then this whole thing is almost useless and Meson is > being ridiculous. Hopefully you're wrong on this, and there exists a > way to check the XML docs without having to have the whole tree already > built. Sorry, meson doesn't always work like that. It seems the --suite option triggers building an entire tree. You can run the XML check without building the tree using one of the following commands: - meson test doc/sgml_syntax_check - meson test --suite doc --no-rebuild I am not sure if there is a way to prevent the full build when using '--suite' without '--no-rebuild' before meson v1.9.2. This problem was fixed in the commit [1]. I tested meson v0.57.2 (the oldest version supported by Postgres), and it has the same problem too. [1] https://github.com/mesonbuild/meson/commit/18c1ce716b8fe43cce090daa851b64cbbbdbd532 -- Regards, Nazir Bilal Yavuz Microsoft
Hi,
On 2026-02-16 10:39:06 +0300, Nazir Bilal Yavuz wrote:
> # Run validation only once, common to all subsequent targets. While
> # we're at it, also resolve all entities (that is, copy all included
> # files into one big file). This helps tools that don't understand
> @@ -108,12 +128,15 @@ postgres_full_xml = custom_target('postgres-full.xml',
> depfile: 'postgres-full.xml.d',
> command: [xmllint, '--nonet', '--noent', '--valid',
> '--path', '@OUTDIR@', '-o', '@OUTPUT@', '@INPUT@'],
> - depends: doc_generated,
> + depends: [doc_generated, sgml_check_stamp],
> build_by_default: false,
> )
> docs += postgres_full_xml
> alldocs += postgres_full_xml
Not sure I love that now the entire docs build fails if there is a tab
anywhere. It's probably ok, but ...
Given you add it as a test for meson and a check for make, why not just make
sgml_check_stamp a dependency of the test (meson) and check targets (make)?
> +postgres_full_xml_target = 'postgres_full_xml'
> +alias_target(postgres_full_xml_target, postgres_full_xml)
Why would we want a top-level postgres_full_xml target?
> if xsltproc_bin.found()
> xsltproc_flags = [
> @@ -308,3 +331,13 @@ endif
> if alldocs.length() != 0
> alias_target('alldocs', alldocs)
> endif
> +
> +# Test creation of 'postgres_full_xml'.
> +test(
> + 'doc' / 'sgml_syntax_check',
> + meson_bin,
> + args: ['compile', postgres_full_xml_target],
> + suite: 'doc'
> +)
I'm confused about this 'compile' bit here. What is that intending to do? I
don't have such a 'compile' command in my PATH, so the test fails here...
> diff --git a/doc/src/sgml/sgml_syntax_check.pl b/doc/src/sgml/sgml_syntax_check.pl
> new file mode 100755
> index 00000000000..61b4cd203c0
> --- /dev/null
> +++ b/doc/src/sgml/sgml_syntax_check.pl
> @@ -0,0 +1,115 @@
> +#!/usr/bin/perl
> +#
> +# Perl script that checks SGML/XSL syntax and formatting issues.
> +#
> +# doc/src/sgml/sgml_syntax_check.pl
> +#
> +# Copyright (c) 2026, PostgreSQL Global Development Group
> +
> +use strict;
> +use warnings FATAL => 'all';
> +use Getopt::Long;
> +
> +use File::Find;
> +
> +my $srcdir;
> +my $builddir;
> +my $dep_file;
> +my $stamp_file;
> +
> +GetOptions(
> + 'srcdir:s' => \$srcdir,
> + 'builddir:s' => \$builddir,
> + 'dep_file:s' => \$dep_file,
> + 'stamp_file:s' => \$stamp_file) or die "$0: wrong arguments";
I don't think there's a point in having a dep file here. Dependency files are
only useful if you have dependency on *additional* files to the ones
explicitly passed in (e.g. a .c file may depend on headers, but the headers
aren't listed as arguments to the compiler).
> +die "$0: --srcdir must be specified\n" unless defined $srcdir;
> +
> +# Find files to process - all the sgml and xsl files (including in subdirectories)
> +my @files_to_process;
> +my @dirs_to_search = ($srcdir);
> +push @dirs_to_search, $builddir if defined $builddir;
> +find(
> + sub {
> + return unless -f $_;
> + return if $_ !~ /\.(sgml|xsl)$/;
> + push @files_to_process, $File::Find::name;
> + },
> + @dirs_to_search,);
Why is this any more complicated than just opening the file exactly in the
passed in source dir?
> +# tabs and non-breaking spaces are harmless, but it is best to avoid them in SGML files
> +sub check_tabs_and_nbsp
> +{
> + my $errors = 0;
> + for my $f (@files_to_process)
> + {
> + open my $fh, "<:encoding(UTF-8)", $f or die "Can't open $f: $!";
> + my $line_no = 0;
> + while (<$fh>)
> + {
> + $line_no++;
> + if (/\t/)
> + {
> + print STDERR "Tab found in $f:$line_no\n";
> + $errors++;
> + }
> + if (/\xC2\xA0/)
> + {
> + print STDERR "$f:$line_no: contains non-breaking space\n";
> + $errors++;
> + }
> + }
> + close($fh);
> + }
> +
> + if ($errors)
> + {
> + remove_stamp_file();
I don't think that's needed, the stamp file needs to be newer than the inputs
to have an effect, and here you're just removing an old one, no?
> +sub create_stamp_file
> +{
> + # Avoid touching existing stamp file to prevent unnecessary rebuilds
> + if (!(-f $stamp_file))
> + {
> + open my $fh, '>', $stamp_file
> + or die "can't open $stamp_file: $!";
> + close $fh;
> + }
> +}
Won't that *cause* rebuilds? With a stamp file you normally want the stamp
file to be *newer* than the inputs. Which it won't be, if you don't touch it
here.
The only reason it doesn't cause quick rebuilds with meson is that ninja
remembers the timestamps of files an avoids rebuilds if they haven't changed.
Greetings,
Andres Freund
RE: Improve docs syntax checking and enable it in the meson build
От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Nazir, > I think we might be using different meson versions, which could > explain the difference. I am currently using meson v1.7.0. I verified on my two environments, and both could work as expected. Your patch was applied then doc/src/sgml/meson.build was updated from the HEAD. Almalinux 9: ``` $ meson -v 0.63.3 $ ninja clean && meson test --suite doc [2/2] Cleaning Cleaning... 8 files. ninja: Entering directory `/home/hayato/builddir' [8/8] Generating doc/src/sgml/postgres-full.xml with a custom command 1/1 postgresql:doc / doc/sgml_syntax_check OK 0.39s Ok: 1 Expected Fail: 0 Fail: 0 Unexpected Pass: 0 Skipped: 0 Timeout: 0 Full log written to /home/hayato/builddir/meson-logs/testlog.txt ``` Ubuntsu 24.04 LTS ``` $ meson -v 1.3.2 $ ninja clean && meson test --suite doc [2/2] Cleaning Cleaning... 8 files. ninja: Entering directory `/home/hayato/builddir' [8/8] Generating doc/src/sgml/postgres-full.xml with a custom command 1/1 postgresql:doc / doc/sgml_syntax_check OK 0.50s Ok: 1 Expected Fail: 0 Fail: 0 Unexpected Pass: 0 Skipped: 0 Timeout: 0 Full log written to /home/hayato/builddir/meson-logs/testlog.txt ``` As Álvaro pointed out [1], I also think there should be a way to build some needed items. [1]: https://www.postgresql.org/message-id/202602161234.ujd6lsyxxnh4%40alvherre.pgsql Best regards, Hayato Kuroda FUJITSU LIMITED
Hi, On Tue, 17 Feb 2026 at 04:56, Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Nazir, > > > I think we might be using different meson versions, which could > > explain the difference. I am currently using meson v1.7.0. > > I verified on my two environments, and both could work as expected. > Your patch was applied then doc/src/sgml/meson.build was updated > from the HEAD. Thank you for the quick testing! I’m able to reproduce the same behavior on my Ubuntu machine using meson v1.3.2. During my initial testing, I was using meson from the GitHub source by cloning the repository and running meson.py from the repository root. However, this setup didn’t allow me to test correctly because 'meson_bin = find_program(meson_binpath, native: true)' could not locate the binary when using meson.py directly from the meson repository. After correcting the path manually, I was able to reproduce the behavior you described. Then, I confirmed that 'meson test --suite doc' is problematic in meson v[1.7.0, 1.9.1], '--suite' option triggers building the entire source tree whether you set 'depends: postgres_full_xml'. Andres already reviewed the entire patch [1] and there are points which can completely change the structure of the patch. I am planning to address Andres' feedback first, and then revisit whether we still need to add 'depends: postgres_full_xml'. Thanks again for the report! [1] https://postgr.es/m/nxszznj6nkdj2vh5cxvwzpwodsbtui6mqluxnur2y4x7xtkirw%40fbvhy54mluzl -- Regards, Nazir Bilal Yavuz Microsoft
Hi,
Thank you for looking into this!
On Mon, 16 Feb 2026 at 23:38, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2026-02-16 10:39:06 +0300, Nazir Bilal Yavuz wrote:
> > # Run validation only once, common to all subsequent targets. While
> > # we're at it, also resolve all entities (that is, copy all included
> > # files into one big file). This helps tools that don't understand
> > @@ -108,12 +128,15 @@ postgres_full_xml = custom_target('postgres-full.xml',
> > depfile: 'postgres-full.xml.d',
> > command: [xmllint, '--nonet', '--noent', '--valid',
> > '--path', '@OUTDIR@', '-o', '@OUTPUT@', '@INPUT@'],
> > - depends: doc_generated,
> > + depends: [doc_generated, sgml_check_stamp],
> > build_by_default: false,
> > )
> > docs += postgres_full_xml
> > alldocs += postgres_full_xml
>
> Not sure I love that now the entire docs build fails if there is a tab
> anywhere. It's probably ok, but ...
>
> Given you add it as a test for meson and a check for make, why not just make
> sgml_check_stamp a dependency of the test (meson) and check targets (make)?
I followed what Peter suggested here [1]. The idea was basically
running these checks while building the 'postgres.full.xml', not only
running the test.
> > +postgres_full_xml_target = 'postgres_full_xml'
> > +alias_target(postgres_full_xml_target, postgres_full_xml)
>
> Why would we want a top-level postgres_full_xml target?
The idea was using it with the 'meson compile $target' command [2]
which I will explain below.
> > if xsltproc_bin.found()
> > xsltproc_flags = [
> > @@ -308,3 +331,13 @@ endif
> > if alldocs.length() != 0
> > alias_target('alldocs', alldocs)
> > endif
> > +
> > +# Test creation of 'postgres_full_xml'.
> > +test(
> > + 'doc' / 'sgml_syntax_check',
> > + meson_bin,
> > + args: ['compile', postgres_full_xml_target],
> > + suite: 'doc'
> > +)
>
> I'm confused about this 'compile' bit here. What is that intending to do? I
> don't have such a 'compile' command in my PATH, so the test fails here...
My intent was to have a syntax check that will run while we build the
'postgres_full_xml'. And since we already have that, the test just can
try rebuilding the target with the 'meson compile $target' command
[2]. I converted this to:
# The actual test is whether postgres_full_xml can be built, as it has a
# xmllint check and also depends on sgml_check_stamp which performs the syntax
# check.
test(
'doc' / 'sgml_syntax_check',
python,
args: ['-c', ''],
depends: [postgres_full_xml],
suite: 'doc',
)
and removed the top-level postgres_full_xml target. This should work
on any platform now. Does that work for you?
> > diff --git a/doc/src/sgml/sgml_syntax_check.pl b/doc/src/sgml/sgml_syntax_check.pl
> > new file mode 100755
> > index 00000000000..61b4cd203c0
> > --- /dev/null
> > +++ b/doc/src/sgml/sgml_syntax_check.pl
> > @@ -0,0 +1,115 @@
> > +#!/usr/bin/perl
> > +#
> > +# Perl script that checks SGML/XSL syntax and formatting issues.
> > +#
> > +# doc/src/sgml/sgml_syntax_check.pl
> > +#
> > +# Copyright (c) 2026, PostgreSQL Global Development Group
> > +
> > +use strict;
> > +use warnings FATAL => 'all';
> > +use Getopt::Long;
> > +
> > +use File::Find;
> > +
> > +my $srcdir;
> > +my $builddir;
> > +my $dep_file;
> > +my $stamp_file;
> > +
> > +GetOptions(
> > + 'srcdir:s' => \$srcdir,
> > + 'builddir:s' => \$builddir,
> > + 'dep_file:s' => \$dep_file,
> > + 'stamp_file:s' => \$stamp_file) or die "$0: wrong arguments";
>
> I don't think there's a point in having a dep file here. Dependency files are
> only useful if you have dependency on *additional* files to the ones
> explicitly passed in (e.g. a .c file may depend on headers, but the headers
> aren't listed as arguments to the compiler).
We had a chat with Andres off-list about this. Andres thought the
sgml_syntax_check.pl script accepts files as in input rather than
directory. That comment was true if we directly send the file as an
input argument since meson will automatically append this file to the
dependency file. However, we rather send a directory as an input and
manually create a dependency file with the *.sgml and *.xsl files in
the directory.
> > +die "$0: --srcdir must be specified\n" unless defined $srcdir;
> > +
> > +# Find files to process - all the sgml and xsl files (including in subdirectories)
> > +my @files_to_process;
> > +my @dirs_to_search = ($srcdir);
> > +push @dirs_to_search, $builddir if defined $builddir;
> > +find(
> > + sub {
> > + return unless -f $_;
> > + return if $_ !~ /\.(sgml|xsl)$/;
> > + push @files_to_process, $File::Find::name;
> > + },
> > + @dirs_to_search,);
>
> Why is this any more complicated than just opening the file exactly in the
> passed in source dir?
Same as above, sgml_syntax_check.pl script accepts directory as in
input rather than files.
> > +# tabs and non-breaking spaces are harmless, but it is best to avoid them in SGML files
> > +sub check_tabs_and_nbsp
> > +{
> > + my $errors = 0;
> > + for my $f (@files_to_process)
> > + {
> > + open my $fh, "<:encoding(UTF-8)", $f or die "Can't open $f: $!";
> > + my $line_no = 0;
> > + while (<$fh>)
> > + {
> > + $line_no++;
> > + if (/\t/)
> > + {
> > + print STDERR "Tab found in $f:$line_no\n";
> > + $errors++;
> > + }
> > + if (/\xC2\xA0/)
> > + {
> > + print STDERR "$f:$line_no: contains non-breaking space\n";
> > + $errors++;
> > + }
> > + }
> > + close($fh);
> > + }
> > +
> > + if ($errors)
> > + {
> > + remove_stamp_file();
>
> I don't think that's needed, the stamp file needs to be newer than the inputs
> to have an effect, and here you're just removing an old one, no?
Yes, done.
> > +sub create_stamp_file
> > +{
> > + # Avoid touching existing stamp file to prevent unnecessary rebuilds
> > + if (!(-f $stamp_file))
> > + {
> > + open my $fh, '>', $stamp_file
> > + or die "can't open $stamp_file: $!";
> > + close $fh;
> > + }
> > +}
>
> Won't that *cause* rebuilds? With a stamp file you normally want the stamp
> file to be *newer* than the inputs. Which it won't be, if you don't touch it
> here.
>
> The only reason it doesn't cause quick rebuilds with meson is that ninja
> remembers the timestamps of files an avoids rebuilds if they haven't changed.
Yes, you are right. Done.
[1] https://postgr.es/m/02d46e55-418e-4cb2-ad36-c43d6172a010%40eisentraut.org
[2] https://mesonbuild.com/Commands.html#compile
--
Regards,
Nazir Bilal Yavuz
Microsoft