Discussion: Re: split func.sgml to separated individual sgml files


Re: split func.sgml to separated individual sgml files

From
Corey Huinker
Date:
The following is step-by-step logic.


The end result (one file per section) seems good to me.

I suspect that reviewer burden may be the biggest barrier to going forward. Perhaps the changes could be broken up so that each new sect1 file gets its own commit, allowing the reviewer to more easily (if not programmatically) verify that the text that moved out of func.sgml moved into func-sect-foo.sgml.

Granted, the committer will likely squash all of those commits down into one big one, but the hard work of reviewing is done by then.






Re: split func.sgml to separated individual sgml files

From
"David G. Johnston"
Date:
On Wed, Nov 13, 2024 at 1:11 PM Corey Huinker <corey.huinker@gmail.com> wrote:
> The following is step-by-step logic.
>
> The end result (one file per section) seems good to me.
>
> I suspect that reviewer burden may be the biggest barrier to going forward. Perhaps breaking up the changes so that each new sect1 file gets its own commit, allowing the reviewer to more easily (if not programmatically) verify that the text that moved out of func.sgml moved into func-sect-foo.sgml.
>
> Granted, the committer will likely squash all of those commits down into one big one, but the hard work of reviewing is done by then.


Validation is pretty trivial.  I just built the before and after HTML files and confirmed they are exactly the same size.

I suppose we might have lost some comments or something that wouldn't end up visible in the HTML (seems unlikely) but this is basically one-and-done so long as you don't let other commits happen (that touch this area) while you extract and build HEAD and then compare it to the patched build results.  The git diff will let us know the script didn't affect any source files it wasn't supposed to.
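The "did the script touch anything it wasn't supposed to" check can also be done mechanically. The following is only a minimal sketch, not part of the patch: the helper name `unexpected_changes` and the allowed prefix are assumptions made for illustration, and the input is assumed to be the text printed by `git status --porcelain -u`.

```python
def unexpected_changes(porcelain_output, allowed_prefixes):
    """Return changed paths that fall outside the allowed directory prefixes.

    porcelain_output: text produced by `git status --porcelain -u`.
    allowed_prefixes: iterable of path prefixes the script may touch
                      (hypothetical; pick whatever the patch is meant to cover).
    """
    unexpected = []
    for line in porcelain_output.splitlines():
        if not line.strip():
            continue
        # Porcelain v1 lines look like "XY <path>": two status characters,
        # one space, then the path.
        path = line[3:]
        # Renames are reported as "old -> new"; check the destination.
        if " -> " in path:
            path = path.split(" -> ", 1)[1]
        if not path.startswith(tuple(allowed_prefixes)):
            unexpected.append(path)
    return unexpected
```

An empty result for the prefix `doc/src/sgml/` would mean every modified, deleted, or newly created file lives under the documentation source directory. Paths that git quotes because of special characters are not handled by this sketch.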

In short, ready to commit (see last paragraph below however), but the committer will need to run the python script at the time of commit on the then-current tree.

In my recent patch touching filelist.sgml I would be placing this new %allfiles_func; line pairing at the top just beneath %allfiles; which is the first child element.  But the choice made here makes sense should this go in first.

There is little downside, though, to renaming the existing %allfiles; to %allfiles_ref; since it's a local-only name.

David J.

Re: split func.sgml to separated individual sgml files

From
jian he
Date:
Hi.

After running the v2 python script and ``git apply
v2-0001-update-filelist.sgml-allfiles.sgml.no-cfbot``,
``git status -u`` shows:

Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   doc/src/sgml/filelist.sgml
        deleted:    doc/src/sgml/func.sgml

That means that to verify the changes, we only need to verify the HTML files
related to "functions".

I used GNU diff to compare the HTML output of doc/src/sgml/func.sgml generated
from the master branch against the HTML files produced by the patch.
Here $DOC9 is the HTML directory of the patched tree (split func.sgml) and
$DOC5 is the HTML directory of the master branch.  diff produced no output,
which means the output produced by the patch (with the script) is the
same as that of the master branch.

diff $DOC5/functions.html $DOC9/functions.html
diff $DOC5/functions-logical.html $DOC9/functions-logical.html
diff $DOC5/functions-comparison.html $DOC9/functions-comparison.html
diff $DOC5/functions-math.html $DOC9/functions-math.html
diff $DOC5/functions-string.html $DOC9/functions-string.html
diff $DOC5/functions-binarystring.html $DOC9/functions-binarystring.html
diff $DOC5/functions-matching.html $DOC9/functions-matching.html
diff $DOC5/functions-formatting.html $DOC9/functions-formatting.html
diff $DOC5/functions-datetime.html $DOC9/functions-datetime.html
diff $DOC5/functions-enum.html $DOC9/functions-enum.html
diff $DOC5/functions-geometry.html $DOC9/functions-geometry.html
diff $DOC5/functions-net.html $DOC9/functions-net.html
diff $DOC5/functions-textsearch.html $DOC9/functions-textsearch.html
diff $DOC5/functions-uuid.html $DOC9/functions-uuid.html
diff $DOC5/functions-xml.html $DOC9/functions-xml.html
diff $DOC5/functions-json.html $DOC9/functions-json.html
diff $DOC5/functions-sequence.html $DOC9/functions-sequence.html
diff $DOC5/functions-conditional.html $DOC9/functions-conditional.html
diff $DOC5/functions-array.html $DOC9/functions-array.html
diff $DOC5/functions-range.html $DOC9/functions-range.html
diff $DOC5/functions-aggregate.html $DOC9/functions-aggregate.html
diff $DOC5/functions-window.html $DOC9/functions-window.html
diff $DOC5/functions-merge-support.html $DOC9/functions-merge-support.html
diff $DOC5/functions-subquery.html $DOC9/functions-subquery.html
diff $DOC5/functions-comparisons.html $DOC9/functions-comparisons.html
diff $DOC5/functions-srf.html $DOC9/functions-srf.html
diff $DOC5/functions-info.html $DOC9/functions-info.html
diff $DOC5/functions-admin.html $DOC9/functions-admin.html
diff $DOC5/functions-trigger.html $DOC9/functions-trigger.html
diff $DOC5/functions-event-triggers.html $DOC9/functions-event-triggers.html
diff $DOC5/functions-statistics.html $DOC9/functions-statistics.html
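The same comparison can be run without listing every file by hand. This is a minimal sketch under stated assumptions: the function name `differing_html` and the `functions*.html` glob are illustrative choices, and the two arguments play the role of $DOC5 and $DOC9 above.

```python
import filecmp
from pathlib import Path


def differing_html(master_dir, patched_dir, pattern="functions*.html"):
    """Return names of HTML files that differ or exist on only one side."""
    master_dir, patched_dir = Path(master_dir), Path(patched_dir)
    names = {p.name for p in master_dir.glob(pattern)}
    names |= {p.name for p in patched_dir.glob(pattern)}

    differing = []
    for name in sorted(names):
        a, b = master_dir / name, patched_dir / name
        # shallow=False forces a byte-by-byte comparison rather than
        # trusting size and modification time from os.stat().
        if not (a.is_file() and b.is_file() and filecmp.cmp(a, b, shallow=False)):
            differing.append(name)
    return differing
```

An empty list corresponds to every `diff` invocation above printing nothing. Unlike the explicit list, the glob also catches a page that was produced by only one of the two builds.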



Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:


On 2025-07-29 Tu 2:15 AM, jian he wrote:
> hi.
>
> after run the v2 python script and ``git apply
> v2-0001-update-filelist.sgml-allfiles.sgml.no-cfbot``
> git status -u
> shows:
>
> Changes not staged for commit:
>   (use "git add/rm <file>..." to update what will be committed)
>   (use "git restore <file>..." to discard changes in working directory)
>         modified:   doc/src/sgml/filelist.sgml
>         deleted:    doc/src/sgml/func.sgml
>
> That means to verify the changes, we only need to verify html files
> related to "functions".
>
> I use GNU diff to compare the HTML output of doc/src/sgml/func.sgml generated
> from the master branch against the HTML file produced by the patch.
> For example, $DOC9 is the PATCH (split func.sgml) html file directory, $DOC5 is
> the master branch html file directory.  and no message produced while running
> diff, which means the patch (with the script) produced output is the
> same as the master branch.

[snip]


OK. I'm inclined to do this after the CF finishes, to avoid collisions with other patches. I assume it's going to make the CFbot fairly unhappy.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: split func.sgml to separated individual sgml files

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> OK. I'm inclined to do this after the CF finishes, to avoid collisions 
> with other patches. I assume it's going to make the CFbot fairly unhappy.

+1 for proceeding that way.  (I did not look at whether the proposed
changes are sane, but I agree that this'll inevitably break a lot of
pending patches.)

            regards, tom lane



Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-07-29 Tu 11:40 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> OK. I'm inclined to do this after the CF finishes, to avoid collisions
>> with other patches. I assume it's going to make the CFbot fairly unhappy.
> +1 for proceeding that way.  (I did not look at whether the proposed
> changes are sane, but I agree that this'll inevitably break a lot of
> pending patches.)
>
>             


Done.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: split func.sgml to separated individual sgml files

From
Florents Tselai
Date:


On 4 Aug 2025, at 4:09 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 2025-07-29 Tu 11:40 AM, Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> OK. I'm inclined to do this after the CF finishes, to avoid collisions
>>> with other patches. I assume it's going to make the CFbot fairly unhappy.
>> +1 for proceeding that way.  (I did not look at whether the proposed
>> changes are sane, but I agree that this'll inevitably break a lot of
>> pending patches.)
>
> Done.


While working on this https://commitfest.postgresql.org/patch/6020/
I discovered that when changing func/func-aggregate.sgml, the HTML wasn’t marked for update.

IIUC the doc/Makefile should be updated as attached, right?
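The symptom can be pictured with make's basic rule: a target is rebuilt only when one of its listed prerequisites is newer. The sketch below is an illustration only. It assumes, without confirmation from the thread, that the prerequisite list was built from a non-recursive wildcard that does not reach the new func/ subdirectory; the helper names `needs_rebuild` and `sgml_sources` are hypothetical.

```python
from pathlib import Path


def needs_rebuild(target, prerequisites):
    """Mimic make: rebuild if the target is missing or older than any prerequisite."""
    target = Path(target)
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime
    return any(Path(p).stat().st_mtime > target_mtime for p in prerequisites)


def sgml_sources(srcdir, include_func_subdir):
    """Collect SGML prerequisites, optionally including the func/ subdirectory."""
    srcdir = Path(srcdir)
    # A plain "*.sgml" pattern is non-recursive, so func/*.sgml is not matched.
    files = sorted(srcdir.glob("*.sgml"))
    if include_func_subdir:
        files += sorted(srcdir.glob("func/*.sgml"))
    return files
```

With only the top-level pattern, an edit to func/func-aggregate.sgml never enters the comparison, so the stamp file still looks current; adding the subdirectory pattern makes the same edit trigger a rebuild.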

Attachments

Re: split func.sgml to separated individual sgml files

From
"Euler Taveira"
Date:
On Mon, Sep 1, 2025, at 7:35 AM, Florents Tselai wrote:
> While working on this https://commitfest.postgresql.org/patch/6020/
> I discovered that when changing for func/func-aggregate.sgml, the HTML
> wasn’t marked for update.
>
> IIUC the doc/Makefile should be updated as attached, right ?
>

Good catch.

However, your patch doesn't fix all issues. The check target (check-tabs and
check-nbsp) is broken; these targets should also include the func files.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
Attachments

Re: split func.sgml to separated individual sgml files

From
Florents Tselai
Date:


On 1 Sep 2025, at 4:35 PM, Euler Taveira <euler@eulerto.com> wrote:
> On Mon, Sep 1, 2025, at 7:35 AM, Florents Tselai wrote:
>> While working on this https://commitfest.postgresql.org/patch/6020/
>> I discovered that when changing for func/func-aggregate.sgml, the HTML
>> wasn’t marked for update.
>>
>> IIUC the doc/Makefile should be updated as attached, right ?
>
> Good catch.
>
> However, your patch doesn't fix all issues. The check target (check-tabs and
> check-nbsp) is broken; these targets should also include the func files.


Ah, you’re right, but then again, I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
v3 does that.
Note that GENERATED_SGML weren't included in these two targets, but I think there's no harm in checking them too.



Attachments

Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:


On 2025-09-01 Mo 11:44 AM, Florents Tselai wrote:
> On 1 Sep 2025, at 4:35 PM, Euler Taveira <euler@eulerto.com> wrote:
>> On Mon, Sep 1, 2025, at 7:35 AM, Florents Tselai wrote:
>>> While working on this https://commitfest.postgresql.org/patch/6020/
>>> I discovered that when changing for func/func-aggregate.sgml, the HTML
>>> wasn’t marked for update.
>>>
>>> IIUC the doc/Makefile should be updated as attached, right ?
>>
>> Good catch.
>>
>> However, your patch doesn't fix all issues. The check target (check-tabs and
>> check-nbsp) is broken; these targets should also include the func files.
>
> Ah, you’re right, but then again,  I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
> v3 does that.
> Note that GENERATED_SGML weren't included in these two targets but I think there's no harm in checking them too.




Do we actually care about those? I don't want to add needless cycles anywhere. I note that the meson.build doesn't appear to have a check target at all, or anything that looks for hard tabs or nbsps. Those checks were added to the Makefile back in October in commit 5b7da5c261d, but that got missed even though Daniel had mentioned it in the discussion thread.[1]


cheers


andrew


[1] https://www.postgresql.org/message-id/F7102912-0BDA-42A3-BDCF-8A4CBD1CC688%40yesql.se

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: split func.sgml to separated individual sgml files

From
Florents Tselai
Date:



On Tue, Sep 2, 2025 at 5:54 PM Andrew Dunstan <andrew@dunslane.net> wrote:
> On 2025-09-01 Mo 11:44 AM, Florents Tselai wrote:
>> On 1 Sep 2025, at 4:35 PM, Euler Taveira <euler@eulerto.com> wrote:
>>> On Mon, Sep 1, 2025, at 7:35 AM, Florents Tselai wrote:
>>>> While working on this https://commitfest.postgresql.org/patch/6020/
>>>> I discovered that when changing for func/func-aggregate.sgml, the HTML
>>>> wasn’t marked for update.
>>>>
>>>> IIUC the doc/Makefile should be updated as attached, right ?
>>>
>>> Good catch.
>>>
>>> However, your patch doesn't fix all issues. The check target (check-tabs and
>>> check-nbsp) is broken; these targets should also include the func files.
>>
>> Ah, you’re right, but then again,  I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
>> v3 does that.
>> Note that GENERATED_SGML weren't included in these two targets but I think there's no harm in checking them too.
>
> Do we actually care about those? I don't want to add needless cycles anywhere. I note that the meson.build doesn't appear to have a check target at all, or anything that looks for hard tabs or nbsps. Those checks were added to the Makefile back in October in commit 5b7da5c261d, but that got missed even though Daniel had mentioned it in the discussion thread.[1]


From the message and discussion in 5b7da5c261d it looks like we do,
and I've seen some messages here and there showing that people do have trouble applying patches due to spurious whitespace
and special chars.
So I assume the better solution would be to have such checks in meson too.

Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Tue, 2 Sept 2025 at 17:54, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> > Ah, you’re right, but then again,  I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
> > v3 does that.
> > Note that GENERATED_SGML weren't included in these two targets but I think there's no harm in checking them too.
>
> Do we actually care about those? I don't want to add needless cycles anywhere. I note that the meson.build doesn't
> appear to have a check target at all, or anything that looks for hard tabs or nbsps. Those checks were added to the
> Makefile back in October in commit 5b7da5c261d, but that got missed even though Daniel had mentioned it in the
> discussion thread.[1]

I have been working on running these checks under the Meson build
system. To do this, I converted the checks into a Perl script
(sgml_syntax_check) and ran it under both the Makefile and Meson builds.
In Meson the test is named 'sgml_syntax_check'. One difference I
noticed: I could not find a way in Meson to create a test that does
not run by default. As a result, this syntax test runs every time you
run 'meson test'. This behaviour differs from Autoconf, but I
think it is acceptable.

Additionally, some of the CI OSes were missing docbook-xml, but it has
now been installed.

I did not create a new thread for this; I can create one if you think
that would be better.

CI run with the attached patch applied:
https://cirrus-ci.com/build/6610354173640704

--
Regards,
Nazir Bilal Yavuz
Microsoft

Attachments

Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-09-12 Fr 10:12 AM, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Tue, 2 Sept 2025 at 17:54, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Ah, you’re right, but then again,  I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
>>> v3 does that.
>>> Note that GENERATED_SGML weren't included in these two targets but I think there's no harm in checking them too.
>>
>> Do we actually care about those? I don't want to add needless cycles anywhere. I note that the meson.build doesn't
>> appear to have a check target at all, or anything that looks for hard tabs or nbsps. Those checks were added to the
>> Makefile back in October in commit 5b7da5c261d, but that got missed even though Daniel had mentioned it in the
>> discussion thread.[1]
>
> I have been working on running these checks under the Meson build
> system.


Thanks for this!


> To do this, I converted the checks into a Perl script
> (sgml_syntax_check) and ran it against both the Makefile and Meson.
> Test's name is 'sgml_syntax_check' in the Meson. One difference I
> noticed: I could not find a way in Meson to create a test that does
> not run by default. As a result, this syntax test runs every time you
> run the 'meson test'. This behaviour differs from Autoconf, but I
> think it is acceptable.


Yes, I think so too.


>
> Additionally, some of the CI OSes were missing docbook-xml; but it has
> now been installed.
>
> I did not create a new thread for that, I can create one if you think
> that it would be better.
>
> CI run with the attached patch applied:
> https://cirrus-ci.com/build/6610354173640704
>

I am away this coming week, will check it out in detail when I return.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-09-12 Fr 10:12 AM, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Tue, 2 Sept 2025 at 17:54, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Ah, you’re right, but then again,  I’d expect ALL_SGML to be used consistently, but it isn't and I didn't check.
>>> v3 does that.
>>> Note that GENERATED_SGML weren't included in these two targets but I think there's no harm in checking them too.
>>
>> Do we actually care about those? I don't want to add needless cycles anywhere. I note that the meson.build doesn't
>> appear to have a check target at all, or anything that looks for hard tabs or nbsps. Those checks were added to the
>> Makefile back in October in commit 5b7da5c261d, but that got missed even though Daniel had mentioned it in the
>> discussion thread.[1]
>
> I have been working on running these checks under the Meson build
> system. To do this, I converted the checks into a Perl script
> (sgml_syntax_check) and ran it against both the Makefile and Meson.
> Test's name is 'sgml_syntax_check' in the Meson. One difference I
> noticed: I could not find a way in Meson to create a test that does
> not run by default. As a result, this syntax test runs every time you
> run the 'meson test'. This behaviour differs from Autoconf, but I
> think it is acceptable.
>
> Additionally, some of the CI OSes were missing docbook-xml; but it has
> now been installed.
>
> I did not create a new thread for that, I can create one if you think
> that it would be better.
>
> CI run with the attached patch applied:
> https://cirrus-ci.com/build/6610354173640704


Hi Bilal,

This got preempted slightly by Tom's commit 170a8a3f460, but I think 
it's worth doing. I tried to simplify it some. See attached. There 
doesn't seem to me to be any point in using a different set of files for 
the tab tests and the NBSP tests. If we use the same set of files we can 
improve the efficiency easily by opening them only once. Here we just 
look for all the sgml files and all the xsl files and process them all.

WDYT?
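The single-pass idea can be sketched as follows. This is not the attached Perl script: it is a minimal Python illustration in which the function name `find_whitespace_problems`, the returned tuple layout, and the UTF-8 assumption are all hypothetical. Each matching file is opened once, and both checks are applied to every line.

```python
from pathlib import Path

NBSP = "\u00a0"  # non-breaking space, assuming the sources are UTF-8


def find_whitespace_problems(root, suffixes=(".sgml", ".xsl")):
    """Scan every matching file under root once, reporting tabs and NBSPs.

    Returns a list of (relative_path, line_number, kind) tuples,
    where kind is "tab" or "nbsp".
    """
    root = Path(root)
    problems = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        rel = path.relative_to(root).as_posix()
        with path.open(encoding="utf-8") as fh:
            # One pass per file: both checks look at the same line.
            for line_no, line in enumerate(fh, start=1):
                if "\t" in line:
                    problems.append((rel, line_no, "tab"))
                if NBSP in line:
                    problems.append((rel, line_no, "nbsp"))
    return problems
```

Reporting the file and line number, rather than echoing the offending line, keeps the message readable even when the line consists only of whitespace.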



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments

Re: split func.sgml to separated individual sgml files

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> On 2025-09-12 Fr 10:12 AM, Nazir Bilal Yavuz wrote:
>> Test's name is 'sgml_syntax_check' in the Meson. One difference I
>> noticed: I could not find a way in Meson to create a test that does
>> not run by default. As a result, this syntax test runs every time you
>> run the 'meson test'. This behaviour differs from Autoconf, but I
>> think it is acceptable.

Actually, I've been meaning to complain about the fact that these
checks aren't run by the default Makefile target.  I never remember
that there is a separate "check" target, and even if I did remember
it's mostly useless to me because I always want to look at the
rendered HTML.  So when I'm working on the docs I always just say
"make" in the doc/src/sgml directory.  It'd be helpful, at least to
me, if the default target ran the tabs and nbsp checks.  It already
does run xmllint, so that change could probably be integrated with
what you've done here without too much trouble.

> This got preempted slightly by Tom's commit 170a8a3f460, but I think 
> it's worth doing. I tried to simplify it some. See attached. There 
> doesn't seem to me to be any point in using a different set of files for 
> the tab tests and the NBSP tests. If we use the same set of files we can 
> improve the efficiency easily by opening them only once. Here we just 
> look for all the sgml files and all the xsl files and process them all.

+1 for merging those two checks into one pass, especially if we're
to run them by default.

            regards, tom lane



Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Tue, 30 Sept 2025 at 22:48, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> Hi Bilal,
>
> This got preempted slightly by Tom's commit 170a8a3f460, but I think
> it's worth doing. I tried to simplify it some. See attached. There
> doesn't seem to me to be any point in using a different set of files for
> the tab tests and the NBSP tests. If we use the same set of files we can
> improve the efficiency easily by opening them only once. Here we just
> look for all the sgml files and all the xsl files and process them all.
>
> WDYT?

It looks good to me. I made 2 changes to your patch:

1- The declaration of $line_no was lost; I re-added it.
2- s/.cirrus.tasks,yml/.cirrus.tasks.yml/ in the commit message.

-- 
Regards,
Nazir Bilal Yavuz
Microsoft

Attachments

Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Wed, 1 Oct 2025 at 15:09, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> On Tue, 30 Sept 2025 at 22:48, Andrew Dunstan <andrew@dunslane.net> wrote:
> >
> > Hi Bilal,
> >
> > This got preempted slightly by Tom's commit 170a8a3f460, but I think
> > it's worth doing. I tried to simplify it some. See attached. There
> > doesn't seem to me to be any point in using a different set of files for
> > the tab tests and the NBSP tests. If we use the same set of files we can
> > improve the efficiency easily by opening them only once. Here we just
> > look for all the sgml files and all the xsl files and process them all.
> >
> > WDYT?
>
> It looks good to me. I made 2 changes to your patch:
>
> 1- Declaration of $line_no is lost, I re-added it.
> 2- s/.cirrus.tasks,yml/.cirrus.tasks.yml/ in the commit message.

Two more minor changes that I missed in v2:

1- I added $line_no and removed $_ from the tab check's warning
message. I think it is better this way; otherwise, if the line contains
only a tab character, $_ will print an empty-looking line.
2- s/Tabsand/Tabs and/

-- 
Regards,
Nazir Bilal Yavuz
Microsoft

Attachments

Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-10-01 We 8:27 AM, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Wed, 1 Oct 2025 at 15:09, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>> On Tue, 30 Sept 2025 at 22:48, Andrew Dunstan <andrew@dunslane.net> wrote:
>>> Hi Bilal,
>>>
>>> This got preempted slightly by Tom's commit 170a8a3f460, but I think
>>> it's worth doing. I tried to simplify it some. See attached. There
>>> doesn't seem to me to be any point in using a different set of files for
>>> the tab tests and the NBSP tests. If we use the same set of files we can
>>> improve the efficiency easily by opening them only once. Here we just
>>> look for all the sgml files and all the xsl files and process them all.
>>>
>>> WDYT?
>> It looks good to me. I made 2 changes to your patch:
>>
>> 1- Declaration of $line_no is lost, I re-added it.
>> 2- s/.cirrus.tasks,yml/.cirrus.tasks.yml/ in the commit message.
> Two more minor changes that I missed in the v2:
>
> 1- I added $line_no and removed $_ from the tab check's warning
> message. I think it is better this way, otherwise if the line only
> contains tab character; $_ will print an empty looking line.
> 2- s/Tabsand/Tabs and/
>

OK, thanks, looks good. How do we go about doing what Tom wants (i.e.
running the tests by default) under meson? I think in the Makefile we
could just add it to the html target.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Wed, 1 Oct 2025 at 23:02, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2025-10-01 We 8:27 AM, Nazir Bilal Yavuz wrote:
> > Hi,
> >
> > On Wed, 1 Oct 2025 at 15:09, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> >> On Tue, 30 Sept 2025 at 22:48, Andrew Dunstan <andrew@dunslane.net> wrote:
> >>> Hi Bilal,
> >>>
> >>> This got preempted slightly by Tom's commit 170a8a3f460, but I think
> >>> it's worth doing. I tried to simplify it some. See attached. There
> >>> doesn't seem to me to be any point in using a different set of files for
> >>> the tab tests and the NBSP tests. If we use the same set of files we can
> >>> improve the efficiency easily by opening them only once. Here we just
> >>> look for all the sgml files and all the xsl files and process them all.
> >>>
> >>> WDYT?
> >> It looks good to me. I made 2 changes to your patch:
> >>
> >> 1- Declaration of $line_no is lost, I re-added it.
> >> 2- s/.cirrus.tasks,yml/.cirrus.tasks.yml/ in the commit message.
> > Two more minor changes that I missed in the v2:
> >
> > 1- I added $line_no and removed $_ from the tab check's warning
> > message. I think it is better this way, otherwise if the line only
> > contains tab character; $_ will print an empty looking line.
> > 2- s/Tabsand/Tabs and/
> >
>
> OK, thanks, looks good. How do we go about doing what Tom wants (i.e.
> running the tests by default) under meson. I think in the Makefile we
> could just add it to the html target.

I might be misunderstanding, but with this patch these syntax checks
already run by default under the meson build. Would we just need to add
this test to the html target in the Makefile?

-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:
On 2025-10-02 Th 2:58 AM, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Wed, 1 Oct 2025 at 23:02, Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>> On 2025-10-01 We 8:27 AM, Nazir Bilal Yavuz wrote:
>>> Hi,
>>>
>>> On Wed, 1 Oct 2025 at 15:09, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
>>>> On Tue, 30 Sept 2025 at 22:48, Andrew Dunstan <andrew@dunslane.net> wrote:
>>>>> Hi Bilal,
>>>>>
>>>>> This got preempted slightly by Tom's commit 170a8a3f460, but I think
>>>>> it's worth doing. I tried to simplify it some. See attached. There
>>>>> doesn't seem to me to be any point in using a different set of files for
>>>>> the tab tests and the NBSP tests. If we use the same set of files we can
>>>>> improve the efficiency easily by opening them only once. Here we just
>>>>> look for all the sgml files and all the xsl files and process them all.
>>>>>
>>>>> WDYT?
>>>> It looks good to me. I made 2 changes to your patch:
>>>>
>>>> 1- Declaration of $line_no is lost, I re-added it.
>>>> 2- s/.cirrus.tasks,yml/.cirrus.tasks.yml/ in the commit message.
>>> Two more minor changes that I missed in the v2:
>>>
>>> 1- I added $line_no and removed $_ from the tab check's warning
>>> message. I think it is better this way, otherwise if the line only
>>> contains tab character; $_ will print an empty looking line.
>>> 2- s/Tabsand/Tabs and/
>>>
>> OK, thanks, looks good. How do we go about doing what Tom wants (i.e.
>> running the tests by default) under meson. I think in the Makefile we
>> could just add it to the html target.
> I might be misunderstanding, but these syntax checks already run by
> default under meson build with this patch. Would we just need to add
> this test to the HTML target in the Makefile?
>

Oh, ok, I missed that about meson. I will adjust the Makefile.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: split func.sgml to separated individual sgml files

From
Nazir Bilal Yavuz
Date:
Hi,

On Thu, 2 Oct 2025 at 15:27, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> Oh, ok, I missed that about meson. I will adjust the Makefile.

I think there is one more problem that we need to think about. This
test runs when xmllint is enabled, but it also requires docbook
(docbook-xml on some OSes) to be installed; otherwise the test fails
with 'I/O error : Attempt to load network entity
http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd'. I think that
we need to skip this test if docbook cannot be found on the
system. Otherwise it would be a hassle for most people and
buildfarm members. What do you think about this?

-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Re: split func.sgml to separated individual sgml files

From
Andrew Dunstan
Date:


On 2025-10-02 Th 8:52 AM, Nazir Bilal Yavuz wrote:
> Hi,
>
> On Thu, 2 Oct 2025 at 15:27, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Oh, ok, I missed that about meson. I will adjust the Makefile.
> I think there is one more problem that we need to think about. This
> test runs when the xmllint is enabled but it also requires docbook
> (docbook-xml on some OSes) to be installed, otherwise the test fails
> with 'I/O error : Attempt to load network entity
> http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd'. I think that
> we need to skip this test if the docbook can not be found in the
> system. Otherwise that would be a hassle for most of the people and
> buildfarm members. What do you think about this?


Oops, missed seeing this earlier. Yes, I think we need to skip the test in the meson case. Probably nothing more needed for the Makefile.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: split func.sgml to separated individual sgml files

From
Peter Eisentraut
Date:
(Maybe these discussions could have been in a new thread and not hidden
under some unrelated thing.)

On 01.10.25 22:02, Andrew Dunstan wrote:
> OK, thanks, looks good. How do we go about doing what Tom wants (i.e.
> running the tests by default) under meson. I think in the Makefile we
> could just add it to the html target.

-html: html-stamp
+html: check html-stamp

This is not a good solution.  This means the html target is never up to 
date.  Compare PostgreSQL 18:

$ make html
make: Nothing to be done for 'html'.
$ make -q html; echo $?
0

And master:

$ make html
perl ...
$ make -q html; echo $?
1

Also, consider the postgres-full.xml target:

# Run validation only once, common to all subsequent targets.  While
# we're at it, also resolve all entities (that is, copy all included
# files into one big file).  This helps tools that don't understand
# vpath builds (such as dbtoepub).
postgres-full.xml: postgres.sgml $(ALL_SGML)
     $(XMLLINT) $(XMLINCLUDE) --output $@ --noent --valid $<

Note that this already does validation.  The way this is structured now
is that it runs the validation once when you create postgres-full.xml,
which is then later input into the HTML generation, and then you run the
validation again, on the already-processed input files, which doesn't
make any sense.

I suspect what you're really after here is the functionality of the 
check-tabs and check-nbsp targets.  So the new Perl script really just 
has to cover those two and doesn't have to bother with xmllint.  And 
then you just call that script as part of the postgres-full.xml target.




Re: split func.sgml to separated individual sgml files

From
Tom Lane
Date:
Peter Eisentraut <peter@eisentraut.org> writes:
> I suspect what you're really after here is the functionality of the 
> check-tabs and check-nbsp targets.  So the new Perl script really just 
> has to cover those two and doesn't have to bother with xmllint.  And 
> then you just call that script as part of the postgres-full.xml target.

Yeah, that's what I was imagining: replace the xmllint call in
postgres-full.xml with this new script that will also run the
tab/nbsp checks.

            regards, tom lane