Thread: Diff of this page with other version

Diff of this page with other version

From
Marcos Pegoraro
Date:
I think we have all found it difficult to spot small differences when navigating between different versions of the same doc page. I'm not talking about completely new doc pages, like merge.sgml from version 14 to 15, but about pages whose source file changed only slightly. As an example, consider copy.sgml.

Version 16 has this text
    NULL '<replaceable class="parameter">null_string</replaceable>'
    DEFAULT '<replaceable class="parameter">default_string</replaceable>'
    HEADER [ <replaceable class="parameter">boolean</replaceable> | MATCH ]
    QUOTE '<replaceable class="parameter">quote_character</replaceable>'

But version 15 has this
    NULL '<replaceable class="parameter">null_string</replaceable>'
    HEADER [ <replaceable class="parameter">boolean</replaceable> | MATCH ]
    QUOTE '<replaceable class="parameter">quote_character</replaceable>'

As you can see, the DEFAULT line was added in version 16, but it is not easy to see what changed between the two versions.
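
The comparison described above can be sketched with a line-level diff. This is only an illustration in Python using the standard difflib module; the helper name `added_lines` is made up for this example, and the two fragments are the synopsis lines quoted above, not the full copy.sgml files.

```python
import difflib

# Synopsis fragments quoted above (copy.sgml, versions 15 and 16).
V15 = [
    "NULL '<replaceable class=\"parameter\">null_string</replaceable>'",
    "HEADER [ <replaceable class=\"parameter\">boolean</replaceable> | MATCH ]",
    "QUOTE '<replaceable class=\"parameter\">quote_character</replaceable>'",
]
V16 = [
    "NULL '<replaceable class=\"parameter\">null_string</replaceable>'",
    "DEFAULT '<replaceable class=\"parameter\">default_string</replaceable>'",
    "HEADER [ <replaceable class=\"parameter\">boolean</replaceable> | MATCH ]",
    "QUOTE '<replaceable class=\"parameter\">quote_character</replaceable>'",
]


def added_lines(old, new):
    """Return the lines that appear in `new` but not in `old`, in order."""
    # ndiff prefixes every output line with a two-character marker;
    # "+ " marks a line that exists only in the second sequence.
    return [line[2:] for line in difflib.ndiff(old, new) if line.startswith("+ ")]


for line in added_lines(V15, V16):
    print("added in 16:", line)
```

A line-level diff like this only handles the easy case where a whole line was added; rewording inside a line needs a finer granularity.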

Another example would be SQL/JSON Path Operators And Methods in func.sgml, comparing version 16 and devel. There are new methods boolean(), decimal(), bigint(), timestamp(), timestamptz() and some others, but it is not easy to see that they do not exist in version 16 and will only be there when version 17 is released.

One can say that I have to read the release notes before I upgrade a cluster, because that page shows all important features and changes between versions. But sometimes the change is a small piece of information that just gives a better understanding of a feature, or maybe we have several versions running and are unsure which small feature exists in which version.

So it would be interesting if we could visually see what changed between two pages. What I propose is something like what you get from a diff tool, but in a single page, not side by side.

An easy way to do that would be to wrap all changed text in a tag like
<span class = "v16" style="background-color:green">Here goes changed or new text</span>

Obviously, when a commit is done the committer has to add this span tag to that commit, so that part would be colored with a green background only when page 16 is compared with previous ones.

Users would select this option as follows. Every doc page has in its title Supported Versions: 16 / 15 / 14 / 13 / 12. To the right of that we could have options like Compare Version: rb16 / rb15 / rb14 / rb13 / rb12. If rb16, rb15 and so on are radio buttons, I can compare the current page with one of the previous versions, one comparison at a time.

This comparison would be only with previous versions. If we are on page 13, the radio buttons are 13, 12, 11 and so on. You would never compare 13 with a greater version, and if you compare 13 with 13, obviously nothing changes in the appearance of that page.

To show those changes we need just a small piece of JavaScript that repaints the page we are viewing with those green colors, depending on the version you are on and the version you are comparing to.

I know we would have to add these tags to all the files we already have, but this can be done with some regex search tool applied to all files.

So, would this change to all doc pages be relevant?
If you agree with me, then we can consider whether a conversion tool is needed or a search and replace is enough.

The attached image shows what it would look like.
Attachments

Re: Diff of this page with other version

From
Alvaro Herrera
Date:
On 2024-Mar-02, Marcos Pegoraro wrote:

> Obviously when a commit is done the committer has to add this span tag to
> that commit, so that part would be colored with green background only if
> page 16 is compared with previous ones.

I think your proposal is a reasonable idea and a very convenient service
for users ... but there's zero chance that committers are going to
accept the additional work and the resulting uglification of the
document source.

If it can be done by post-processing the XML and finding the
differences, to automatically insert some markup that lets the UI show
the differences as you suggest, then we can discuss ways to integrate
that.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"I must say, I am absolutely impressed with what pgsql's implementation of
VALUES allows me to do. It's kind of ridiculous how much "work" goes away in
my code.  Too bad I can't do this at work (Oracle 8/9)."       (Tom Allison)
           http://archives.postgresql.org/pgsql-general/2007-06/msg00016.php



Re: Diff of this page with other version

From
Erik Wienhold
Date:
On 2024-03-02 15:35 +0100, Marcos Pegoraro wrote:
> I think it is so common to all that it is difficult to see small diffs when
> navigating through doc pages of different versions of the same page. I'm
> not talking about completely new doc pages, like merge.sgml from version 14
> to 15, but those ones which have small diffs on that file. As an example
> consider copy.sgml

+1 for the general idea.

> Version 16 has this text
>     NULL '<replaceable class="parameter">null_string</replaceable>'
>     DEFAULT '<replaceable class="parameter">default_string</replaceable>'
>     HEADER [ <replaceable class="parameter">boolean</replaceable> | MATCH ]
>     QUOTE '<replaceable class="parameter">quote_character</replaceable>'
> 
> But version 15 has this
>     NULL '<replaceable class="parameter">null_string</replaceable>'
>     HEADER [ <replaceable class="parameter">boolean</replaceable> | MATCH ]
>     QUOTE '<replaceable class="parameter">quote_character</replaceable>'
> 
> As you can see, the DEFAULT line was added on version 16, but it is not
> easy to see what was changed on both versions.
> 
> Another example would be SQL/JSON Path Operators And Methods of func.sgml
> of version 16 and devel. There are new methods boolean(), decimal(),
> bigint(), timestamp(), timestamptz() and some others but they are not easy
> to see that they don't exist in version 16 but would be there when version
> 17 comes in.

I think that it's relatively easy to add version info to the <entry>
elements in func.sgml.  But only focusing on additions to the docs
totally misses changes to the text itself, which is sometimes reworded
instead of just adding some paragraphs.

> One can say that I have to read release notes before I upgrade a cluster
> because that page shows all important features and changes between
> versions. But sometimes this is a small info that just shows a better
> understanding of that feature or maybe we have several versions running and
> have doubts of what small feature exists in what version.
> 
> So, it would be interesting if we could visually see what was changed on
> both pages. Then, what I propose is something like you have when using a
> diff tool,  but in a single page, not side by side.
> 
> An easy way to do that would be add on all changed text a tag like
> <span class = "v16" style="background-color:green">Here goes changed or new
> text</span>
> 
> Obviously when a commit is done the committer has to add this span tag to
> that commit, so that part would be colored with green background only if
> page 16 is compared with previous ones.

The HTML is generated from the SGML docs.  So version info has to be
added there.  Or we use some automated tool to get the diff of the
rendered HTML.  W3C has a diff tool[1][2] based on GNU diffutils.  See
[3] for a diff of CREATE TABLE between v15 and v16.

> I know we have to put these tags to all files we already have but this can
> be done with some regex search tool to do this change to all files.

I think you underestimate the effort because the changes are not always
as clear (adding one line) as for copy.sgml as you've shown above.  For
example, if I check create_table.sgml between v15 and v16, I see that
the addition of STORAGE modified an existing line.  So that regex has to
match and wrap just the relevant substring.  When new storage modes are
added we end up with nested version info in order to compare vXX with
pre v16:

    <span class="v16">STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN | <span class="vXX">FOOBAR</span> | DEFAULT }</span>
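
To make the "wrap just the relevant substring" point concrete, here is a rough Python sketch that does it with a token-level diff instead of a regex. The function name `mark_insertions` and the whitespace tokenisation are assumptions made for this example; FOOBAR and vXX are the placeholders from the text above, not real storage modes or versions.

```python
import difflib
import html


def mark_insertions(old, new, css_class):
    """Wrap tokens that are new or changed in `new` in a <span> with `css_class`."""
    old_tokens = old.split()
    new_tokens = new.split()
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    pieces = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        chunk = " ".join(html.escape(tok) for tok in new_tokens[j1:j2])
        if tag in ("insert", "replace"):
            pieces.append('<span class="{}">{}</span>'.format(css_class, chunk))
        elif tag == "equal":
            pieces.append(chunk)
        # "delete" opcodes contribute nothing: the text is gone from `new`.
    return " ".join(pieces)


before = "STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN | DEFAULT }"
after = "STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN | FOOBAR | DEFAULT }"
print(mark_insertions(before, after, "vXX"))
```

Note that this only marks one pair of versions. Producing the nested markup shown above would mean repeating the comparison for every version pair and merging the results, which is where the effort grows.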

> So, would this change on all doc pages be relevant ?
> If you agree with me than we can think if a tool to convert is needed or
> just a search replace is fine.

I recommend looking into [2] as a proof-of-concept.

[1] https://services.w3.org/htmldiff
[2] https://github.com/w3c/htmldiff-ui
[3] https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fwww.postgresql.org%2Fdocs%2F15%2Fsql-createtable.html&doc2=https%3A%2F%2Fwww.postgresql.org%2Fdocs%2F16%2Fsql-createtable.html

-- 
Erik



Re: Diff of this page with other version

From
Tom Lane
Date:
Erik Wienhold <ewie@ewie.name> writes:
> I think you underestimate the effort because the changes are not always
> as clear (adding one line) as for copy.sgml as you've shown above.  For
> example, if I check create_table.sgml between v15 and v16, I see that
> the addition of STORAGE modified an existing line.  So that regex has to
> match and wrap just the relevant substring.  When new storage modes are
> added we end up with nested version info in order to compare vXX with
> pre v16:

Yeah.  I think the chances of getting people to do this in the .sgml
files are precisely zero.  What might have a chance of happening is
to provide a way on the website of running htmldiff or similar tool
over two versions of a doc page on-the-fly and show the results.

            regards, tom lane
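
As a rough illustration of the on-the-fly approach, the website would only need to build a link to a diff service for the page being viewed. The Python sketch below assumes the W3C htmldiff service and the doc1/doc2 query parameters seen in the link Erik posted earlier in this thread; the function name `htmldiff_url` is made up for the example, and nothing here checks that the service accepts the request.

```python
from urllib.parse import urlencode

# Service endpoint and docs URL layout as they appear in the link posted
# earlier in the thread; treat both as assumptions, not a documented API.
HTMLDIFF_SERVICE = "https://services.w3.org/htmldiff"
DOCS_BASE = "https://www.postgresql.org/docs"


def htmldiff_url(page, old_version, new_version):
    """Build a link that asks the diff service to compare two versions of a page."""
    params = {
        "doc1": "{}/{}/{}".format(DOCS_BASE, old_version, page),
        "doc2": "{}/{}/{}".format(DOCS_BASE, new_version, page),
    }
    # urlencode percent-encodes ":" and "/" inside the parameter values.
    return "{}?{}".format(HTMLDIFF_SERVICE, urlencode(params))


print(htmldiff_url("sql-createtable.html", "15", "16"))
```

With this approach nothing changes in the .sgml sources; the comparison is produced from the rendered pages on request.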



Re: Diff of this page with other version

From
Jimmy Angelakos
Date:

On 02/03/2024 17:33, Tom Lane wrote:

> Erik Wienhold <ewie@ewie.name> writes:
>> I think you underestimate the effort because the changes are not always
>> as clear (adding one line) as for copy.sgml as you've shown above.  For
>> example, if I check create_table.sgml between v15 and v16, I see that
>> the addition of STORAGE modified an existing line.  So that regex has to
>> match and wrap just the relevant substring.  When new storage modes are
>> added we end up with nested version info in order to compare vXX with
>> pre v16:
>
> Yeah.  I think the chances of getting people to do this in the .sgml
> files are precisely zero.  What might have a chance of happening is
> to provide a way on the website of running htmldiff or similar tool
> over two versions of a doc page on-the-fly and show the results.
>
>             regards, tom lane

I agree with Tom, I think automation of this process is the way to go here.

Best regards,
Jimmy

Re: Diff of this page with other version

From
Marcos Pegoraro
Date:
I do agree, automation is the way. But automation of what: SGML or HTML?
I think it is better to have an automated way to insert some tags into the SGML,
then verify manually that those steps were done correctly, and then generate
all the HTML, instead of running an on-the-fly htmldiff.

Let me explain why an automated HTML diff might not work properly.

1 - Only text was added. Easy to solve: paint the whole line.
PLPGSQL.SGML - Version 16:
<replaceable>variable</replaceable>%TYPE
Version devel:
<replaceable>name</replaceable> <replaceable>table</replaceable>.<replaceable>column</replaceable>%TYPE
<replaceable>name</replaceable> <replaceable>variable</replaceable>%TYPE

2 - Some text was changed only to improve the wording. Will the diff then show the
entire paragraph, or go word by word? The change can touch
10 or 20 words spread over 5 lines, and it will not be pretty to have a word
with style, followed by another without, followed by another with style
again, for several lines.
ALTER_TABLE.SGML - Version 16:
        <command>CREATE INDEX CONCURRENTLY</command>, and then install it as an
       official constraint using this syntax.  See the example below.
Version devel:
       <command>CREATE UNIQUE INDEX CONCURRENTLY</command>, and then convert it to a
       constraint using this syntax.  See the example below.

3 - Some text changed only an internal tag, here literal replaced by replaceable.
MERGE.SGML - Version 16:
      <literal>data_source</literal> row.
      If used in a <literal>WHEN NOT MATCHED</literal> clause, the
      expression can use values from the <literal>data_source</literal>.
Version devel:
      <replaceable>data_source</replaceable> row.
      If used in a <literal>WHEN NOT MATCHED</literal> clause, the
      expression can use values from the <replaceable>data_source</replaceable>.

But sometimes it would be better if the committer could choose how best
to emphasise the text. Here, should the diff highlight only the word superuser, or
the entire paragraph?
USER-MANAGE.SGML -  Version 16:
Also note that, because this automatic
   grant is granted by the bootstrap user, it cannot be removed or changed by
   the <literal>CREATEROLE</literal> user;
Version devel:
Also note that, because this automatic
   grant is granted by the bootstrap superuser, it cannot be removed or changed by
   the <literal>CREATEROLE</literal> user;
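
One possible way to automate the choice between highlighting single words and highlighting the whole paragraph is a simple threshold on how much of the text changed. The Python sketch below is only a heuristic made up for this discussion: the function names and the 20% cut-off are assumptions, not something any existing tool is known to do.

```python
import difflib


def changed_words(old, new):
    """Return the words of `new` that were inserted or replaced relative to `old`."""
    old_words = old.split()
    new_words = new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    words = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            words.extend(new_words[j1:j2])
    return words


def highlight_scope(old, new, max_fraction=0.2):
    """Pick "words" when few words changed, otherwise "paragraph".

    `max_fraction` is an arbitrary cut-off chosen for this illustration.
    """
    total = len(new.split())
    if total == 0:
        return "paragraph"
    fraction = len(changed_words(old, new)) / total
    return "words" if fraction <= max_fraction else "paragraph"


V16_TEXT = (
    "Also note that, because this automatic grant is granted by the "
    "bootstrap user, it cannot be removed or changed by the "
    "<literal>CREATEROLE</literal> user;"
)
DEVEL_TEXT = V16_TEXT.replace("bootstrap user,", "bootstrap superuser,")

print(changed_words(V16_TEXT, DEVEL_TEXT), highlight_scope(V16_TEXT, DEVEL_TEXT))
```

A heuristic like this cannot know which presentation a human would prefer; it only avoids the alternating styled and unstyled words described in case 2 when a paragraph was heavily reworded.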

Some are a mix of new and changed text.
XINDEX.SGML - Version 16:
   GiST indexes have eleven support functions, six of which are optional,
Version devel:
   GiST indexes have twelve support functions, seven of which are optional,
   ...
      <row>
       <entry><function>stratnum</function></entry>
       <entry>translate well-known strategy numbers to ones
        used by the operator class (optional)</entry>
       <entry>12</entry>
      </row>

Some others removed text. Would the tool then add a blank red line to show the diff?
VACUUM.SGML - version 16:
   When the option list is surrounded by parentheses, the options can be
   written in any order.  Without parentheses, options must be specified
   in exactly the order shown above.
version devel:

Some changed only an href.
ALTER_PUBLICATION.SGML - version 16:
   <link linkend="sql-createpublication-for-table"><literal>FOR TABLE</literal></link>/
version devel:
   <link linkend="sql-createpublication-params-for-table"><literal>FOR TABLE</literal></link>/

Some changed only an item number in the rendered HTML.
functions-json.html - Version 15:
Table 9.49. jsonpath Operators and Methods
Version 16:
Table 9.50. jsonpath Operators and Methods
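
Renumbering noise of this kind could be filtered out before diffing the rendered pages. The Python sketch below is a minimal example under the assumption that captions always look like "Table 9.49." as in the lines above; the function name `normalize_table_numbers` and the placeholder "N" are made up for the illustration.

```python
import re

# Matches a caption label such as "Table 9.49." (pattern assumed from the
# example above; other caption styles would need their own patterns).
_TABLE_NUMBER = re.compile(r"\bTable\s+\d+(?:\.\d+)*\.")


def normalize_table_numbers(text):
    """Replace generated table numbers with a fixed placeholder before diffing."""
    return _TABLE_NUMBER.sub("Table N.", text)


v15_caption = "Table 9.49. jsonpath Operators and Methods"
v16_caption = "Table 9.50. jsonpath Operators and Methods"

# After normalisation the two captions compare equal, so a diff tool
# would no longer report the renumbering as a change.
print(normalize_table_numbers(v15_caption) == normalize_table_numbers(v16_caption))
```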

So, I think the better way is to have a tool that shows the diffs and to give the committer
the responsibility of choosing the best presentation.
I know this is a huge job, but once it is done, the only thing needed is that all
new commits include this additional DIFF step.
We could consider only supported versions and start from the oldest. All diffs found in
version 12 should be in version 13 too. All diffs from 12 and 13 should be in 14, and so on.
Then, if you compare any file with any of its previous supported versions, it would work.

regards
Marcos


On Sat, 2 Mar 2024 at 17:28, Jimmy Angelakos <vyruss@hellug.gr> wrote:

> On 02/03/2024 17:33, Tom Lane wrote:
>
>> Erik Wienhold <ewie@ewie.name> writes:
>>> I think you underestimate the effort because the changes are not always
>>> as clear (adding one line) as for copy.sgml as you've shown above.  For
>>> example, if I check create_table.sgml between v15 and v16, I see that
>>> the addition of STORAGE modified an existing line.  So that regex has to
>>> match and wrap just the relevant substring.  When new storage modes are
>>> added we end up with nested version info in order to compare vXX with
>>> pre v16:
>>
>> Yeah.  I think the chances of getting people to do this in the .sgml
>> files are precisely zero.  What might have a chance of happening is
>> to provide a way on the website of running htmldiff or similar tool
>> over two versions of a doc page on-the-fly and show the results.
>>
>>             regards, tom lane
>
> I agree with Tom, I think automation of this process is the way to go here.
>
> Best regards,
> Jimmy

Re: Diff of this page with other version

From
Daniel Gustafsson
Date:
> On 2 Mar 2024, at 15:44, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> I think your proposal is a reasonable idea and a very convenient service
> for users ... but there's zero chance that committers are going to
> accept the additional work and the resulting uglification of the
> document source.
> 
> If it can be done by post-processing the XML and finding the
> differences, to automatically insert some markup that lets the UI show
> the differences as you suggest, then we can discuss ways to integrate
> that.

I agree with all of the above.

--
Daniel Gustafsson




Re: Diff of this page with other version

From
Marcos Pegoraro
Date:
On Sun, 3 Mar 2024 at 17:26, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

> I think your proposal is a reasonable idea and a very convenient service
> for users ... but there's zero chance that committers are going to
> accept the additional work and the resulting uglification of the
> document source.
>
> If it can be done by post-processing the XML and finding the
> differences, to automatically insert some markup that lets the UI show
> the differences as you suggest, then we can discuss ways to integrate
> that.

I understand your point, but I cannot imagine a tool that handles all the cases I've detailed.
As for uglification of the document source, what I propose is just another internal tag
for each version. The same way you use <command>, <literal>, etc., you would use <v15>,
<v16> and so on, just that. So I don't think it is an uglification of the document source.

Version 15:
  <para>
   You must own the schema to use <command>ALTER SCHEMA</command>.
   To rename a schema you must also have the
   <literal>CREATE</literal> privilege for the database.
   To alter the owner, you must also be a direct or
   indirect member of the new owning role, and you must have the
   <literal>CREATE</literal> privilege for the database.
   (Note that superusers have all these privileges automatically.)
  </para>
 </refsect1>
Version 16:
  <para><v16>
   You must own the schema to use <command>ALTER SCHEMA</command>.
   To rename a schema you must also have the
   <literal>CREATE</literal> privilege for the database.
   To alter the owner, you must be able to <literal>SET ROLE</literal> to the
   new owning role, and that role must have the
   <literal>CREATE</literal> privilege for the database.
   (Note that superusers have all these privileges automatically.)
  </v16></para>
 </refsect1>

Re: Diff of this page with other version

From
Daniel Gustafsson
Date:
> On 4 Mar 2024, at 14:11, Marcos Pegoraro <marcos@f10.com.br> wrote:

> I understand your point but cannot imagine a tool that does all those points I've detailed.

Googling for "diff two webpages", taking the first hit and plugging in the
CREATE TABLE page you used as an example gave me a comparison which pulled out
the STORAGE changes very clearly.  It didn't look exactly like your mockups,
but it did the job.

> And about uglification of document source, what I propose is just another internal tag
> for each version. same way you use <command>, <literal>, etc, you would use <v15>,
> <v16> and so on, just that. So I don't think it is a uglification of document source.

Adding version-specific tags to the docs seems like quite the backpatching
hazard, which makes this a tough sell.

--
Daniel Gustafsson


Re: Diff of this page with other version

From
"Euler Taveira"
Date:
On Mon, Mar 4, 2024, at 10:31 AM, Daniel Gustafsson wrote:
>> On 4 Mar 2024, at 14:11, Marcos Pegoraro <marcos@f10.com.br> wrote:
>>
>> I understand your point but cannot imagine a tool that does all those points I've detailed.
>
> Googling for "diff two webpages", taking the first hit and plugging in the
> CREATE TABLE page you used as an example gave me a comparison which pulled out
> the STORAGE changes very clearly.  It didn't look exactly like your mockups,
> but it did the job.

There are also tools like pgPedia [1] that provide a command / function
history. Unless there is a way to automate this request as Alvaro said, it
would impose extra work by adding tags to mark that a feature adds a keyword
and/or value. What happens if a deprecated keyword / value is removed? The
synopsis content can be rearranged too. I have a gut feeling that some work
would have to be done manually.



--
Euler Taveira

Re: Diff of this page with other version

From
Tom Lane
Date:
"Euler Taveira" <euler@eulerto.com> writes:
> There is also tools like pgPedia [1] that provides a command / function
> history. Unless there is a way to automate this request as Alvaro said, it
> would impose extra work by adding tags to mark that a feature adds a keyword
> and/or value. What happen if a deprecated keyword / value was removed? The
> synopsis content can be rearranged too. I have a gut feeling that some work
> should be done manually.

There's no doubt that we could get *better* results with the addition
of manual effort.  But even htmldiff as it stands today produces
*usable* results (I tried it on a couple of versions of the COPY
reference page).  Given that we've not offered something like this
at all before, I think it's silly to put in a lot of up-front effort
in advance of seeing what the demand really is.

In any case, I concur with the other committers that there is simply
no way we're going to accept manual change markup as an additional
expectation for documentation commits.  It's hard enough already.

            regards, tom lane



Re: Diff of this page with other version

From
Marcos Pegoraro
Date:
On Mon, 4 Mar 2024 at 10:32, Daniel Gustafsson <daniel@yesql.se> wrote:

> Googling for "diff two webpages", taking the first hit and plugging in the
> CREATE TABLE page you used as an example gave me a comparison which pulled out
> the STORAGE changes very clearly.  It didn't look exactly like your mockups,
> but it did the job.

You probably got a page with minimal changes.
Try the JSON functions page and compare devel with version 12, and you'll see a mess.

Obviously I would love to have a small tool with a huge benefit,
but I really doubt it's possible without manual changes.

regards
Marcos

Re: Diff of this page with other version

From
Tom Lane
Date:
Marcos Pegoraro <marcos@f10.com.br> writes:
> Probably you got a page with minimal changes.
> Try json functions and compare devel with version 12 and you'll see a mess

[ shrug... ]  No diff extending across commit e894c6183 and followups
is going to produce anything terribly helpful ... or if you think that
is possible, please show how you would have marked it up.

            regards, tom lane



Re: Diff of this page with other version

From
Marcos Pegoraro
Date:
On Mon, 4 Mar 2024 at 12:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Given that we've not offered something like this
> at all before, I think it's silly to put in a lot of up-front effort
> in advance of seeing what the demand really is.

Well, the very fact that I brought several committers into this discussion
is proof of how beneficial this feature would be.
 
> In any case, I concur with the other committers that there is simply
> no way we're going to accept manual change markup as an additional
> expectation for documentation commits.  It's hard enough already.

I understand that every change to the docs would have this additional step,
but I don't think it is a huge step. And talking about additional steps,
compare how many tags exist in the release notes of version 8.0 and
the release notes of version 16.0.


Re: Diff of this page with other version

From
Tom Lane
Date:
Marcos Pegoraro <marcos@f10.com.br> writes:
> On Mon, 4 Mar 2024 at 12:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Given that we've not offered something like this
>> at all before, I think it's silly to put in a lot of up-front effort
>> in advance of seeing what the demand really is.

> Well, only saying that I brought to this discussion several committers
> is proof of how beneficial this feature would be.

Nonsense.  The committers who have commented have all said that it'd
be a totally unacceptable amount of work for dubious gain.

            regards, tom lane