Discussion: Differential Code Coverage report for Postgres


Differential Code Coverage report for Postgres

From
Nazir Bilal Yavuz
Date:
Hi,

I have been working on generating differential code coverage for
Postgres and was able to do so with this script [1]. The script checks
out HEAD and the latest release branch (currently REL_18_STABLE), then
generates a differential coverage report.

I also set up a GitHub Action so the report is updated daily and
published as HTML here [2].

I thought this might be useful to share, and I would be happy to hear
any feedback or suggestions.

CC’ing Álvaro since he had asked about this previously.

[1] https://github.com/nbyavuz/postgres-code-coverage/blob/main/code_coverage.sh
[2] https://nbyavuz.github.io/postgres-code-coverage

--
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Differential Code Coverage report for Postgres

From
Oreo Yang
Date:
It looks very cool.
So is our goal over 90%?

Thanks,
OreoYang

From: Nazir Bilal Yavuz <byavuz81@gmail.com>
Sent: Friday, September 5, 2025 15:09
To: PostgreSQL Hackers <pgsql-hackers@lists.postgresql.org>
Cc: Andres Freund <andres@anarazel.de>; Álvaro Herrera <alvherre@kurilemu.de>
Subject: Differential Code Coverage report for Postgres



Re: Differential Code Coverage report for Postgres

From
Nazir Bilal Yavuz
Date:
Hi,

On Fri, 5 Sept 2025 at 13:39, Oreo Yang <oreo.yang@hotmail.com> wrote:
>
> It looks very cool.

Thanks!

> So is our goal over 90%?

I am not sure about that, but of course the higher the better.

-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Differential Code Coverage report for Postgres

From
Jacob Champion
Date:
On Fri, Sep 5, 2025 at 12:09 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> I have been working on generating differential code coverage for
> Postgres and was able to do so with this script [1]. The script checks
> out HEAD and the latest release branch (currently REL_18_STABLE), then
> generates a differential coverage report.

This is fantastic timing! Differential coverage will be incredibly
useful to have for some upcoming test patches I am writing. :) I will
take a look.

I think LCOV's display table is a bit confusing, especially in how
they've chosen to include unchanged code in their percentage count for
the differential, but I suppose I'll get used to it.

On Fri, Sep 5, 2025 at 3:39 AM Oreo Yang <oreo.yang@hotmail.com> wrote:
> It looks very cool.
> So is our goal over 90%?

I think our goal should be to pin behavior that needs to be pinned,
and use coverage as a helpful indicator as to what is missing.

(There are many ways to write low-quality tests that cover a lot of
code. I don't think we should be chasing numbers; we should be saying
"oh, 25% in this section is really bad" and then remedying that. If
you do that over and over again, you get really high coverage numbers
organically as a result of your higher-quality tests.)

Thanks,
--Jacob



Re: Differential Code Coverage report for Postgres

From
Andres Freund
Date:
Hi,

On 2025-09-05 10:09:27 +0300, Nazir Bilal Yavuz wrote:
> I have been working on generating differential code coverage for
> Postgres and was able to do so with this script [1]. The script checks
> out HEAD and the latest release branch (currently REL_18_STABLE), then
> generates a differential coverage report.

I wonder if it'd be better to compare against the merge base of master and
REL_18_STABLE, rather than against REL_18_STABLE itself [1].


> I also set up a GitHub Action so the report is updated daily and
> published as HTML here [2].

Nice!

How hard would it be to compare not just REL_18_STABLE and master, but also
REL_17_STABLE and REL_18_STABLE?

Greetings,

Andres Freund

[1] git merge-base master REL_18_STABLE
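
As a rough illustration of the footnote above, the merge base could be looked up from a script along these lines. This is only a sketch: the helper names `merge_base_cmd` and `merge_base` are made up for illustration and are not taken from the script under discussion. The only real command involved is `git merge-base`.

```python
import subprocess


def merge_base_cmd(ref_a: str, ref_b: str) -> list[str]:
    """Build the git invocation from the footnote (hypothetical helper)."""
    return ["git", "merge-base", ref_a, ref_b]


def merge_base(repo_dir: str, ref_a: str = "master",
               ref_b: str = "REL_18_STABLE") -> str:
    """Return the hash of the commit where the two branches diverged.

    Assumes repo_dir is a Postgres checkout that has both refs available.
    """
    result = subprocess.run(
        merge_base_cmd(ref_a, ref_b),
        cwd=repo_dir,
        check=True,            # raise if git reports an error
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

Calling `merge_base("/path/to/postgres")` would then give the baseline commit to check out instead of the tip of REL_18_STABLE, so that back-patched commits do not show up as differences.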



Re: Differential Code Coverage report for Postgres

From
Nazir Bilal Yavuz
Date:
Hi,

On Fri, 5 Sept 2025 at 18:14, Andres Freund <andres@anarazel.de> wrote:
>
> On 2025-09-05 10:09:27 +0300, Nazir Bilal Yavuz wrote:
> > I have been working on generating differential code coverage for
> > Postgres and was able to do so with this script [1]. The script checks
> > out HEAD and the latest release branch (currently REL_18_STABLE), then
> > generates a differential coverage report.
>
> I wonder if it'd be better to compare against the merge base of master and
> REL_18_STABLE, rather than against REL_18_STABLE itself [1].

This looks interesting; I did not know about that. I will check it out.

> > I also set up a GitHub Action so the report is updated daily and
> > published as HTML here [2].
>
> Nice!
>
> How hard would it be to compare not just REL_18_STABLE and master, but also
> REL_17_STABLE and REL_18_STABLE?

There are two things:

1- One GitHub Actions run currently takes ~50 minutes, and since it
runs daily, that adds up to ~1500 minutes per month. Counting manual
triggers and failed runs, the total is higher than that. GitHub
allows 2000 free minutes of GitHub Actions usage per month. So, if
we increase the run time (by generating another report), we may
exceed the free usage limit. Right now, I install the Postgres
dependencies in each task; I will work on that to reduce the run time.

2- If we want to show both reports on the same page, it may
require a bit of HTML coding. I have no experience with that, but I do
not think it will be hard.
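
To make the arithmetic in point 1 concrete, here is a small sketch of the monthly budget. The 50-minute run time and the 2000-minute limit are the figures quoted above; the 30-day month and the idea that a second report would take as long as the first are assumptions made only for illustration.

```python
MINUTES_PER_RUN = 50            # approximate run time quoted above
FREE_MINUTES_PER_MONTH = 2000   # free-tier limit quoted above
DAYS_PER_MONTH = 30             # assumption: one scheduled run per day


def monthly_minutes(reports_per_day: int,
                    minutes_per_run: int = MINUTES_PER_RUN,
                    days: int = DAYS_PER_MONTH) -> int:
    """Minutes used per month, assuming every report costs the same."""
    return reports_per_day * minutes_per_run * days


def headroom(reports_per_day: int) -> int:
    """Free minutes left over; negative means the limit is exceeded."""
    return FREE_MINUTES_PER_MONTH - monthly_minutes(reports_per_day)


print(monthly_minutes(1), headroom(1))  # one report per day
print(monthly_minutes(2), headroom(2))  # hypothetical second report
```

Under these assumptions a single daily report leaves some headroom for manual triggers and failed runs, while a second report of the same cost would not fit unless the run time comes down.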

-- 
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Differential Code Coverage report for Postgres

From
Nazir Bilal Yavuz
Date:
Hi,

On Fri, 5 Sept 2025 at 18:14, Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
>
> On Fri, Sep 5, 2025 at 12:09 AM Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> > I have been working on generating differential code coverage for
> > Postgres and was able to do so with this script [1]. The script checks
> > out HEAD and the latest release branch (currently REL_18_STABLE), then
> > generates a differential coverage report.
>
> This is fantastic timing! Differential coverage will be incredibly
> useful to have for some upcoming test patches I am writing. :) I will
> take a look.

I hope it helps! I updated the README in the repository to show how
you can run it locally.

--
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Differential Code Coverage report for Postgres

From
Álvaro Herrera
Date:
Thanks for working on this!

On 2025-Sep-05, Nazir Bilal Yavuz wrote:

> 1- One GitHub Actions run currently takes ~50 minutes, and since it
> runs daily, that adds up to ~1500 minutes per month. Counting manual
> triggers and failed runs, the total is higher than that. GitHub
> allows 2000 free minutes of GitHub Actions usage per month. So, if
> we increase the run time (by generating another report), we may
> exceed the free usage limit. Right now, I install the Postgres
> dependencies in each task; I will work on that to reduce the run time.

I think the goal should be to run this on the pginfra machines.  I
wasn't really thinking about doing this until pginfra upgraded to Debian
trixie, because that would have the lcov version we need; but since you
also seem to be cloning lcov, maybe we do that also in pginfra and thus
we could do it right away.  Such a machine can use all the CPU time it
needs.  (In fact, in a totally overkill approach, we currently run the
report every four hours or something like that.  It's easy to run once
or twice daily and run all branches instead.)

> 2- If we want to show both reports on the same page, it may
> require a bit of HTML coding. I have no experience with that, but I do
> not think it will be hard.

I think it's enough to have multiple reports available and a link to
each.

I see that the report shows 7, 30, 360 days of change.  I wonder how
that works, and how we can best make use of that.  Are you supposed to
run the tests every day and then run the diff with the tests from one
week, one month, one year ago?   Or is the script running the test for
7, 30, 360 days ago every time and comparing those with the current one?

Maybe what we want is not "x days ago" but instead compare current
branch HEAD with each previous minor release (up to the merge-base with
master).  Does genhtml let you do that?  If not, "X days ago" is
probably good enough.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: Differential Code Coverage report for Postgres

From
Nazir Bilal Yavuz
Date:
Hi,

On Fri, 5 Sept 2025 at 22:14, Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> Thanks for working on this!
>
> On 2025-Sep-05, Nazir Bilal Yavuz wrote:
>
> > 1- One GitHub Actions run currently takes ~50 minutes, and since it
> > runs daily, that adds up to ~1500 minutes per month. Counting manual
> > triggers and failed runs, the total is higher than that. GitHub
> > allows 2000 free minutes of GitHub Actions usage per month. So, if
> > we increase the run time (by generating another report), we may
> > exceed the free usage limit. Right now, I install the Postgres
> > dependencies in each task; I will work on that to reduce the run time.
>
> I think the goal should be to run this on the pginfra machines.  I
> wasn't really thinking about doing this until pginfra upgraded to Debian
> trixie, because that would have the lcov version we need; but since you
> also seem to be cloning lcov, maybe we do that also in pginfra and thus
> we could do it right away.  Such a machine can use all the CPU time it
> needs.  (In fact, in a totally overkill approach, we currently run the
> report every four hours or something like that.  It's easy to run once
> or twice daily and run all branches instead.)

I agree with this. I cloned lcov for the reason you mentioned above:
so that it works on more machines.

>
> > 2- If we want to show both reports on the same page, it may
> > require a bit of HTML coding. I have no experience with that, but I do
> > not think it will be hard.
>
> I think it's enough to have multiple reports available and a link to
> each.
>
> I see that the report shows 7, 30, 360 days of change.  I wonder how
> that works, and how we can best make use of that.  Are you supposed to
> run the tests every day and then run the diff with the tests from one
> week, one month, one year ago?   Or is the script running the test for
> 7, 30, 360 days ago every time and comparing those with the current one?

I think it works like this: we run the tests once for master and
REL_18_STABLE; then, for each commit, genhtml calculates `current
commit's date (the latest commit on master) - that commit's date` and
puts the commit in the matching date bin. genhtml does this
automatically. It accepts the '--date-bins' option, which the script
sets to '--date-bins 1,7,30,360'.
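
The binning described above can be pictured roughly as follows. This is a sketch of the idea only, not genhtml's actual implementation: the function names are hypothetical, and the exact treatment of ages that fall right on a cutoff is an assumption.

```python
from bisect import bisect_left
from datetime import date

DATE_BINS = [1, 7, 30, 360]  # cutoffs in days, as passed via --date-bins


def bin_label(index: int, cutoffs: list[int]) -> str:
    """Human-readable label for the bin at the given position."""
    if index == 0:
        return f"[..{cutoffs[0]}] days"
    if index == len(cutoffs):
        return f"({cutoffs[-1]}..] days"
    return f"({cutoffs[index - 1]}..{cutoffs[index]}] days"


def date_bin(newest: date, commit_day: date,
             cutoffs: list[int] = DATE_BINS) -> str:
    """Bin a commit by its age relative to the newest commit.

    Assumption: an age equal to a cutoff lands in the younger bin.
    """
    age_days = (newest - commit_day).days
    return bin_label(bisect_left(cutoffs, age_days), cutoffs)


head = date(2025, 9, 5)  # stand-in for the date of the latest commit on master
for day in (date(2025, 9, 5), date(2025, 9, 2), date(2025, 8, 16), date(2024, 1, 1)):
    print(day, "->", date_bin(head, day))
```

Four cutoffs give five bins, and nothing has to be re-run for older dates: the ages come from commit dates, not from historical test runs.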

> Maybe what we want is not "x days ago" but instead compare current
> branch HEAD with each previous minor release (up to the merge-base with
> master).  Does genhtml let you do that?  If not, "X days ago" is
> probably good enough.

I do not think so. If we want to do that, I think we need to run the
tests and generate lcov reports for each minor release, then run genhtml
on each minor release's lcov files together with master's lcov file.
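
A minimal sketch of what "one genhtml run per minor release" might look like, assuming the coverage data and diffs have already been produced. All file names, directory names and tag names below are hypothetical placeholders; `--baseline-file`, `--diff-file` and `--output-directory` are the genhtml options for differential reports in recent lcov versions, and a real invocation would likely need further options.

```python
def genhtml_cmd(baseline_info: str, current_info: str,
                diff_file: str, out_dir: str) -> list[str]:
    """Build one differential genhtml invocation (not executed here)."""
    return [
        "genhtml",
        "--baseline-file", baseline_info,   # coverage data of the older release
        "--diff-file", diff_file,           # unified diff: baseline -> current
        "--output-directory", out_dir,
        current_info,                       # coverage data of the newer code
    ]


def plan_reports(releases: list[str],
                 current_info: str = "master.info") -> list[list[str]]:
    """One command per minor release; paths are illustrative only."""
    return [
        genhtml_cmd(f"{rel}.info", current_info,
                    f"{rel}..master.diff", f"report/{rel}")
        for rel in releases
    ]


for cmd in plan_reports(["REL_18_0", "REL_18_1"]):  # placeholder tag names
    print(" ".join(cmd))
```

Each release still needs its own test run and lcov capture first, which is where most of the time would go.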

--
Regards,
Nazir Bilal Yavuz
Microsoft