Re: Coccinelle for PostgreSQL development [1/N]: coccicheck.py

Поиск
Список
Период
Сортировка
От Mats Kindahl
Тема Re: Coccinelle for PostgreSQL development [1/N]: coccicheck.py
Дата
Msg-id fe34a709-1a5a-449c-8da6-3189d1ac445c@gmail.com
обсуждение исходный текст
Ответ на Re: Coccinelle for PostgreSQL development [1/N]: coccicheck.py  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Список pgsql-hackers
On 11/3/25 10:55, Bertrand Drouvot wrote:
Hi,

On Thu, Oct 30, 2025 at 02:32:45PM +0100, Mats Kindahl wrote:
Hi all,

Here is an updated set of patches based on the latest HEAD of PostgreSQL.
Thanks for those patches and the initiative! Sorry to be late, I started
to look at Coccinelle "for PostgreSQL" in [1] (to ensure some macros are used)
and saw this thread.

Hi Bertrand and thank you for the comments.

I did not look at the patches in detail, but sharing some thoughts.

I agree that Coccinelle could be/is useful in relation to PostgreSQL development,
but I think that we'd need to determine why it would useful to add the
coccicheck.py script and the new dependencies in autoconf/meson.

Coming back to your points and thinking if it could be used as a tool or integrated
into the build system:

1) Identify and correct bugs in the source code both during development and
review

I agree that makes sense here. Having some bugs detected "automatically" would
be great.

2) Make large-scale changes to the source tree to improve the code based on
new insights

I'm not sure we need to introduce coccicheck.py and add dependencies in
autoconf/meson for this. The developer would need to know that he could use
the Coccinelle and if he already knows then nothing prevents him from using it
in his development tool box.

3) Encode and enforce APIs by ensuring that function calls are used
correctly

Same as 2) That said we could also imagine running yearly checks automatically
too using coccicheck.py.

4) Use improved coding patterns for more efficient code

Same as 3) from my point of view.

5) Allow extensions to automatically update code for later PostgreSQL
versions

Same as 2) from my point of view.

So, I think that the current proposal (i.e build system integration) is a good
fit for 1), less so for 3) and 4) and not necessarily needed for 2) and 5).

The proposal will add new dependencies (as Michael stated up-thread) and introduce
a new language (SmPL) that folks would need to be comfortable with to review
the .cocci scripts.

I don't have an answer to it but I think that the main question is: Should we
integrate this into the build system, or just document it as an optional
developer tool (wiki or such and provide .cocci scripts example)?

All of these are good points. My main reason for proposing this is that I think that using Coccinelle more systematically would improve the quality of the code and make it easier to maintain. The build system changes and the coccicheck.py script is a proposal for how to make this happen, but I realize there are alternative ways as well as tradeoffs to be made (e.g., is the extra maintenance burden worth the improvement in quality?)

I think there are three issues here:

  1. Shall the semantic patches be in the source tree?
  2. Shall we use the coccicheck.py script?
  3. Shall we support this in the build system, that is, meson and automake?

Putting the coccicheck.py script aside for a short while, I think there is value in keeping the semantic patches in the source tree rather than adding them to a wiki for the following reasons:

It makes it easy for reviewers to check that the code reviewed follows "best practices" (of course, assuming that the .cocci scripts represent "best practices") instead of having to download and run these scripts from another source. Having them on a separate page would reduce the value since it makes it more difficult to use.

If they are part of the repository the .cocci scripts will be maintained and updated to match the code in the source tree. If they are on a separate wiki page, or a different repository, they are likely to be version dependent and code might change so that they are not relevant any more, which makes it more difficult to use them for reviewing and checking code since that would also require updating them to match the new code before applying them.

A drawback is that it is an extra maintenance burden (probably small, but extra work nevertheless) and would require the semantic patches to be executed regularly, by the build system or by reviewers, and check that they do not report anything. If these checks are not done regularly, it is possible that the scripts would become obsolete after a while. By making it easy to run the scripts, we are more likely to run them and discover issues in the scripts as well as in the code.

About integrating it with the build system. Note that there are three modes: report, context, and (generate) patch. The important modes are report and patch. 

The main advantage of integrating it with the build system is is that it is easy to run, which makes it easy to ensure that the semantic patches do not report anything for both reviewers, developers, and build systems. This will improve the quality of both the semantic patches and the source code.

As you say, it does add an extra dependency, but this should optional and only be used if spatch is installed. For parties not interested in using spatch, hence does not have spatch installed, it will not be used so it should not interfere with them.

Integrating it with the build system would also allow builders and reviewers to use the same method for checking the code using report mode without regard to what version is being used, you would run the checks the same way for each minor release, for example.

As you say, coccicheck.py script is strictly speaking not needed to use Coccinelle and is more intended as a convenience. The advantages of using something like the coccicheck.py script is that:

The coccicheck.py script reads options from the semantic patches and uses them when running spatch. If using spatch directly, you would have to check the scripts, extract the options, and then use them when running spatch. Not impossible, but an extra step that makes it more inconvenient to use Coccinelle, which in turn makes it less likely to be used and maintained.

You could automate the extraction of options from the semantic patches, but this script provides a platform-independent method to run the checks for reviewers, developers, and build systems. Note that different semantic patches can use different options, so if you want to automate this, you would need to write something like coccicheck.py anyway.

Best wishes,
Mats Kindahl


[1]: https://www.postgresql.org/message-id/aQMtR/m4kW4Rkul%2B%40ip-10-97-1-34.eu-west-3.compute.internal

Regards,

В списке pgsql-hackers по дате отправления: