Обсуждение: Re: IWYU annotations
Peter Eisentraut <peter@eisentraut.org> writes: > I have done a pass over much of the source code with > include-what-you-use (IWYU) to remove superfluous includes (commits > dbbca2cf299, 9be4e5d293b, ecb5af77987). Along the way I have collected > some pragma annotations to deal with exceptions and special cases and > peculiarities of the PostgreSQL source code header structures (see [0] > for description). Here I'm proposing a set of patches to add such > annotations in commonly useful cases that should deal with most of the > noise. This seems to be going in the direction that there will be Yet Another tool that committers have to know everything about in order to not commit bad code. I'm feeling resistant to that, mainly because I'm far from convinced that IWYU brings us enough value to justify everybody having to learn about it. (The fact that the predecessor tool pgrminclude hasn't been used in a dozen years, and nobody seems to care, speaks volumes here.) In particular, this patchset introduces what seem like very error-prone setups, such as in rmgrdesc.c where there's now one group of #include's with "pragma: begin_keep/pragma: end_keep" around it and another group without. Most of us are likely to just blindly stick a new #include into alphabetical order somewhere in there and not notice that there's now an additional concern. The fact that that you've added precisely zero documentation about what these pragmas are doesn't help. regards, tom lane
On 09.12.24 17:37, Tom Lane wrote: > Peter Eisentraut <peter@eisentraut.org> writes: >> I have done a pass over much of the source code with >> include-what-you-use (IWYU) to remove superfluous includes (commits >> dbbca2cf299, 9be4e5d293b, ecb5af77987). Along the way I have collected >> some pragma annotations to deal with exceptions and special cases and >> peculiarities of the PostgreSQL source code header structures (see [0] >> for description). Here I'm proposing a set of patches to add such >> annotations in commonly useful cases that should deal with most of the >> noise. > > This seems to be going in the direction that there will be Yet Another > tool that committers have to know everything about in order to not > commit bad code. I'm feeling resistant to that, mainly because I'm > far from convinced that IWYU brings us enough value to justify > everybody having to learn about it. It's not realistic at the moment for it to be a tool that everyone needs to use and everyone needs to keep 100% clean. We're certainly very far from that being feasible at the moment. I see it useful in two areas: First, when you add new files or move lots of code around, you can have it provide an informed opinion about what includes to keep and add. Because in practice that's mostly a crapshoot nowadays. An example from a current patch under review: $ iwyu_tool.py -p build src/backend/libpq/auth-oauth.c ... ../src/backend/libpq/auth-oauth.c should remove these lines: - #include <fcntl.h> // lines 19-19 - #include <unistd.h> // lines 18-18 - #include "storage/fd.h" // lines 28-28 Second, clangd (the language server) has support for this also, and so depending on local configuration and preferences, it can highlight missing or redundant includes or even add some automatically as you edit. (The latter obviously needs some manual supervision, but it is arguably kind of neat that you don't need to jump to the top and manually add includes as you type in new code that needs an additional header.) But in order for either of this to work, it needs to have some information about basic PostgreSQL code conventions. Otherwise, it will also do this: ../src/backend/libpq/auth-oauth.c should add these lines: ... #include <stdbool.h> // for false, bool, true #include <stddef.h> // for NULL, size_t #include "c.h" // for Assert, explicit_bzero, pg_strncasecmp ... because it doesn't know that the convention is that you are supposed to include "postgres.h" (or one of the other always-first headers) and then everything that it brings it should usually not be included again directly. (Or conversely in some cases it will suggest to remove the include of "postgres.h" because it doesn't provide anything of use.) So right now you get a bunch of noise and misleading information for each file and the whole clangd support is of limited use. That's what my patch set is mainly trying to address. > In particular, this patchset introduces what seem like very > error-prone setups, such as in rmgrdesc.c where there's now one > group of #include's with "pragma: begin_keep/pragma: end_keep" > around it and another group without. Most of us are likely > to just blindly stick a new #include into alphabetical order > somewhere in there and not notice that there's now an additional > concern. The fact that that you've added precisely zero > documentation about what these pragmas are doesn't help. It's a fair point that some documentation could be provided. I suppose we don't want to verbosely explain each pragma individually. Should there be some central explanation, maybe in src/tools/pginclude/README? Note that if you google like "IWYU pragma: export" it will take you to the upstream documentation as the first hit, so the full documentation is pretty easy to find.
Peter Eisentraut <peter@eisentraut.org> writes: > On 09.12.24 17:37, Tom Lane wrote: >> In particular, this patchset introduces what seem like very >> error-prone setups, such as in rmgrdesc.c where there's now one >> group of #include's with "pragma: begin_keep/pragma: end_keep" >> around it and another group without. Most of us are likely >> to just blindly stick a new #include into alphabetical order >> somewhere in there and not notice that there's now an additional >> concern. The fact that that you've added precisely zero >> documentation about what these pragmas are doesn't help. > It's a fair point that some documentation could be provided. I suppose > we don't want to verbosely explain each pragma individually. Should > there be some central explanation, maybe in src/tools/pginclude/README? That might do, but perhaps instead in the "PostgreSQL Coding Conventions" chapter of the SGML docs? Actually, I think we could do with a centralized explanation of our inclusion conventions --- I'm not sure that the whole business of "postgres.h or a sibling must be first" is explained in any easy-to-find place. This topic would likely fit well with such an explanation. But really, the point I was trying to make above is that I don't want this to break our very long-standing convention that headers should be #include'd alphabetically and there is never a need to impose some other order (at least not without lots of commentary about it at the scene of the crime). The way you've done it here is just asking for trouble, IMO. If that means redundant pragma commands, so be it. regards, tom lane
On 10.01.25 09:10, Peter Eisentraut wrote: > On 02.01.25 17:15, Tom Lane wrote: >>> It's a fair point that some documentation could be provided. I suppose >>> we don't want to verbosely explain each pragma individually. Should >>> there be some central explanation, maybe in src/tools/pginclude/README? >> >> That might do, but perhaps instead in the "PostgreSQL Coding >> Conventions" chapter of the SGML docs? Actually, I think we could do >> with a centralized explanation of our inclusion conventions --- I'm >> not sure that the whole business of "postgres.h or a sibling must be >> first" is explained in any easy-to-find place. This topic would >> likely fit well with such an explanation. > > In this updated patch set, I've just added a bit of info into src/tools/ > pginclude/README. Updating the coding convention documentation like you > suggest also sounds like a good idea, but from my perspective I would > need to do further research to pin down what those actually are, so I'm > leaving that as a separate project. > > (See for example the business in src/tools/pginclude/headerscheck about > guessing which the appropriate first header file should be for each > component. That kind of thing perhaps ought to be more formalized.) > >> But really, the point I was trying to make above is that I don't >> want this to break our very long-standing convention that headers >> should be #include'd alphabetically and there is never a need to >> impose some other order (at least not without lots of commentary >> about it at the scene of the crime). The way you've done it here >> is just asking for trouble, IMO. If that means redundant pragma >> commands, so be it. > > Yeah, that's a fair point. I have removed that part from my patch set. I have committed this. Thanks for the feedback.
On Wed, Jan 15, 2025 at 2:21 PM Peter Eisentraut <peter@eisentraut.org> wrote: > I have committed this. Thanks for the feedback. Thanks for working on this. Is it worth documenting how to get clangd working sensibly with "--header-insertion=iwyu"? I enabled it just now. I find that I can now use completion with a type name, and have clangd automatically include the necessary header file. For example, if I use completion to pull in a "StringInfoData" I find that clangd will automatically add the necessary #include "lib/stringinfo.h", without any special directions from me -- no context switching required. Importantly, now that we have these annotations, clangd won't constantly be adding useless includes that actually come from c.h -- that made IWYU header cleaning unusable before now. Note that I'm relying on my clang-tidy config for this. clangd is able to respect our conventions around headers through clang-tidy, which has been taught Postgres-specific rules around header conventions. So we'd actually be documenting something about clang-tidy + clangd + Postgres header conventions, if we went through with something like this. It might be worth holding off for now. It's possible that I'll find --header-insertion=iwyu has big problems in some unforeseen way. But offhand it looks like a real improvement, even though I foresee certain minor downsides. What do you think? -- Peter Geoghegan
On 15.01.25 20:51, Peter Geoghegan wrote: > On Wed, Jan 15, 2025 at 2:21 PM Peter Eisentraut <peter@eisentraut.org> wrote: >> I have committed this. Thanks for the feedback. > > Thanks for working on this. > > Is it worth documenting how to get clangd working sensibly with > "--header-insertion=iwyu"? > > I enabled it just now. I find that I can now use completion with a > type name, and have clangd automatically include the necessary header > file. For example, if I use completion to pull in a "StringInfoData" I > find that clangd will automatically add the necessary #include > "lib/stringinfo.h", without any special directions from me -- no > context switching required. Importantly, now that we have these > annotations, clangd won't constantly be adding useless includes that > actually come from c.h -- that made IWYU header cleaning unusable > before now. > > Note that I'm relying on my clang-tidy config for this. clangd is able > to respect our conventions around headers through clang-tidy, which > has been taught Postgres-specific rules around header conventions. So > we'd actually be documenting something about clang-tidy + clangd + > Postgres header conventions, if we went through with something like > this. > > It might be worth holding off for now. It's possible that I'll find > --header-insertion=iwyu has big problems in some unforeseen way. But > offhand it looks like a real improvement, even though I foresee > certain minor downsides. What do you think? I had turned automatic header insertion off until now and didn't think to turn it back on yet. I'll give it another try sometime. I'm unclear on what role clang-tidy would play in this.
On Wed, Jan 22, 2025 at 9:08 AM Peter Eisentraut <peter@eisentraut.org> wrote: > > It might be worth holding off for now. It's possible that I'll find > > --header-insertion=iwyu has big problems in some unforeseen way. But > > offhand it looks like a real improvement, even though I foresee > > certain minor downsides. What do you think? > > I had turned automatic header insertion off until now and didn't think > to turn it back on yet. I'll give it another try sometime. I'm now a week into this experiment. So far, the results have been mixed. I'm tempted to turn --header-insertion=iwyu off now, since clangd header insertion with completion is in fact still adding what seem to me to be superfluous includes in some cases (though much less so compared to before your recent work). So it's still useful, but it might be more annoying than useful. Particularly compared to my original approach of relying on clangd errors + quick-fix actions to add required headers -- that doesn't have these problems (I seem to only be offered the option of a quick-fix when I get certain compile errors), and requires only minimal effort. > I'm unclear on what role clang-tidy would play in this. Sorry, I meant clang-format. The relevant section of my .clang-format file: SortIncludes: true IncludeIsMainSourceRegex: '(postgres\.h)$' IncludeCategories: - Regex: '^<.*\.h>' Priority: 1 SortPriority: 1 - Regex: 'postgres.h' Priority: 2 SortPriority: 2 - Regex: '.*' Priority: 3 SortPriority: 3 I'm guessing that you already have something like this (I believe I copied it from somebody else). This config is respected by clangd whenever it adds a header file (whether it's fully automatic or whether it's via a "missing header" quick-fix). -- Peter Geoghegan