Обсуждение: pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

Поиск
Список
Период
Сортировка

pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

От
Andres Freund
Дата:
ecpg: Output dir, source dir, stamp file argument for preproc/*.pl

This is in preparation for building postgres with meson / ninja.

When building with meson, commands are run at the root of the build tree. Add
an option to put build output into the appropriate place. This can be utilized
by src/tools/msvc/ for a minor simplification, which also provides some
coverage for the new option.

Add option to generate a timestamp for check_rules.pl, so that proper
dependencies on it having been run can be generated.

Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Discussion: https://postgr.es/m/5e216522-ba3c-f0e6-7f97-5276d0270029@enterprisedb.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/db0a272d123b8d7f4d4acbeb54f27682a566be83

Modified Files
--------------
src/interfaces/ecpg/preproc/Makefile       |  4 ++--
src/interfaces/ecpg/preproc/check_rules.pl | 24 ++++++++++++++++-------
src/interfaces/ecpg/preproc/parse.pl       | 31 +++++++++++++++++++++---------
src/tools/msvc/Solution.pm                 |  5 ++---
4 files changed, 43 insertions(+), 21 deletions(-)


Re: pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

От
Andres Freund
Дата:
Hi,

On 2022-07-18 19:56:18 +0000, Andres Freund wrote:
> ecpg: Output dir, source dir, stamp file argument for preproc/*.pl
> 
> This is in preparation for building postgres with meson / ninja.
> 
> When building with meson, commands are run at the root of the build tree. Add
> an option to put build output into the appropriate place. This can be utilized
> by src/tools/msvc/ for a minor simplification, which also provides some
> coverage for the new option.
> 
> Add option to generate a timestamp for check_rules.pl, so that proper
> dependencies on it having been run can be generated.

> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
> Discussion: https://postgr.es/m/5e216522-ba3c-f0e6-7f97-5276d0270029@enterprisedb.com

Crake is relaying a perlcritic error:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2022-07-18%2020%3A36%3A30
Jul 18 16:48:25 ./src/interfaces/ecpg/preproc/parse.pl: Bareword file handle opened at line 30, column 1.  See pages
202,204of PBP.  ([InputOutput::ProhibitBarewordFileHandles] Severity: 5)
 
Jul 18 16:48:25 ./src/interfaces/ecpg/preproc/parse.pl: Bareword file handle opened at line 31, column 1.  See pages
202,204of PBP.  ([InputOutput::ProhibitBarewordFileHandles] Severity: 5)
 

afaict that's bogus. It's unnecessary that the code uses "our" instead of
"my", but there's no bareword there and replacing our with my fixes that
complaint.

Am I misunderstanding something here?

Greetings,

Andres Freund



Re: pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> ecpg: Output dir, source dir, stamp file argument for preproc/*.pl

crake says that perlcritic doesn't like this:

Jul 18 16:48:25 ./src/interfaces/ecpg/preproc/parse.pl: Bareword file handle opened at line 30, column 1.  See pages
202,204of PBP.  ([InputOutput::ProhibitBarewordFileHandles] Severity: 5) 
Jul 18 16:48:25 ./src/interfaces/ecpg/preproc/parse.pl: Bareword file handle opened at line 31, column 1.  See pages
202,204of PBP.  ([InputOutput::ProhibitBarewordFileHandles] Severity: 5) 

That's bitching about

open(our $parserfh, '<', $parser) or die "could not open parser file $parser";
open(our $outfh, '>', $outfile) or die "could not open output file $outfile";

which doesn't look like a bareword to my non-perl-monk eye,
but what do I know?  Maybe you should be using "my" instead
of "our", because I don't see anything else about it that looks
different from other places in our code.

            regards, tom lane



Re: pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> Jul 18 16:48:25 ./src/interfaces/ecpg/preproc/parse.pl: Bareword file handle opened at line 30, column 1.  See pages
202,204of PBP.  ([InputOutput::ProhibitBarewordFileHandles] Severity: 5) 
> Jul 18 16:48:25 ./src/interfaces/ecpg/preproc/parse.pl: Bareword file handle opened at line 31, column 1.  See pages
202,204of PBP.  ([InputOutput::ProhibitBarewordFileHandles] Severity: 5) 

> afaict that's bogus. It's unnecessary that the code uses "our" instead of
> "my", but there's no bareword there and replacing our with my fixes that
> complaint.

Ah, I think I've got it.  Per "man perlfunc":

           "our" has the same scoping rules as "my" or "state", meaning that
           it is only valid within a lexical scope.  Unlike "my" and "state",
           which both declare new (lexical) variables, "our" only creates an
           alias to an existing variable: a package variable of the same name.

Since there's not actually any such package variable, the net effect is
that the first argument of open() is an undeclared name.  I can more or
less see why perl might treat that the same as a bareword, though I
definitely agree that this error message is more confusing than helpful.

            regards, tom lane



Re: pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

От
Andres Freund
Дата:
Hi,

On 2022-07-18 17:27:21 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Jul 18 16:48:25 ./src/interfaces/ecpg/preproc/parse.pl: Bareword file handle opened at line 30, column 1.  See
pages202,204 of PBP.  ([InputOutput::ProhibitBarewordFileHandles] Severity: 5)
 
> > Jul 18 16:48:25 ./src/interfaces/ecpg/preproc/parse.pl: Bareword file handle opened at line 31, column 1.  See
pages202,204 of PBP.  ([InputOutput::ProhibitBarewordFileHandles] Severity: 5)
 
> 
> > afaict that's bogus. It's unnecessary that the code uses "our" instead of
> > "my", but there's no bareword there and replacing our with my fixes that
> > complaint.
> 
> Ah, I think I've got it.  Per "man perlfunc":
> 
>            "our" has the same scoping rules as "my" or "state", meaning that
>            it is only valid within a lexical scope.  Unlike "my" and "state",
>            which both declare new (lexical) variables, "our" only creates an
>            alias to an existing variable: a package variable of the same name.

> Since there's not actually any such package variable, the net effect is
> that the first argument of open() is an undeclared name.  I can more or
> less see why perl might treat that the same as a bareword, though I
> definitely agree that this error message is more confusing than helpful.

That'd make some sense - but it doesn't look like perlcritic is digging that
deep. Even if I make sure that the our references an existing variable in
package scope, perlcritic's complaint is the the same. I suspect perlcritic
simply has an exception for 'my' somewhere...

[digs a little]

Yep:

/usr/share/perl5/Perl/Critic/Policy/InputOutput/ProhibitBarewordFileHandles.pm
    if ( $first_token->isa('PPI::Token::Word') ) {
        if ( ($first_token ne 'my') && ($first_token !~ m/^STD(?:IN|OUT|ERR)$/xms ) ) {
            return $self->violation( $DESC, $EXPL, $elem );
        }



I'll push the obvious fix in a bit.


I find the references to some book, with an abbreviation that I certainly
didn't know and which seems ambiguous even in perl context, not helpful.
--verbose 10 makes the above warning instead be:

Bareword file handle opened at line 31, column 1.
  InputOutput::ProhibitBarewordFileHandles (Severity: 5)
    Using bareword symbols to refer to file handles is particularly evil
    because they are global, and you have no idea if that symbol already
    points to some other file handle. You can mitigate some of that risk by
    `local'izing the symbol first, but that's pretty ugly. Since Perl 5.6,
    you can use an undefined scalar variable as a lexical reference to an
    anonymous filehandle. Alternatively, see the IO::Handle or IO::File or
    FileHandle modules for an object-oriented approach.

        open FH, '<', $some_file;           #not ok
        open my $fh, '<', $some_file;       #ok
        my $fh = IO::File->new($some_file); #ok

    There are three exceptions: STDIN, STDOUT and STDERR. These three
    standard filehandles are always package variables.


Perhaps we should tweak the default format in our perlcriticrc?

Greetings,

Andres Freund



Re: pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> That'd make some sense - but it doesn't look like perlcritic is digging that
> deep.

Meh, that's the other explanation :-(

> I'll push the obvious fix in a bit.

Cool.  While looking at this, I wondered if parse.pl's handy-dandy
little exception

no warnings 'uninitialized';

might be related to the issue.  Evidently it's not, but now that
I've noticed it I definitely think it's not up to project standard.
I took it out and fixed the half-dozen places where I got warnings,
as attached.  Seem like a good fix?

            regards, tom lane

diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl
index a15f563ad4..38548f24e6 100644
--- a/src/interfaces/ecpg/preproc/parse.pl
+++ b/src/interfaces/ecpg/preproc/parse.pl
@@ -14,7 +14,6 @@

 use strict;
 use warnings;
-no warnings 'uninitialized';
 use Getopt::Long;

 my $srcdir  = '.';
@@ -40,7 +39,8 @@ my $tokenmode             = 0;

 my (%buff, $infield, $comment, %tokens, %addons);
 my ($stmt_mode, @fields);
-my ($line,      $non_term_id);
+my $line = '';
+my $non_term_id;


 # some token have to be replaced by other symbols
@@ -195,6 +195,16 @@ sub main
         # Now split the line into individual fields
         my @arr = split(' ');

+        if (!@arr)
+        {
+            # empty line: in tokenmode 1, emit an empty line, else ignore
+            if ($tokenmode == 1)
+            {
+                add_to_buffer('orig_tokens', '');
+            }
+            next line;
+        }
+
         if ($arr[0] eq '%token' && $tokenmode == 0)
         {
             $tokenmode = 1;
@@ -341,7 +351,8 @@ sub main

             # Are we looking at a declaration of a non-terminal ?
             if (($arr[$fieldIndexer] =~ /[A-Za-z0-9]+:/)
-                || $arr[ $fieldIndexer + 1 ] eq ':')
+                || (   $fieldIndexer + 1 < scalar(@arr)
+                    && $arr[ $fieldIndexer + 1 ] eq ':'))
             {
                 $non_term_id = $arr[$fieldIndexer];
                 $non_term_id =~ tr/://d;
@@ -409,11 +420,13 @@ sub main
             if (   $copymode
                 && !$prec
                 && !$comment
+                && $fieldIndexer < scalar(@arr)
                 && length($arr[$fieldIndexer])
                 && $infield)
             {
                 if ($arr[$fieldIndexer] ne 'Op'
-                    && (   $tokens{ $arr[$fieldIndexer] } > 0
+                    && ((   defined $tokens{ $arr[$fieldIndexer] }
+                            && $tokens{ $arr[$fieldIndexer] } > 0)
                         || $arr[$fieldIndexer] =~ /'.+'/)
                     || $stmt_mode == 1)
                 {
@@ -472,11 +485,12 @@ sub include_addon
     my $rec = $addons{$block};
     return 0 unless $rec;

-    if ($rec->{type} eq 'rule')
+    my $rectype = (defined $rec->{type}) ? $rec->{type} : '';
+    if ($rectype eq 'rule')
     {
         dump_fields($stmt_mode, $fields, ' { ');
     }
-    elsif ($rec->{type} eq 'addon')
+    elsif ($rectype eq 'addon')
     {
         add_to_buffer('rules', ' { ');
     }
@@ -487,7 +501,7 @@ sub include_addon

     push(@{ $buff{$buffer} }, @{ $rec->{lines} });

-    if ($rec->{type} eq 'addon')
+    if ($rectype eq 'addon')
     {
         dump_fields($stmt_mode, $fields, '');
     }

Re: pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

От
Andres Freund
Дата:
Hi,

On 2022-07-18 18:06:22 -0400, Tom Lane wrote:
> While looking at this, I wondered if parse.pl's handy-dandy little exception
> 
> no warnings 'uninitialized';
> 
> might be related to the issue.  Evidently it's not, but now that
> I've noticed it I definitely think it's not up to project standard.

Yea, that doesn't seem great.


> I took it out and fixed the half-dozen places where I got warnings,
> as attached.

Grepping for 'no warnings' shows that check_rules.pl has the same
issue. Perhaps worth fixing at the same time, given how closely related they
are?


> Seem like a good fix?

Looks good to my not-perl-trained eyes.

Might be worth making sure the output files are the same before / after? You
probably already checked...

Greetings,

Andres Freund



Re: pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2022-07-18 18:06:22 -0400, Tom Lane wrote:
>> I took it out and fixed the half-dozen places where I got warnings,
>> as attached.

> Grepping for 'no warnings' shows that check_rules.pl has the same
> issue. Perhaps worth fixing at the same time, given how closely related they
> are?

Ah, I didn't think to look around.  I'll see what I can do there.

>> Seem like a good fix?

> Looks good to my not-perl-trained eyes.

Yeah, I'm more worried about non-idiomatic usage than whether
it works or not.

> Might be worth making sure the output files are the same before / after? You
> probably already checked...

Yup, I did.

            regards, tom lane



Re: pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

От
Tom Lane
Дата:
I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> Grepping for 'no warnings' shows that check_rules.pl has the same
>> issue. Perhaps worth fixing at the same time, given how closely related they
>> are?

> Ah, I didn't think to look around.  I'll see what I can do there.

Oh, that one's only hiding *one* sloppy spot.  However, while testing
it I realized that the Makefile rule has two serious deficiencies:

* check_rules.pl not listed as a prereq, so nothing happens if you
change it and say "make".

* check_rules.pl should be run first.  As-is, if it complains,
preproc.y has already been updated and so another run will succeed.

You might wanna check your meson conversion for the same bugs.

            regards, tom lane

diff --git a/src/interfaces/ecpg/preproc/Makefile b/src/interfaces/ecpg/preproc/Makefile
index ec2359810e..3f33f4638f 100644
--- a/src/interfaces/ecpg/preproc/Makefile
+++ b/src/interfaces/ecpg/preproc/Makefile
@@ -64,9 +64,9 @@ preproc.h: preproc.c

 preproc.c: BISONFLAGS += -d

-preproc.y: ../../../backend/parser/gram.y parse.pl ecpg.addons ecpg.header ecpg.tokens ecpg.trailer ecpg.type
-    $(PERL) $(srcdir)/parse.pl --srcdir $(srcdir) --parser $< --output $@
+preproc.y: ../../../backend/parser/gram.y parse.pl check_rules.pl ecpg.addons ecpg.header ecpg.tokens ecpg.trailer
ecpg.type
     $(PERL) $(srcdir)/check_rules.pl --srcdir $(srcdir) --parser $<
+    $(PERL) $(srcdir)/parse.pl --srcdir $(srcdir) --parser $< --output $@

 # generate keyword headers
 c_kwlist_d.h: c_kwlist.h $(GEN_KEYWORDLIST_DEPS)
diff --git a/src/interfaces/ecpg/preproc/check_rules.pl b/src/interfaces/ecpg/preproc/check_rules.pl
index 58a755f454..23a3741993 100644
--- a/src/interfaces/ecpg/preproc/check_rules.pl
+++ b/src/interfaces/ecpg/preproc/check_rules.pl
@@ -18,7 +18,6 @@

 use strict;
 use warnings;
-no warnings 'uninitialized';
 use Getopt::Long;

 my $srcdir  = '.';
@@ -142,7 +141,8 @@ while (<$parser_fh>)
             $in_rule = 0 if $arr[$fieldIndexer] eq ';';
         }
         elsif (($arr[$fieldIndexer] =~ '[A-Za-z0-9]+:')
-            || $arr[ $fieldIndexer + 1 ] eq ':')
+               || (   $fieldIndexer + 1 < $n
+                      && $arr[ $fieldIndexer + 1 ] eq ':'))
         {
             die "unterminated rule at grammar line $.\n"
               if $in_rule;
diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl
index a15f563ad4..38548f24e6 100644
--- a/src/interfaces/ecpg/preproc/parse.pl
+++ b/src/interfaces/ecpg/preproc/parse.pl
@@ -14,7 +14,6 @@

 use strict;
 use warnings;
-no warnings 'uninitialized';
 use Getopt::Long;

 my $srcdir  = '.';
@@ -40,7 +39,8 @@ my $tokenmode             = 0;

 my (%buff, $infield, $comment, %tokens, %addons);
 my ($stmt_mode, @fields);
-my ($line,      $non_term_id);
+my $line = '';
+my $non_term_id;


 # some token have to be replaced by other symbols
@@ -195,6 +195,16 @@ sub main
         # Now split the line into individual fields
         my @arr = split(' ');

+        if (!@arr)
+        {
+            # empty line: in tokenmode 1, emit an empty line, else ignore
+            if ($tokenmode == 1)
+            {
+                add_to_buffer('orig_tokens', '');
+            }
+            next line;
+        }
+
         if ($arr[0] eq '%token' && $tokenmode == 0)
         {
             $tokenmode = 1;
@@ -341,7 +351,8 @@ sub main

             # Are we looking at a declaration of a non-terminal ?
             if (($arr[$fieldIndexer] =~ /[A-Za-z0-9]+:/)
-                || $arr[ $fieldIndexer + 1 ] eq ':')
+                || (   $fieldIndexer + 1 < scalar(@arr)
+                    && $arr[ $fieldIndexer + 1 ] eq ':'))
             {
                 $non_term_id = $arr[$fieldIndexer];
                 $non_term_id =~ tr/://d;
@@ -409,11 +420,13 @@ sub main
             if (   $copymode
                 && !$prec
                 && !$comment
+                && $fieldIndexer < scalar(@arr)
                 && length($arr[$fieldIndexer])
                 && $infield)
             {
                 if ($arr[$fieldIndexer] ne 'Op'
-                    && (   $tokens{ $arr[$fieldIndexer] } > 0
+                    && ((   defined $tokens{ $arr[$fieldIndexer] }
+                            && $tokens{ $arr[$fieldIndexer] } > 0)
                         || $arr[$fieldIndexer] =~ /'.+'/)
                     || $stmt_mode == 1)
                 {
@@ -472,11 +485,12 @@ sub include_addon
     my $rec = $addons{$block};
     return 0 unless $rec;

-    if ($rec->{type} eq 'rule')
+    my $rectype = (defined $rec->{type}) ? $rec->{type} : '';
+    if ($rectype eq 'rule')
     {
         dump_fields($stmt_mode, $fields, ' { ');
     }
-    elsif ($rec->{type} eq 'addon')
+    elsif ($rectype eq 'addon')
     {
         add_to_buffer('rules', ' { ');
     }
@@ -487,7 +501,7 @@ sub include_addon

     push(@{ $buff{$buffer} }, @{ $rec->{lines} });

-    if ($rec->{type} eq 'addon')
+    if ($rectype eq 'addon')
     {
         dump_fields($stmt_mode, $fields, '');
     }

Re: pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

От
Andres Freund
Дата:
Hi,

On 2022-07-18 18:36:50 -0400, Tom Lane wrote:
> I wrote:
> > Andres Freund <andres@anarazel.de> writes:
> >> Grepping for 'no warnings' shows that check_rules.pl has the same
> >> issue. Perhaps worth fixing at the same time, given how closely related they
> >> are?
> 
> > Ah, I didn't think to look around.  I'll see what I can do there.
> 
> Oh, that one's only hiding *one* sloppy spot.  However, while testing
> it I realized that the Makefile rule has two serious deficiencies:
> 
> * check_rules.pl not listed as a prereq, so nothing happens if you
> change it and say "make".
> 
> * check_rules.pl should be run first.  As-is, if it complains,
> preproc.y has already been updated and so another run will succeed.

Good catches.


> You might wanna check your meson conversion for the same bugs.

Don't think it has. check_rules.pl and parse.pl are run by separate build
rules, with a stamp file marking the success of check_rules.pl. However, they
can currently run concurrently, with the preproc.c generation depending on
both.

I guess it'd be a tad cleaner if the dependency instead were to run
check_rules.pl before parse.pl, the degree of parallelism probably doesn't
matter here.

Both rules depend on the scripts.

Greetings,

Andres Freund



Re: pgsql: ecpg: Output dir, source dir, stamp file argument for preproc/*.

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2022-07-18 18:36:50 -0400, Tom Lane wrote:
> Don't think it has. check_rules.pl and parse.pl are run by separate build
> rules, with a stamp file marking the success of check_rules.pl. However, they
> can currently run concurrently, with the preproc.c generation depending on
> both.

That sounds fine, probably no need to change it.

            regards, tom lane