Обсуждение: More pgindent follies

Поиск
Список
Период
Сортировка

More pgindent follies

От
Tom Lane
Дата:
As long as you're fixing bugs in pgindent, here are some more follies
exhibited by the most recent pgindent run.  Some of these bugs have
been there for awhile, but at least one (the removal of space before
a same-line comment) seems to be new as of the most recent run.

The examples are all taken from 

$ cd .../src/backend/utils/adt/
$ cvs diff -c -r1.85 -r1.86 selfuncs.c

but there are many similar cases elsewhere.

1. I can sort of understand inserting space before an #ifdef (though I do
not agree with it), but why before #endif?  Why does the first example
insert a blank line *only* before the #endif and not its matching #if?
If you're trying to set off the #if block from surrounding code, seems
like a blank line *after* the #endif would help, not one before.  But
none of the here-inserted blank lines seem to me to improve readability;
I'd vote for not making any such changes.

***************
*** 648,653 ****
--- 653,659 ---- { #ifdef NOT_USED                    /* see neqjoinsel() before removing me! */     Oid
opid= PG_GETARG_OID(0);
 
+  #endif     Oid            relid1 = PG_GETARG_OID(1);     AttrNumber    attno1 = PG_GETARG_INT16(2);
***************
*** 1078,1087 ****
--- 1091,1102 ---- convert_string_datum(Datum value, Oid typid) {     char       *val;
+  #ifdef USE_LOCALE     char       *xfrmstr;     size_t        xfrmsize;     size_t        xfrmlen;
+  #endif      switch (typid)
***************

2. Here are two examples of a long-standing bug: pgindent frequently
(but not always) mis-indents the first 'case' line(s) of a switch.
I don't agree with its indentation of block comments between case
entries, either.

***************
*** 845,892 ****     switch (valuetypid)     { 
!         /*
!          * Built-in numeric types
!          */
!         case BOOLOID:
!         case INT2OID:
!         case INT4OID:
!         case INT8OID:
!         case FLOAT4OID:
!         case FLOAT8OID:
!         case NUMERICOID:
!         case OIDOID:
!         case REGPROCOID:             *scaledvalue = convert_numeric_to_scalar(value, valuetypid);
*scaledlobound= convert_numeric_to_scalar(lobound, boundstypid);             *scaledhibound =
convert_numeric_to_scalar(hibound,boundstypid);             return true; 
 
!         /*
!          * Built-in string types
!          */         case CHAROID:         case BPCHAROID:         case VARCHAROID:
--- 851,898 ----     switch (valuetypid)     { 
!             /*
!              * Built-in numeric types
!              */
!             case BOOLOID:
!             case INT2OID:
!             case INT4OID:
!             case INT8OID:
!             case FLOAT4OID:
!             case FLOAT8OID:
!             case NUMERICOID:
!             case OIDOID:
!             case REGPROCOID:             *scaledvalue = convert_numeric_to_scalar(value, valuetypid);
*scaledlobound= convert_numeric_to_scalar(lobound, boundstypid);             *scaledhibound =
convert_numeric_to_scalar(hibound,boundstypid);             return true; 
 
!             /*
!              * Built-in string types
!              */         case CHAROID:         case BPCHAROID:         case VARCHAROID:
***************
*** 911,917 **** {     switch (typid)     {
!         case BOOLOID:             return (double) DatumGetBool(value);         case INT2OID:             return
(double)DatumGetInt16(value);
 
--- 917,923 ---- {     switch (typid)     {
!             case BOOLOID:             return (double) DatumGetBool(value);         case INT2OID:             return
(double)DatumGetInt16(value);
 
***************

3. This is new misbehavior as of the last pgindent run: whitespace
between a statement and a comment on the same line is sometimes removed.

***************
*** 1120,1126 ****  #ifdef USE_LOCALE     /* Guess that transformed string is not much bigger than original */
!     xfrmsize = strlen(val) + 32;        /* arbitrary pad value here... */     xfrmstr = (char *) palloc(xfrmsize);
xfrmlen = strxfrm(xfrmstr, val, xfrmsize);     if (xfrmlen >= xfrmsize)
 
--- 1137,1143 ----  #ifdef USE_LOCALE     /* Guess that transformed string is not much bigger than original */
!     xfrmsize = strlen(val) + 32;/* arbitrary pad value here... */     xfrmstr = (char *) palloc(xfrmsize);
xfrmlen= strxfrm(xfrmstr, val, xfrmsize);     if (xfrmlen >= xfrmsize)
 
***************

4. This breaking of a comment attached to a #define scares me.
Apparently gcc still thinks the comment is valid, but it seems to me
that this is making not-very-portable assumptions about the behavior of
the preprocessor.  I always thought that you needed to use backslashes
to continue a preprocessor command onto the next line reliably.

***************
*** 1691,1705 ****  #define FIXED_CHAR_SEL    0.04    /* about 1/25 */ #define CHAR_RANGE_SEL    0.25
! #define ANY_CHAR_SEL    0.9        /* not 1, since it won't match end-of-string */ #define FULL_WILDCARD_SEL 5.0
#definePARTIAL_WILDCARD_SEL 2.0 
 
--- 1718,1733 ----  #define FIXED_CHAR_SEL    0.04    /* about 1/25 */ #define CHAR_RANGE_SEL    0.25
! #define ANY_CHAR_SEL    0.9        /* not 1, since it won't match
!                                  * end-of-string */ #define FULL_WILDCARD_SEL 5.0 #define PARTIAL_WILDCARD_SEL 2.0 
***************

5. Here's an interesting one: why was the "return" line misindented?
I don't particularly agree with the handling of the comments on the
#else and #endif lines either... they're not even mutually consistent,
let alone stylistically good.

***************
*** 1904,1912 ****     else         result = false;     return (bool) result;
! #else /* not USE_LOCALE */
!     return true;                /* We must be in C locale, which is OK */
! #endif /* USE_LOCALE */ }  /*
--- 1935,1943 ----     else         result = false;     return (bool) result;
! #else                            /* not USE_LOCALE */
!                 return true;    /* We must be in C locale, which is OK */
! #endif     /* USE_LOCALE */ }  /*
***************
        regards, tom lane


Re: More pgindent follies

От
Bruce Momjian
Дата:
OK, I have fixes for all of these.  BSD indent clearly has some bugs,
and ironically, the indent source code is quite ugly.  The best way to
work around them is with pre/post processing, which we already do.

I just checked FreeBSD and NetBSD and they don't seem to have added too
much to BSD indent it since it left Berkeley.  They have upped the
keyword limit to 1000, but that is still too small for us, and I don't
see any other major work being done.

Looks like GNU indent development was picked up in 1999 again, so it may
be worth another look:

-r--r--r--   1 34       21        160411 Feb  3  1994 indent-1.9.1.tar.gz
-r--r--r--   1 34       21        143307 May 27  1999 indent-1.10.0.tar.gz
-r--r--r--   1 34       21        183160 Jul  4  1999 indent-2.1.0.tar.gz
-r--r--r--   1 34       21        183999 Jul 15  1999 indent-2.1.1.tar.gz
-r--r--r--   1 34       21        186287 Jul 24  1999 indent-2.2.0.tar.gz
-r--r--r--   1 34       21        191303 Sep 25  1999 indent-2.2.1.tar.gz
-r--r--r--   1 34       21        192872 Sep 28  1999 indent-2.2.2.tar.gz
-r--r--r--   1 34       21        198319 Oct  1  1999 indent-2.2.3.tar.gz
-r--r--r--   1 34       21        197332 Nov  4  1999 indent-2.2.4.tar.gz
-r--r--r--   1 34       21        203764 Jan 16  2000 indent-2.2.5.tar.gz
-r--r--r--   1 34       21        222510 Nov 17  2000 indent-2.2.6.tar.gz

OK, here are my comments.


> As long as you're fixing bugs in pgindent, here are some more follies
> exhibited by the most recent pgindent run.  Some of these bugs have
> been there for awhile, but at least one (the removal of space before
> a same-line comment) seems to be new as of the most recent run.
> 

Glad to fix these.  I hacked together pgindent just to match what I
thought looked good, so if people have things that look bad, let me
know.


> The examples are all taken from 
> 
> $ cd .../src/backend/utils/adt/
> $ cvs diff -c -r1.85 -r1.86 selfuncs.c

Great that you found them all in one file, or bad that they all existed
in one file.  :-)


> but there are many similar cases elsewhere.
> 
> 1. I can sort of understand inserting space before an #ifdef (though I do
> not agree with it), but why before #endif?  Why does the first example
> insert a blank line *only* before the #endif and not its matching #if?
> If you're trying to set off the #if block from surrounding code, seems
> like a blank line *after* the #endif would help, not one before.  But
> none of the here-inserted blank lines seem to me to improve readability;
> I'd vote for not making any such changes.
> 
> ***************
> *** 648,653 ****
> --- 653,659 ----
>   {
>   #ifdef NOT_USED                    /* see neqjoinsel() before removing me! */
>       Oid            opid = PG_GETARG_OID(0);
> + 
>   #endif
>       Oid            relid1 = PG_GETARG_OID(1);
>       AttrNumber    attno1 = PG_GETARG_INT16(2);
> ***************
> *** 1078,1087 ****
> --- 1091,1102 ----
>   convert_string_datum(Datum value, Oid typid)
>   {
>       char       *val;
> + 
>   #ifdef USE_LOCALE
>       char       *xfrmstr;
>       size_t        xfrmsize;
>       size_t        xfrmlen;
> + 
>   #endif
>   
>       switch (typid)
> ***************

Fixed with 'awk' script to remove blank line above #endif.

> 
> 2. Here are two examples of a long-standing bug: pgindent frequently
> (but not always) mis-indents the first 'case' line(s) of a switch.
> I don't agree with its indentation of block comments between case
> entries, either.
> 
> ***************
> *** 845,892 ****
>       switch (valuetypid)
>       {
>   
> !         /*
> !          * Built-in numeric types
> !          */
> !         case BOOLOID:
> !         case INT2OID:
> !         case INT4OID:
> !         case INT8OID:
> !         case FLOAT4OID:
> !         case FLOAT8OID:
> !         case NUMERICOID:
> !         case OIDOID:
> !         case REGPROCOID:
>               *scaledvalue = convert_numeric_to_scalar(value, valuetypid);
>               *scaledlobound = convert_numeric_to_scalar(lobound, boundstypid);
>               *scaledhibound = convert_numeric_to_scalar(hibound, boundstypid);
>               return true;
>   
> !         /*
> !          * Built-in string types
> !          */
>           case CHAROID:
>           case BPCHAROID:
>           case VARCHAROID:
> --- 851,898 ----
>       switch (valuetypid)
>       {
>   
> !             /*
> !              * Built-in numeric types
> !              */
> !             case BOOLOID:
> !             case INT2OID:
> !             case INT4OID:
> !             case INT8OID:
> !             case FLOAT4OID:
> !             case FLOAT8OID:
> !             case NUMERICOID:
> !             case OIDOID:
> !             case REGPROCOID:
>               *scaledvalue = convert_numeric_to_scalar(value, valuetypid);
>               *scaledlobound = convert_numeric_to_scalar(lobound, boundstypid);
>               *scaledhibound = convert_numeric_to_scalar(hibound, boundstypid);
>               return true;
>   
> !             /*
> !              * Built-in string types
> !              */
>           case CHAROID:
>           case BPCHAROID:
>           case VARCHAROID:
> ***************
> *** 911,917 ****
>   {
>       switch (typid)
>       {
> !         case BOOLOID:
>               return (double) DatumGetBool(value);
>           case INT2OID:
>               return (double) DatumGetInt16(value);
> --- 917,923 ----
>   {
>       switch (typid)
>       {
> !             case BOOLOID:
>               return (double) DatumGetBool(value);
>           case INT2OID:
>               return (double) DatumGetInt16(value);
> ***************

This is actually caused by functions that do not define local variables
but start with switch:

intfunc(int x){    switch (x)    {        case 1:        case 2:

in these cases, case 1 and case2 get indented too far.  They actually
get indented as though they were variable names, which is clearly wrong.
BSD indent has a bug in such cases, so the trick was to add:
int pgindent_func_no_var_fix;

before indent is run, and after remove it and an optional blank line if
it was the only variable definition.

> 
> 3. This is new misbehavior as of the last pgindent run: whitespace
> between a statement and a comment on the same line is sometimes removed.
> 
> ***************
> *** 1120,1126 ****
>   
>   #ifdef USE_LOCALE
>       /* Guess that transformed string is not much bigger than original */
> !     xfrmsize = strlen(val) + 32;        /* arbitrary pad value here... */
>       xfrmstr = (char *) palloc(xfrmsize);
>       xfrmlen = strxfrm(xfrmstr, val, xfrmsize);
>       if (xfrmlen >= xfrmsize)
> --- 1137,1143 ----
>   
>   #ifdef USE_LOCALE
>       /* Guess that transformed string is not much bigger than original */
> !     xfrmsize = strlen(val) + 32;/* arbitrary pad value here... */
>       xfrmstr = (char *) palloc(xfrmsize);
>       xfrmlen = strxfrm(xfrmstr, val, xfrmsize);
>       if (xfrmlen >= xfrmsize)
> ***************


This is happening because it has landed on a tab stop and isn't adding
another one.  I have added a 'sed' line to properly push these to the
next tab stop.

> 
> 4. This breaking of a comment attached to a #define scares me.
> Apparently gcc still thinks the comment is valid, but it seems to me
> that this is making not-very-portable assumptions about the behavior of
> the preprocessor.  I always thought that you needed to use backslashes
> to continue a preprocessor command onto the next line reliably.
> 
> ***************
> *** 1691,1705 ****
>   
>   #define FIXED_CHAR_SEL    0.04    /* about 1/25 */
>   #define CHAR_RANGE_SEL    0.25
> ! #define ANY_CHAR_SEL    0.9        /* not 1, since it won't match end-of-string */
>   #define FULL_WILDCARD_SEL 5.0
>   #define PARTIAL_WILDCARD_SEL 2.0
>   
> --- 1718,1733 ----
>   
>   #define FIXED_CHAR_SEL    0.04    /* about 1/25 */
>   #define CHAR_RANGE_SEL    0.25
> ! #define ANY_CHAR_SEL    0.9        /* not 1, since it won't match
> !                                  * end-of-string */
>   #define FULL_WILDCARD_SEL 5.0
>   #define PARTIAL_WILDCARD_SEL 2.0
>   
> ***************

I don't see the problem here.  My assumption is that the comment is not
part of the define, right?


> 
> 5. Here's an interesting one: why was the "return" line misindented?
> I don't particularly agree with the handling of the comments on the
> #else and #endif lines either... they're not even mutually consistent,
> let alone stylistically good.
> 
> ***************
> *** 1904,1912 ****
>       else
>           result = false;
>       return (bool) result;
> ! #else /* not USE_LOCALE */
> !     return true;                /* We must be in C locale, which is OK */
> ! #endif /* USE_LOCALE */
>   }
>   
>   /*
> --- 1935,1943 ----
>       else
>           result = false;
>       return (bool) result;
> ! #else                            /* not USE_LOCALE */
> !                 return true;    /* We must be in C locale, which is OK */
> ! #endif     /* USE_LOCALE */
>   }
>   
>   /*
> ***************

Same cause as #2 (switch/case indent).  Fixed.

I will cvs commit the pgindent changes, and send a diff of selfuncs.c to
patches so people can see the fixes.   Fixing the entire tree will have
to wait for 7.2 beta.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: More pgindent follies

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> 4. This breaking of a comment attached to a #define scares me.
>> 
>> ***************
>> *** 1691,1705 ****
>> 
>> #define FIXED_CHAR_SEL    0.04    /* about 1/25 */
>> #define CHAR_RANGE_SEL    0.25
>> ! #define ANY_CHAR_SEL    0.9        /* not 1, since it won't match end-of-string */
>> #define FULL_WILDCARD_SEL 5.0
>> #define PARTIAL_WILDCARD_SEL 2.0
>> 
>> --- 1718,1733 ----
>> 
>> #define FIXED_CHAR_SEL    0.04    /* about 1/25 */
>> #define CHAR_RANGE_SEL    0.25
>> ! #define ANY_CHAR_SEL    0.9        /* not 1, since it won't match
>> !                                  * end-of-string */
>> #define FULL_WILDCARD_SEL 5.0
>> #define PARTIAL_WILDCARD_SEL 2.0
>> 
>> ***************

> I don't see the problem here.  My assumption is that the comment is not
> part of the define, right?

Well, that's the question.  ANSI C requires comments to be replaced by
whitespace before preprocessor commands are detected/executed, but there
was an awful lot of variation in preprocessor behavior before ANSI.
I suspect there are still preprocessors out there that might misbehave
on this input --- for example, by leaving the text "* end-of-string */"
present in the preprocessor output.  Now we still go to considerable
lengths to support not-quite-ANSI preprocessors.  I don't like the idea
that all the work done by configure and c.h in that direction might be
wasted because of pgindent carelessness.
        regards, tom lane


Re: More pgindent follies

От
Bruce Momjian
Дата:
> > I don't see the problem here.  My assumption is that the comment is not
> > part of the define, right?
> 
> Well, that's the question.  ANSI C requires comments to be replaced by
> whitespace before preprocessor commands are detected/executed, but there
> was an awful lot of variation in preprocessor behavior before ANSI.
> I suspect there are still preprocessors out there that might misbehave
> on this input --- for example, by leaving the text "* end-of-string */"
> present in the preprocessor output.  Now we still go to considerable
> lengths to support not-quite-ANSI preprocessors.  I don't like the idea
> that all the work done by configure and c.h in that direction might be
> wasted because of pgindent carelessness.

I agree, but in a certain sense, we would have found those compilers
already.  This is not new behavour as far as I know, and clearly this
would throw a compiler error.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: More pgindent follies

От
Tom Lane
Дата:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I agree, but in a certain sense, we would have found those compilers
> already.  This is not new behavour as far as I know, and clearly this
> would throw a compiler error.

[ Digs around ... ]  OK, you're right, it's not new behavior; we have
instances of similarly-split comments at least back to REL6_5.  I guess
my fears of portability problems are unfounded.
        regards, tom lane


Re: More pgindent follies

От
ncm@zembu.com (Nathan Myers)
Дата:
On Wed, May 23, 2001 at 11:58:51AM -0400, Bruce Momjian wrote:
> > > I don't see the problem here.  My assumption is that the comment is not
> > > part of the define, right?
> > 
> > Well, that's the question.  ANSI C requires comments to be replaced by
> > whitespace before preprocessor commands are detected/executed, but there
> > was an awful lot of variation in preprocessor behavior before ANSI.
> > I suspect there are still preprocessors out there that might misbehave
> > on this input --- for example, by leaving the text "* end-of-string */"
> > present in the preprocessor output.  Now we still go to considerable
> > lengths to support not-quite-ANSI preprocessors.  I don't like the idea
> > that all the work done by configure and c.h in that direction might be
> > wasted because of pgindent carelessness.
> 
> I agree, but in a certain sense, we would have found those compilers
> already.  This is not new behavour as far as I know, and clearly this
> would throw a compiler error.

This is good news!

Maybe this process can be formalized.  That is, each official release 
migh contain a source file with various "modern" constructs which we 
suspect might break old compilers.

A comment block at the top requests that any breakage be reported.

A configure option would allow a user to avoid compiling it, and a
comment in the file would explain how to use the option.  After a
major release, any modern construct that caused no trouble in the 
last release is considered OK to use.

This process makes it easy to leave behind obsolete language 
restrictions: if you wonder if it's OK now to use a feature that once 
broke some crufty platform, drop it in modern.c and forget about it.  
After the next release, you know the answer.

Nathan Myers
ncm@zembu.com


Re: More pgindent follies

От
Tom Lane
Дата:
ncm@zembu.com (Nathan Myers) writes:
> This is good news!

> Maybe this process can be formalized.  That is, each official release 
> migh contain a source file with various "modern" constructs which we 
> suspect might break old compilers.

I have no objection to this, if the process *is* formalized --- that
is, we explicitly know and agree to probing for certain obsolescent
constructs in each release.  The thing that bothered me about this
was that pgindent was pushing the envelope without any explicit
discussion or advance knowledge.

There's plenty of historical cruft in PG that I'd be glad to get rid
of, if we can satisfy ourselves that it's no longer needed for any
platform of interest.  It's "stealth" obsolescence checks that bother
me ;-)

> After a major release, any modern construct that caused no trouble in
> the last release is considered OK to use.

Probably need to allow a couple major releases, considering that we
see lots of people migrating from not-the-last release.  But that's a
quibble.  My point is we need an explicit debate about the risks and
benefits of each change.  Finding out two years later that a broken tool
was doing the experiment without our knowledge is not cool.
        regards, tom lane


Re: More pgindent follies

От
Bruce Momjian
Дата:
> I have no objection to this, if the process *is* formalized --- that
> is, we explicitly know and agree to probing for certain obsolescent
> constructs in each release.  The thing that bothered me about this
> was that pgindent was pushing the envelope without any explicit
> discussion or advance knowledge.

Actually, as can be seen from the recent use of readdir() by WAL, often
new features/constructs get in when someone adds them and they get
through release testing.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026