Discussion: Tru64/Alpha problems

Tru64/Alpha problems

From

Andrew Dunstan

Date:

March 28, 2006, 12:58:52

Honda Shigehiro has diagnosed the longstanding problems with his
Tru64/Alpha buildfarm member (bear). See below.

First, it appears that there is a problem with the system getaddrinfo(),
which configure reports as usable, but turns out not to be. Our current
configure test checks the return value of getaddrinfo("", "", NULL,
NULL) but I am wondering if we should test for "localhost" instead of ""
as the first parameter.

Second, it appears that this platform apparently doesn't handle Infinity
and NaN well. The regression diffs are attached.

cheers

andrew

-------- Original Message --------
Subject:     Re: postgresql buildfarm member bear
Date:     Tue, 28 Mar 2006 21:53:15 +0900 (JST)
From:     Honda Shigehiro <fwif0083@mb.infoweb.ne.jp>
To:     andrew@dunslane.net
CC:     fwif0083@mb.infoweb.ne.jp
References:     <44229B69.2090909@dunslane.net>
<20060323.225736.41630581.fwif0083@mb.infoweb.ne.jp>
<4422ACED.7030506@dunslane.net>




I found the cause. Something seems to be wrong with Tru64's getaddrinfo.
(I use version 5.0, but according to a Google search the problem persists
through version 5.1B.) Until now I had used only Unix domain sockets.

So the server starts successfully with a Unix domain socket (e.g. make check).
But with "listen_addresses = 'localhost'", it fails with:
  LOG:  could not translate host name "localhost", service "5432" to address: servname not supported for ai_socktype

To solve this, I changed the build to use src/port/getaddrinfo.c.
(I have little knowledge of autoconf, so this is ugly...)
Is there a smarter way that does not require changing code?

(1) change configure script and run it
bash-2.05b$ diff configure.aaa configure
14651c14651
< #define HAVE_GETADDRINFO 1
---
> /* #define HAVE_GETADDRINFO 1 */

(2) run make command
It fails with some undefined symbols. After the failure, change directory
to src/port and type:
cc -std   -I../../src/port  -I../../src/include -I/usr/local/include -c getaddrinfo.c -o getaddrinfo.o
ar crs libpgport.a isinf.o getopt_long.o copydir.o dirmod.o exec.o noblock.o path.o pipe.o pgsleep.o pgstrcasecmp.o sprompt.o thread.o getaddrinfo.o
ar crs libpgport_srv.a isinf.o getopt_long.o copydir.o dirmod_srv.o exec_srv.o noblock.o path.o pipe.o pgsleep.o pgstrcasecmp.o sprompt.o thread_srv.o getaddrinfo.o

(3) re-run make command

(4) run make check and make installcheck
The float4 and float8 tests fail in both cases.




*** ./expected/float4.out    Thu Apr  7 10:51:40 2005
--- ./results/float4.out    Tue Mar 28 21:03:10 2006
***************
*** 35,69 ****
  ERROR:  invalid input syntax for type real: "123            5"
  -- special inputs
  SELECT 'NaN'::float4;
!  float4
! --------
!     NaN
! (1 row)
!
  SELECT 'nan'::float4;
!  float4
! --------
!     NaN
! (1 row)
!
  SELECT '   NAN  '::float4;
!  float4
! --------
!     NaN
! (1 row)
!
  SELECT 'infinity'::float4;
!   float4
! ----------
!  Infinity
! (1 row)
!
  SELECT '          -INFINiTY   '::float4;
!   float4
! -----------
!  -Infinity
! (1 row)
!
  -- bad special inputs
  SELECT 'N A N'::float4;
  ERROR:  invalid input syntax for type real: "N A N"
--- 35,54 ----
  ERROR:  invalid input syntax for type real: "123            5"
  -- special inputs
  SELECT 'NaN'::float4;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT 'nan'::float4;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT '   NAN  '::float4;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT 'infinity'::float4;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT '          -INFINiTY   '::float4;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  -- bad special inputs
  SELECT 'N A N'::float4;
  ERROR:  invalid input syntax for type real: "N A N"
***************
*** 72,90 ****
  SELECT ' INFINITY    x'::float4;
  ERROR:  invalid input syntax for type real: " INFINITY    x"
  SELECT 'Infinity'::float4 + 100.0;
! ERROR:  type "double precision" value out of range: overflow
  SELECT 'Infinity'::float4 / 'Infinity'::float4;
!  ?column?
! ----------
!       NaN
! (1 row)
!
  SELECT 'nan'::float4 / 'nan'::float4;
!  ?column?
! ----------
!       NaN
! (1 row)
!
  SELECT '' AS five, * FROM FLOAT4_TBL;
   five |     f1
  ------+-------------
--- 57,70 ----
  SELECT ' INFINITY    x'::float4;
  ERROR:  invalid input syntax for type real: " INFINITY    x"
  SELECT 'Infinity'::float4 + 100.0;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT 'Infinity'::float4 / 'Infinity'::float4;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT 'nan'::float4 / 'nan'::float4;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT '' AS five, * FROM FLOAT4_TBL;
   five |     f1
  ------+-------------

======================================================================

*** ./expected/float8.out    Thu Jun  9 06:15:29 2005
--- ./results/float8.out    Tue Mar 28 21:03:10 2006
***************
*** 35,57 ****
  ERROR:  invalid input syntax for type double precision: "123           5"
  -- special inputs
  SELECT 'NaN'::float8;
!  float8
! --------
!     NaN
! (1 row)
!
  SELECT 'nan'::float8;
!  float8
! --------
!     NaN
! (1 row)
!
  SELECT '   NAN  '::float8;
!  float8
! --------
!     NaN
! (1 row)
!
  SELECT 'infinity'::float8;
    float8
  ----------
--- 35,48 ----
  ERROR:  invalid input syntax for type double precision: "123           5"
  -- special inputs
  SELECT 'NaN'::float8;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT 'nan'::float8;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT '   NAN  '::float8;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT 'infinity'::float8;
    float8
  ----------
***************
*** 72,90 ****
  SELECT ' INFINITY    x'::float8;
  ERROR:  invalid input syntax for type double precision: " INFINITY    x"
  SELECT 'Infinity'::float8 + 100.0;
! ERROR:  type "double precision" value out of range: overflow
  SELECT 'Infinity'::float8 / 'Infinity'::float8;
!  ?column?
! ----------
!       NaN
! (1 row)
!
  SELECT 'nan'::float8 / 'nan'::float8;
!  ?column?
! ----------
!       NaN
! (1 row)
!
  SELECT '' AS five, * FROM FLOAT8_TBL;
   five |          f1
  ------+----------------------
--- 63,76 ----
  SELECT ' INFINITY    x'::float8;
  ERROR:  invalid input syntax for type double precision: " INFINITY    x"
  SELECT 'Infinity'::float8 + 100.0;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT 'Infinity'::float8 / 'Infinity'::float8;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT 'nan'::float8 / 'nan'::float8;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT '' AS five, * FROM FLOAT8_TBL;
   five |          f1
  ------+----------------------
***************
*** 342,348 ****
     SET f1 = FLOAT8_TBL.f1 * '-1'
     WHERE FLOAT8_TBL.f1 > '0.0';
  SELECT '' AS bad, f.f1 * '1e200' from FLOAT8_TBL f;
! ERROR:  type "double precision" value out of range: overflow
  SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f;
  ERROR:  result is out of range
  SELECT '' AS bad, ln(f.f1) from FLOAT8_TBL f where f.f1 = '0.0' ;
--- 328,335 ----
     SET f1 = FLOAT8_TBL.f1 * '-1'
     WHERE FLOAT8_TBL.f1 > '0.0';
  SELECT '' AS bad, f.f1 * '1e200' from FLOAT8_TBL f;
! ERROR:  floating-point exception
! DETAIL:  An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero.
  SELECT '' AS bad, f.f1 ^ '1e200' from FLOAT8_TBL f;
  ERROR:  result is out of range
  SELECT '' AS bad, ln(f.f1) from FLOAT8_TBL f where f.f1 = '0.0' ;

======================================================================
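
The diffs above show every NaN and Infinity input raising a floating-point exception instead of producing a value. A minimal C sketch of the behaviour the expected output relies on might look like the following. It assumes a C99 library whose strtod() accepts "NaN" and "infinity", and IEEE 754 arithmetic that does not trap; the function names are hypothetical and this is not code from the PostgreSQL tree. On Alpha, non-trapping IEEE behaviour usually has to be requested from the compiler (for example -ieee with the vendor cc or -mieee with gcc); whether that was the configuration issue on bear is an assumption, since the thread does not say.

```c
#include <math.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a literal such as "NaN" or "infinity" with
 * C99 strtod(), which accepts these spellings case-insensitively and
 * skips leading whitespace.
 */
double parse_special(const char *text)
{
    return strtod(text, NULL);
}

/*
 * Returns nonzero if the special values survive both parsing and
 * arithmetic without trapping. On a platform that raises SIGFPE for
 * these operations, the process would be signalled before returning.
 */
int ieee_specials_work(void)
{
    double nan_val = parse_special("NaN");
    double inf_val = parse_special("infinity");

    /* Infinity / Infinity is NaN under IEEE 754; volatile keeps the
     * division from being folded away at compile time. */
    volatile double quotient = inf_val / inf_val;

    return isnan(nan_val) && isinf(inf_val) && isnan(quotient);
}
```

Running ieee_specials_work() in a small test program, built with the same compiler flags as the server, would be one way to tell a compiler-flag problem apart from a problem in the float input code.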

Re: Tru64/Alpha problems

From

Tom Lane

Date:

March 28, 2006, 14:24:54

Andrew Dunstan <andrew@dunslane.net> writes:
> Honda Shigehiro has diagnosed the longstanding problems with his 
> Tru64/Alpha buildfarm member (bear). See below.

> First, it appears that there is a problem with the system getaddrinfo(), 
> which configure reports as usable, but turns out not to be. Our current 
> configure test checks the return value of getaddrinfo("", "", NULL, 
> NULL) but I am wondering if we should test for "localhost" instead of "" 
> as the first parameter.

Huh?  That's just an AC_TRY_LINK test, we don't actually execute it.
If we did, the test would fail on machines where resolution of "localhost"
is broken, which we already know is a not-so-rare disease ...

I'm not sure that I believe the "getaddrinfo doesn't work" diagnosis
anyway, seeing that bear gets through "make check" okay.  Wouldn't that
fail too if there were a problem there?

> Second, it appears that this platform apparently doesn't handle Infinity 
> and NaN well. The regression diffs are attached.

On the FPE front, it'd be useful to get a gdb traceback to see where the
SIGFPE is occurring.
        regards, tom lane
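
The distinction drawn above, between a configure test that only links and one that actually runs, can be made concrete with a small runtime probe. The sketch below is an assumption-laden illustration, not the configure test and not PostgreSQL code: probe_addrinfo is a hypothetical name, and the hints merely resemble what a server asking for a listening TCP address would typically pass.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical probe: resolve host/service for a listening TCP socket
 * and return getaddrinfo()'s result code (0 on success, an EAI_* value
 * on failure). extra_flags is OR'ed into ai_flags.
 */
int probe_addrinfo(const char *host, const char *service, int extra_flags)
{
    struct addrinfo hints;
    struct addrinfo *result = NULL;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;    /* TCP */
    hints.ai_flags = AI_PASSIVE | extra_flags;

    rc = getaddrinfo(host, service, &hints, &result);
    if (rc == 0)
        freeaddrinfo(result);
    return rc;
}
```

Calling probe_addrinfo("localhost", "5432", 0) and printing gai_strerror() of a nonzero result would presumably reproduce the reported message on an affected system. As noted above, such a run-time check also fails wherever "localhost" itself does not resolve, which is the argument against putting it in configure.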

Re: Tru64/Alpha problems

From

Andrew Dunstan

Date:

March 28, 2006, 14:53:31

Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>
>>Honda Shigehiro has diagnosed the longstanding problems with his 
>>Tru64/Alpha buildfarm member (bear). See below.
>
>>First, it appears that there is a problem with the system getaddrinfo(), 
>>which configure reports as usable, but turns out not to be. Our current 
>>configure test checks the return value of getaddrinfo("", "", NULL, 
>>NULL) but I am wondering if we should test for "localhost" instead of "" 
>>as the first parameter.
>
>Huh?  That's just an AC_TRY_LINK test, we don't actually execute it.
>If we did, the test would fail on machines where resolution of "localhost"
>is broken, which we already know is a not-so-rare disease ...
>
>I'm not sure that I believe the "getaddrinfo doesn't work" diagnosis
>anyway, seeing that bear gets through "make check" okay.  Wouldn't that
>fail too if there were a problem there?

Now that I look further into it, this machine was working just fine 
until we made a change in configure, allegedly to get things right on 
Tru64. The first build that went wrong was the one right after 
configure.in version 1.450. I see a report from Albert Chin that this 
patch worked, but the buildfarm member seems to provide counter-proof.


cheers

andrew

Re: Tru64/Alpha problems

From

Tom Lane

Date:

March 28, 2006, 15:02:56

Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> I'm not sure that I believe the "getaddrinfo doesn't work" diagnosis
>> anyway, seeing that bear gets through "make check" okay.  Wouldn't that
>> fail too if there were a problem there?

> Now that I look further into it, this machine was working just fine 
> until we made a change in configure, allegedly to get things right on 
> Tru64. The first build that went wrong was the one right after 
> configure.in version 1.450. I see a report from Albert Chin that this 
> patch worked, but the buildfarm member seems to provide counter-proof.

Ugh.  So probably it depends on just which version of Tru64 you're using
:-(.  Maybe earlier versions of Tru64 have a broken getaddrinfo and it's
fixed in later ones?  How would we tell the difference?
        regards, tom lane

Re: Tru64/Alpha problems

From

Andrew Dunstan

Date:

March 30, 2006, 21:17:48

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Tom Lane wrote:
>>> I'm not sure that I believe the "getaddrinfo doesn't work" diagnosis
>>> anyway, seeing that bear gets through "make check" okay.  Wouldn't that
>>> fail too if there were a problem there?
>
>> Now that I look further into it, this machine was working just fine 
>> until we made a change in configure, allegedly to get things right on 
>> Tru64. The first build that went wrong was the one right after 
>> configure.in version 1.450. I see a report from Albert Chin that this 
>> patch worked, but the buildfarm member seems to provide counter-proof.
>
> Ugh.  So probably it depends on just which version of Tru64 you're using
> :-(.  Maybe earlier versions of Tru64 have a broken getaddrinfo and it's
> fixed in later ones?  How would we tell the difference?

I have done some more digging on this. The buildfarm member had a couple 
of configuration issues which I have remedied, and which almost 
certainly account for the float test errors we saw. However, we still 
get an error when we try to start the installed s/w with the default 
listen_addresses:
 LOG:  could not translate host name "localhost", service "5832" to address: servname not supported for ai_socktype

Of course, this won't be seen with "make check", since it starts on Unix 
with listen_addresses='', which means we never even look for any sort of 
TCP addrinfo.

I found a hint on the web that we should use -D_SOCKADDR_LEN. I tried 
this, but got a link failure, complaining about recv and send. This man 
page extract explains:
 [Tru64 UNIX]   The recv() function is identical to the recvfrom()
 function with a zero-valued address_len parameter, and to the read()
 function if no flags are used.  For that reason the recv() function is
 disabled when 4.4BSD behavior is enabled; that is, when the
 _SOCKADDR_LEN compile-time option is defined.

I'd like to know some settings that we can use that will get Tru64 
cleanly through the buildfarm set. If no one offers any, I propose that 
we revert the getaddrinfo() test in configure and use our own on Tru64 
until they do.

cheers

andrew
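
The man page extract quoted above describes recv() as recvfrom() without an address. A short C sketch of that equivalence follows. The helper name recv_via_recvfrom is hypothetical, and the sketch only illustrates the relationship the man page describes; it is not a proposed fix, and it has not been checked against Tru64 built with _SOCKADDR_LEN.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: receive on a connected socket without calling
 * recv(), by giving recvfrom() no buffer for the sender's address.
 */
ssize_t recv_via_recvfrom(int fd, void *buf, size_t len, int flags)
{
    /* NULL address and NULL address length: the source address is not
     * returned, which is the behaviour recv() provides. */
    return recvfrom(fd, buf, len, flags, NULL, NULL);
}
```

On a connected stream socket this returns the same data a plain recv() call would, which may explain why a platform in 4.4BSD mode can leave recv() out of the link library.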

Re: Tru64/Alpha problems

From

Andrew Dunstan

Date:

April 5, 2006, 14:07:22

I wrote: 
>
> I have done some more digging on this. The buildfarm member had a 
> couple of configuration issues which I have remedied, and which almost 
> certainly account for the float test errors we saw. However, we still 
> get an error when we try to start the installed s/w with the default 
> listen_addresses:
>
>  LOG:  could not translate host name "localhost", service "5832" to 
> address: servname not supported for ai_socktype
>
> Of course, this won't be seen with "make check", since it starts on 
> Unix with listen_addresses='', which means we never even look for any 
> sort of TCP addrinfo.
>
> I found a hint on the web that we should use -D_SOCKADDR_LEN. I tried 
> this, but got a link failure, complaining about recv and send. This 
> man page extract explains:
>
>  [Tru64 UNIX]   The recv() function is identical to the recvfrom()
>  function with a zero-valued address_len parameter, and to the read()
>  function if no flags are used.  For that reason the recv() function is
>  disabled when 4.4BSD behavior is enabled; that is, when the
>  _SOCKADDR_LEN compile-time option is defined.
>
> I'd like to know some settings that we can use that will get Tru64 
> cleanly through the buildfarm set. If no one offers any, I propose that 
> we revert the getaddrinfo() test in configure and use our own on Tru64 
> until they do.
>

I have not had any response to this. Is there any objection to my 
reverting the configure changes for the head and 8.1 branches? If not I 
intend to do that around the end of the week.

cheers

andrew

Re: Tru64/Alpha problems

From

Tom Lane

Date:

April 5, 2006, 14:32:36

Andrew Dunstan <andrew@dunslane.net> writes:
>> I'd like to know some settings that we can use that will get Tru64 
>> cleanly through the buildfarm set. If no one offers any, I propose that 
>> we revert the getaddrinfo() test in configure and use our own on Tru64 
>> until they do.

> I have not had any response to this. Is there any objection to my 
> reverting the configure changes for the head and 8.1 branches?

Presumably, whoever was complaining beforehand will come back ...
but I don't remember who that was.
        regards, tom lane

Re: Tru64/Alpha problems

From

Hans-Jürgen Schönig

Date:

April 8, 2006, 06:26:16

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>
>>> I'd like to know some settings that we can use that will get Tru64 
>>> cleanly through the buildfarm set. If no one offers any, I propose that 
>>> we revert the getaddrinfo() test in configure and use our own on Tru64 
>>> until they do.
>
>> I have not had any response to this. Is there any objection to my 
>> reverting the configure changes for the head and 8.1 branches?
>
> Presumably, whoever was complaining beforehand will come back ...
> but I don't remember who that was.
>
>             regards, tom lane


I think the issue you are referring to comes from a Solaris report.
Some patch levels of Solaris have a seriously broken getaddrinfo(); in 
this case pg_hba.conf cannot be read anymore.
We got a similar report some time ago and did a simple configure tweak 
to make sure that the onboard function is used. It seems to happen only 
on some strange patch levels (god knows which ones).
Best regards,
    hans


-- 
Cybertec Geschwinde & Schönig GmbH
Schöngrabern 134; A-2020 Hollabrunn
Tel: +43/1/205 10 35 / 340
www.postgresql.at, www.cybertec.at
