Discussion: psqlODBC and IPv6
Hi,

does anyone know why 08.03.0200 fixed connecting to a server
via an IPv6 socket? 08.03.0100 is in Fedora 9 and the connection is
very unreliable: https://bugzilla.redhat.com/show_bug.cgi?id=462312

The diff between .0100 and .0200 reveals only a thinko fix in socket.c:

=======================================
@@ -472,7 +485,7 @@ static int SOCK_wait_for_ready(SocketCla
 			tm.tv_sec = retry_count;
 			tm.tv_usec = 0;
 		}
-		ret = select((int) socket + 1, output ? NULL : &fds, output ? &fds : NULL, &except_fds, no_timeout ? NULL : &tm);
+		ret = select((int) sock->socket + 1, output ? NULL : &fds, output ? &fds : NULL, &except_fds, no_timeout ? NULL : &tm);
 		gerrno = SOCK_ERRNO;
 	} while (ret < 0 && EINTR == gerrno);
 	if (retry_count < 0)
=======================================

There is no "socket" variable in that function, nor one declared
globally; the name can only refer to the address of the socket(2)
function. But why would this "select(address + 1, ...)", whose first
argument is most likely larger than the number of open files, cause
protocol errors over IPv6 but not over IPv4? There seems to be nothing
IPv6-specific in the changes between 08.03.0100 and .0200.

Thanks in advance,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
* Zoltan Boszormenyi (zb@cybertec.at) wrote:
> does anyone know why 08.03.0200 fixed connecting to a server
> via an IPv6 socket? 08.03.0100 is in Fedora 9 and connection is
> very unreliable: https://bugzilla.redhat.com/show_bug.cgi?id=462312

Have you considered that the version of libpq might be different?

Thanks,

Stephen
Stephen Frost wrote:
> * Zoltan Boszormenyi (zb@cybertec.at) wrote:
>
>> does anyone know why 08.03.0200 fixed connecting to a server
>> via an IPv6 socket? 08.03.0100 is in Fedora 9 and connection is
>> very unreliable: https://bugzilla.redhat.com/show_bug.cgi?id=462312
>>
>
> Have you considered that the version of libpq might be different?
>
> Thanks,
>
> Stephen
>

No, I haven't, because libpq is the same, i.e. the system-provided
libpq.so.5.1 from PostgreSQL 8.3.3 in this case.
Only the psqlODBC version is different.
And psqlODBC was very vocal about rebasing itself on libpq
in the 08.xx.yyy series so that it automagically gets new protocol
features/fixes.
"psql -h ::1 ..." works without any problem,
"isql MyDSN ..." is stable with psqlODBC 08.03.0200,
unstable with 08.03.0100.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Zoltan Boszormenyi wrote:
> Stephen Frost wrote:
>
>> * Zoltan Boszormenyi (zb@cybertec.at) wrote:
>>
>>> does anyone know why 08.03.0200 fixed connecting to a server
>>> via an IPv6 socket? 08.03.0100 is in Fedora 9 and connection is
>>> very unreliable: https://bugzilla.redhat.com/show_bug.cgi?id=462312
>>>
>>
>> Have you considered that the version of libpq might be different?
>>
>> Thanks,
>>
>> Stephen
>>
>
> No, I haven't because libpq is the same, i.e. the system
> provided libpq.so.5.1 from PostgreSQL 8.3.3 in this case.
> Only the psqlODBC version is different.
> And psqlODBC was very vocal about rebasing itself to libpq
> in the 08.xx.yyy series so it automagically gets new protocol
> features/fixes.
> "psql -h ::1 ..." works without any problem,
> "isql MyDSN ..." is stable with psqlODBC 08.03.0200,
> unstable with 08.03.0100.
>

I just tested 08.03.0100 compiled with the change quoted
in my original mail. That single change made it work reliably
over IPv6. It must be a very subtle side effect of that typo.

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/
Zoltan Boszormenyi <zb@cybertec.at> writes:
> I just tested 08.03.0100 compiled with the difference quoted
> in my original mail. That single change made it work reliably
> over IPv6. It must be a very subtle side-effect of that typo.

Huh ... the IPv6 interaction is still obscure, but the .0100 code is
indubitably broken here. Now that I look at it, there isn't any local
or global variable named "socket" in this function. The compiler must
have interpreted "socket" as being the address of the socket() function!
... so who knows what that comes out as being? Proof is found by noting
the warning on that line:

[psqlodbc-08.03.0100]$ make socket.o
gcc -DHAVE_CONFIG_H -I. -I/usr/include -I/home/tgl/testversion/include -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -c socket.c
socket.c:178: warning: initialization from incompatible pointer type
socket.c: In function 'format_sockerr':
socket.c:198: warning: cast to pointer from integer of different size
socket.c: In function 'SOCK_wait_for_ready':
socket.c:475: warning: cast from pointer to integer of different size
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.c:455: warning: 'no_timeout' may be used uninitialized in this function

I guess one possible theory is that SOCK_wait_for_ready() is simply
broken in .0100, but the kernel happens not to return EWOULDBLOCK for
local IPv4 connections whereas it sometimes does for IPv6?

BTW, isn't anyone paying attention to compiler warnings in this code
base? It looks to me like at least two of the three other warnings
in this file are also evidence of genuine bugs. A look through the
build log shows an uncomfortably large number of nontrivial-looking
warnings in other files, too.

			regards, tom lane
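
The typo discussed in this thread can be reproduced in isolation. The following is a minimal sketch, not psqlODBC code: wait_for_ready(), nfds_from_function_address() and self_check() are hypothetical names invented for illustration, and errno stands in for the driver's SOCK_ERRNO. It shows the select() call shape from the fixed version, where the first argument is the descriptor plus one, next to what the .0100 expression effectively evaluated, namely the address of socket(2) truncated to an int.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable (output == 0) or
 * writable (output != 0). A negative timeout_sec means wait forever.
 * Returns select()'s result: > 0 ready, 0 timed out, < 0 error. */
static int wait_for_ready(int fd, int output, int timeout_sec)
{
    fd_set fds, except_fds;
    struct timeval tm;
    int ret;

    do {
        /* select() may modify the sets and the timeout, so rebuild
         * them on every retry. */
        FD_ZERO(&fds);
        FD_ZERO(&except_fds);
        FD_SET(fd, &fds);
        FD_SET(fd, &except_fds);
        tm.tv_sec = timeout_sec;
        tm.tv_usec = 0;
        /* First argument: highest descriptor in any set, plus one. */
        ret = select(fd + 1,
                     output ? NULL : &fds,
                     output ? &fds : NULL,
                     &except_fds,
                     timeout_sec < 0 ? NULL : &tm);
    } while (ret < 0 && errno == EINTR);
    return ret;
}

/* What "(int) socket" meant in .0100: with no variable named socket in
 * scope, the identifier is the socket(2) function itself, so the cast
 * yields part of a code address, unrelated to any open descriptor. */
static int nfds_from_function_address(void)
{
    return (int) (intptr_t) &socket;
}

/* Small usage check on a local socket pair. */
static int self_check(void)
{
    int sv[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    assert(wait_for_ready(sv[0], 1, 1) > 0);   /* writable at once */
    assert(wait_for_ready(sv[0], 0, 0) == 0);  /* nothing to read yet */
    assert(write(sv[1], "x", 1) == 1);
    assert(wait_for_ready(sv[0], 0, 1) > 0);   /* now readable */
    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

The value returned by nfds_from_function_address() depends on where the function happens to be loaded, so it can differ between builds and possibly between runs; the sketch deliberately makes no claim about how a given kernel reacts to it. Compiling with -Wall and treating the "cast from pointer to integer of different size" warning as an error would have flagged the original line.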