Discussion: Establishing remote connections is slow
Hi,

I have a very weird problem related to establishing remote connections to PostgreSQL server and hopefully someone can give me some hints how can I debug this.

The essence is that establishing remote connection takes anywhere from 10 to 30 seconds. Once connected, the queries are fast - it's just establishing new connection that takes ages. This problem is not applicable to establishing local connections: running psql command from the local machine takes no time to connect, same applies if a client connects to the PostgreSQL via ssh tunnel. Immediately after restarting the PostgreSQL daemon, the problem temporarily goes away, but it resurfaces later.

Things we have tried:
- doing all sorts of DNS queries against the connecting client IP -- seems to be fine, DNS resolution takes no time;
- enabling debug for HA (http://docs.oracle.com/cd/E19680-01/html/821-1534/fumuy.html#scrolltoc) -- debugging showed no problems. We were probing for the problem described at http://blogs.oracle.com/js/entry/the_nscd_does_not_cache;
- asking PostgreSQL to listen not only on the multipath IP (used for failover), but also on an ethernet interface. This is the most interesting. When a remote connection via the multipath IP is slow to establish, remote connections via the ethernet interface are still snappy.

At this point it is reasonable to think that the problem lies somewhere in the networking (multipath IP), and that may well be true. But we tried running a simple netcat server and client, and it was all instant via both interfaces (multipath and eth).

Can anyone suggest any ideas on how to debug this further?

Many thanks in advance.

Environment:
- Solaris 5.10 / Intel
- Sun cluster
- HA for PostgreSQL (http://docs.oracle.com/cd/E19680-01/html/821-1534/cacjgdbc.html#scrolltoc)
- PostgreSQL server version: 9.0.4

Regards,
Mindaugas
Mindaugas Žakšauskas <mindas@gmail.com> writes:
> I have a very weird problem related to establishing remote connections
> to PostgreSQL server and hopefully someone can give me some hints how
> can I debug this.

> The essence is that establishing remote connection takes anywhere from
> 10 to 30 seconds. Once connected, the queries are fast - it's just
> establishing new connection that takes ages.

Perhaps the problem is related to authentication - what auth mode are you using, and can you experiment with some other ones?

What I'd do to start debugging this is to get out a packet sniffer (wireshark or some such) and just observe the timings of packets sent and received by Postgres. This would at least give you a hint which step is the bottleneck.

> This problem is not
> applicable to establishing local connections: running psql command
> from the local machine takes no time to connect, same applies if a
> client connects to the PostgreSQL via ssh tunnel.

What about "psql -h localhost", ie physically local connection but via TCP not unix socket?

regards, tom lane
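Where a packet sniffer is not practical, a rough substitute for the timing check suggested above is to measure the connection phases from the client side. The sketch below is only an illustration and assumes Python 3 is available on the client; the function name `probe` is made up for this example. It opens a TCP connection, sends the protocol's SSLRequest message (an 8-byte packet carrying the request code 80877103), and waits for the server's single-byte answer. If the TCP handshake is instant but the first reply takes many seconds, the delay sits on the server side before authentication even starts. Because the probe disconnects right after the reply, the server may log an aborted connection attempt.

```python
import socket
import struct
import sys
import time

# SSLRequest: Int32 message length (8) followed by Int32 request code 80877103.
SSL_REQUEST = struct.pack("!II", 8, 80877103)


def probe(host, port=5432, timeout=60.0):
    """Time the TCP handshake and the wait for the server's first reply byte."""
    start = time.monotonic()
    sock = socket.create_connection((host, port), timeout=timeout)
    connected = time.monotonic()
    try:
        sock.sendall(SSL_REQUEST)
        # A server normally answers b"S" (SSL accepted) or b"N" (SSL refused).
        reply = sock.recv(1)
    finally:
        sock.close()
    answered = time.monotonic()
    return {
        "tcp_connect_s": connected - start,
        "first_reply_s": answered - connected,
        "reply": reply,
    }


# Usage (hypothetical file name): python probe.py <host> [port]
if __name__ == "__main__" and len(sys.argv) > 1:
    target_port = int(sys.argv[2]) if len(sys.argv) > 2 else 5432
    print(probe(sys.argv[1], target_port))
```

Running it once against the multipath address and once against the ethernet address gives directly comparable numbers for the two paths.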
Hi Tom,

Thanks for your reply.

> Perhaps the problem is related to authentication - what auth mode
> are you using, and can you experiment with some other ones?

Excerpt from my pg_hba.conf:
------------
local all all trust
host all all IP1/mask1 md5
host all all IP2/mask2 md5
------------
The IP/mask combinations correspond to the IP/subnet the client is connecting from.

Can you elaborate a bit on "experimenting"? I am not quite sure what changes could possibly make any difference. Also, given that connecting remotely via the standard ethernet IP address (rather than multipath) works fine, and that everything works fine shortly after a PostgreSQL restart, I can't see how this could be relevant.

> What I'd do to start debugging this is to get out a packet sniffer
> (wireshark or some such) and just observe the timings of packets sent
> and received by Postgres. This would at least give you a hint which
> step is the bottleneck.

I have done some truss (strace alternative for Solaris) debugging and it looks like the client just waits for the server side to respond. I can probably dig out where and when exactly it is waiting, but since I know very little about PostgreSQL internals, that won't help much. Wireshark is probably not an option as this all happens on a live server which is connected directly to a switch. I might check whether tcpdump is available, but chances are very limited.

> What about "psql -h localhost", ie physically local connection but
> via TCP not unix socket?

user@dbserver> psql -h 127.0.0.1 -p5432 -U user -W db

This works fast. But

user@dbserver> psql -h <IP> -p5432 -U user -W db

(where <IP> is the multipath interface)

This is slow! So it is definitely something network-related or something how PostgreSQL deals with multipath interface.

Regards,
Mindaugas
Mindaugas Žakšauskas <mindas@gmail.com> writes:
>> What about "psql -h localhost", ie physically local connection but
>> via TCP not unix socket?

> user@dbserver> psql -h 127.0.0.1 -p5432 -U user -W db
> This works fast. But
> user@dbserver> psql -h <IP> -p5432 -U user -W db
> (where <IP> is the multipath interface)
> This is slow! So it is definitely something network-related or
> something how PostgreSQL deals with multipath interface.

Hm. AFAIR postgres doesn't know anything particular about multipath interfaces --- it just listens where you tell it to. So I'm thinking this is a system-level issue. It still seems like it could be DNS lookup related though. Do you have log_hostname turned on, and if so does turning it off make a difference?

regards, tom lane
On Tue, Jan 17, 2012 at 3:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hm. AFAIR postgres doesn't know anything particular about multipath
> interfaces --- it just listens where you tell it to.

I was thinking the same, but PostgreSQL is the "first line to contact" and I somehow need to obtain a proof that this is indeed a system-level issue. My simple netcat experiments seem to suggest the opposite.

> So I'm thinking this is a system-level issue. It still seems like it
> could be DNS lookup related though. Do you have log_hostname turned on,
> and if so does turning it off make a difference?

log_hostname is turned off.

Thanks for your help!

Regards,
Mindaugas
Mindaugas Žakšauskas <mindas@gmail.com> wrote:
> The essence is that establishing remote connection takes anywhere
> from 10 to 30 seconds. Once connected, the queries are fast

The only time I've seen something similar, there was no reverse DNS entry to go from IP address to host name. Adding that corrected the issue. I would try that.

If that fixes it, the question would be whether PostgreSQL is doing an unnecessary reverse DNS lookup.

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Mindaugas Žakšauskas <mindas@gmail.com> wrote:
>> The essence is that establishing remote connection takes anywhere
>> from 10 to 30 seconds. Once connected, the queries are fast

> The only time I've seen something similar, there was no reverse DNS
> entry to go from IP address to host name. Adding that corrected the
> issue. I would try that.

> If that fixes it, the questions would be whether PostgreSQL is doing
> an unnecessary reverse DNS lookup.

Having log_hostname off is supposed to prevent us from attempting a reverse DNS lookup ... but it would be worth checking into whether one is happening anyway. (I would think though that such activity would be visible in strace/truss output. Perhaps you should turn log_hostname *on* and verify that you see the lookup activity in strace that wasn't there before.)

regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Mindaugas Žakšauskas <mindas@gmail.com> wrote:
>>> The essence is that establishing remote connection takes
>>> anywhere from 10 to 30 seconds. Once connected, the queries are
>>> fast
>
>> The only time I've seen something similar, there was no reverse
>> DNS entry to go from IP address to host name. Adding that
>> corrected the issue. I would try that.
>
>> If that fixes it, the questions would be whether PostgreSQL is
>> doing an unnecessary reverse DNS lookup.
>
> Having log_hostname off is supposed to prevent us from attempting
> a reverse DNS lookup ... but it would be worth checking into
> whether one is happening anyway. (I would think though that such
> activity would be visible in strace/truss output. Perhaps you
> should turn log_hostname *on* and verify that you see the lookup
> activity in strace that wasn't there before.)

Actually, where I've seen this sort of problem, it was the client code which was doing the unnecessary reverse DNS lookup. What controls this in psql?

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Actually, where I've seen this sort of problem, it was the client
> code which was doing the unnecessary reverse DNS lookup. What
> controls this in psql?

psql? AFAIR psql itself doesn't do any such thing. It's possible that certain libraries such as SSL or Kerberos might do an RDNS lookup internally, though. The OP showed he was using md5 (password) authentication, so we can discount authentication libraries, but I wonder whether openssl ever does DNS lookups, and if so how to control that. Mindaugas, are you using SSL, and if so can you turn it off and see whether things change? (It should be safe to do so at least on the "localhost" connection, even if you feel your network is insecure.)

regards, tom lane
On Tue, Jan 17, 2012 at 7:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> <..> Mindaugas, are you using SSL,
> and if so can you turn it off and see whether things change?
> (It should be safe to do so at least on the "localhost" connection,
> even if you feel your network is insecure.)

No, I am not using SSL; it is either disabled or the default setting is off anyway. This was one of the first things I checked. Moreover, SSL would make it hard to explain why it takes no time to establish connections immediately after a PostgreSQL restart and why it degrades later.

To respond to previous emails - we tried doing DNS lookups against the client host and they took no time.

Thanks for your ideas.

Regards,
Mindaugas
Hi,

if you try to connect with psql over your IP and it is slow, it is possible that you have a problem with reverse DNS lookup. To check, try running the nslookup tool and entering your IP address (on the same machine) - if it is a problem with DNS, it will resolve slowly as well. Also, try the traceroute/tracepath utilities with the same IP; maybe they will show you something strange.

--
Lukas
UAB nSoft
www.nsoft.lt

> On Tue, Jan 17, 2012 at 7:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> <..> Mindaugas, are you using SSL,
>> and if so can you turn it off and see whether things change?
>> (It should be safe to do so at least on the "localhost" connection,
>> even if you feel your network is insecure.)
>
> No, I am not using SSL; it is either disabled or the default setting
> is off anyway. This was one of the first things I have checked.
> Moreover, this would probably make it hard to explain why does it take
> no time to establish connections immediately after PostgreSQL restart
> and why it does it degrade later.
>
> To respond to previous emails - we tried doing DNS lookups against the
> client host and they took no time.
>
> Thanks for your ideas.
>
> Regards,
> Mindaugas
>
> --
> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-admin
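The reverse-lookup check suggested above can also be timed through the system resolver rather than with nslookup, which queries DNS servers directly and so does not exercise the local name service configuration (hosts file, caching daemon). The sketch below is an illustration only, assuming Python 3 is available on the database server; `time_reverse_lookup` is a made-up helper name, and nothing in this thread establishes that PostgreSQL itself performs such a lookup. It is meant to be run on the server with the client's IP address as the argument.

```python
import socket
import sys
import time


def time_reverse_lookup(ip, resolver=socket.gethostbyaddr, clock=time.monotonic):
    """Return (hostname or None, seconds spent) for one reverse lookup."""
    start = clock()
    try:
        hostname = resolver(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        hostname = None
    return hostname, clock() - start


# Usage (hypothetical file name): python rdns_time.py <client-ip> [<client-ip> ...]
if __name__ == "__main__":
    for address in sys.argv[1:]:
        name, seconds = time_reverse_lookup(address)
        print(f"{address}: {name or 'lookup failed'} in {seconds:.3f}s")
```

A lookup that fails only after a delay in the 10 to 30 second range would point at name resolution; a fast answer, or a fast failure, makes that explanation less likely.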
2012/1/17 Mindaugas Žakšauskas <mindas@gmail.com>
> On Tue, Jan 17, 2012 at 7:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> <..> Mindaugas, are you using SSL,
>> and if so can you turn it off and see whether things change?
>> (It should be safe to do so at least on the "localhost" connection,
>> even if you feel your network is insecure.)
>
> No, I am not using SSL; it is either disabled or the default setting
> is off anyway. This was one of the first things I have checked.
> Moreover, this would probably make it hard to explain why does it take
> no time to establish connections immediately after PostgreSQL restart
> and why it does it degrade later.
>
> To respond to previous emails - we tried doing DNS lookups against the
> client host and they took no time.

Try putting the hostnames and IP addresses in /etc/hosts ... first on the server (for the client) and then on the client (for the server).

Craig

> Thanks for your ideas.
>
> Regards,
> Mindaugas
Hi,

if you try to connect with psql over your IP and it is slow, it is possible that you have a problem with reverse DNS lookup. To check, try running the nslookup tool and entering your IP address (on the same machine) - if it is a problem with DNS, it will resolve slowly as well. Also, try the traceroute/tracepath utilities with the same IP; maybe they will show you something strange.

Lukas

> On Tue, Jan 17, 2012 at 3:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> Hm. AFAIR postgres doesn't know anything particular about multipath
>> interfaces --- it just listens where you tell it to.
>
> I was thinking the same, but PostgreSQL is the "first line to contact"
> and I somehow need to obtain a proof that this is indeed a
> system-level issue. My simple netcat experiments seem to suggest the
> opposite.
>
>> So I'm thinking this is a system-level issue. It still seems like it
>> could be DNS
>> lookup related though. Do you have log_hostname turned on, and if so
>> does turning it off make a difference?
>
> log_hostname is turned off.
>
> Thanks for your help!
>
> Regards,
> Mindaugas