Обсуждение: Why is PGStream.flush() taking so much time?
Due to performance problems profiling showed the PGStream.flush() - thats pg_output buffered Stream - takes the most time although we do not really send much sql-queries at that time, at least we get much more data than we send. System and database is also almost idle at that time. Thanks for any hint. Guido
Guido Fiala wrote: > Due to performance problems profiling showed the PGStream.flush() - thats > pg_output buffered Stream - takes the most time although we do not really > send much sql-queries at that time, at least we get much more data than we > send. System and database is also almost idle at that time. Can you show us the whole profile? I can't see why flush() should be expensive at all. Are you sure that you are not counting time spent blocked in native code as busy time? (hprof likes to do that) -O
Am Montag, 15. November 2004 20:15 schrieben Sie: > Guido Fiala wrote: > > Due to performance problems profiling showed the PGStream.flush() - thats > > pg_output buffered Stream - takes the most time although we do not really > > send much sql-queries at that time, at least we get much more data than > > we send. System and database is also almost idle at that time. > > Can you show us the whole profile? I can't see why flush() should be > expensive at all. Are you sure that you are not counting time spent > blocked in native code as busy time? (hprof likes to do that) Used jtreeprofiler for that, and yes, after explizitely including java.lang.Object things look different (although flush still shows up and some other expensive things are there too): method calls time [ms] Object.wait 109 6035.474 PGStream.ReceiveChar 789 1368.841 AbstractJdbc1ResultSet.findColumn 2923 425.492 PGStream.flush 80 416.933 PGStream.-init- 2 388.863 PGStream.ReceiveIntegerR 9418 353.866 PGStream.Receive 11452 199.189 Encoding.decodeUTF8 3953 117.751 Object.hashCode 4900 90.403 AbstractJdbc1ResultSet.toInt 1029 85.886 Encoding.encode 77 69.640 AbstractJdbc1Statement.replaceProcessing 66 60.507 AbstractJdbc1ResultSet.toBoolean 322 56.763 AbstractJdbc1ResultSet.toLong 535 50.175 Encoding.decode 6390 47.275 AbstractJdbc1ResultSet.checkResultSet 3820 43.495 AbstractJdbc1ResultSet.getFixedString 1726 38.374
Guido Fiala wrote: > Am Montag, 15. November 2004 20:15 schrieben Sie: > >>Guido Fiala wrote: >> >>>Due to performance problems profiling showed the PGStream.flush() - thats >>>pg_output buffered Stream - takes the most time although we do not really >>>send much sql-queries at that time, at least we get much more data than >>>we send. System and database is also almost idle at that time. >> >>Can you show us the whole profile? I can't see why flush() should be >>expensive at all. Are you sure that you are not counting time spent >>blocked in native code as busy time? (hprof likes to do that) > > Used jtreeprofiler for that, and yes, after explizitely including > java.lang.Object things look different (although flush still shows up and > some other expensive things are there too): I notice that jtreeprofiler appears to use the JVPMI GetCurrentThreadCpuTime() call to do its profiling measurements. I've found that this is quite unreliable in unpredicable ways -- you get random CPU time charged to your thread for no apparent reason. I gave up on building a profiler around it, it was so unreliable :( It seems that there are weird things that happens such as the full CPU cost of any GCs that occur being charged to the thread that happens to be doing the allocation that triggers the GC, which massively skews the profile. Anyway.. > method calls time [ms] > Object.wait 109 6035.474 There is no way that Object.wait() should take ~60ms per call. I don't trust this profile :( > PGStream.ReceiveChar 789 1368.841 Again this is highly suspect, all this does is read a single character from a buffered input stream and that should not take 2ms even if you're running in fully interpreted mode. > AbstractJdbc1ResultSet.findColumn 2923 425.492 Which driver version are you using? There was a recent fix to findColumn that should improve performance a lot if you are using findColumn() frequently on wide result sets. > PGStream.flush 80 416.933 Again, very suspect, since all this does is flush the output stream, which at worst should turn into a write call on the underlying socket. That should not take 5ms per call. [...] It would be useful to know what JVM, JVM version, JDBC driver version, and CPU/OS you are using for profiling.. but I really don't trust these profiling results so far. I've had some success with hprof's cpu=samples mode, some magic JVM options to turn off JIT inlining, and postprocessing of the output to eliminate time spent in methods that are known to be "mostly blocking" e.g. read(), select(). I can probably send you my Python analysis script if you want to go down that route. I've also had success with OProfile plus some nasty hacks to push information about JIT compilation into the OProfile output so you get symbols for JITed code that lives in anonymous mmapped regions.. but that is considerably less releasable. -O
Am Mittwoch, 17. November 2004 01:00 schrieben Sie: > Guido Fiala wrote: > > Used jtreeprofiler for that, and yes, after explizitely including > > java.lang.Object things look different (although flush still shows up and > > some other expensive things are there too): > > I notice that jtreeprofiler appears to use the JVPMI > GetCurrentThreadCpuTime() call to do its profiling measurements. I've > found that this is quite unreliable in unpredicable ways -- you get > random CPU time charged to your thread for no apparent reason. I gave up > on building a profiler around it, it was so unreliable :( It seems that > there are weird things that happens such as the full CPU cost of any GCs > that occur being charged to the thread that happens to be doing the > allocation that triggers the GC, which massively skews the profile. Oh! Didn't hear about that. I started using "hprof" but found the output hard to interpret (the top ~40 are just cryptically named? and took most time) Ok, currently running: AMD Duron 1GHz, 512 MB Ram 2.6.7-gentoo-r11 java -version "1.4.2_04 mixed mode", Sun (no special options to the jvm) postgresql-7.3.6-r2 jdbc-cvs-snapshot from 20040312 > > > method calls time [ms] > > Object.wait 109 6035.474 > > There is no way that Object.wait() should take ~60ms per call. I don't > trust this profile :( Well - if it waits 60ms for the socket? > > > PGStream.ReceiveChar 789 1368.841 > > Again this is highly suspect, all this does is read a single character > from a buffered input stream and that should not take 2ms even if you're > running in fully interpreted mode. > > > AbstractJdbc1ResultSet.findColumn 2923 425.492 > > Which driver version are you using? There was a recent fix to findColumn > that should improve performance a lot if you are using findColumn() > frequently on wide result sets. Ok, i almost certainly don't have this, one thing to try (but i have only around 10..20 columns in my ResultSets so far). > > > PGStream.flush 80 416.933 > > Again, very suspect, since all this does is flush the output stream, > which at worst should turn into a write call on the underlying socket. > That should not take 5ms per call. Exactly - i looked through the source code (nothing there but calls) , if theres something taking time its in th jvm/native-code or its just bogus. > > [...] > > It would be useful to know what JVM, JVM version, JDBC driver version, > and CPU/OS you are using for profiling.. but I really don't trust these > profiling results so far. > > I've had some success with hprof's cpu=samples mode, some magic JVM > options to turn off JIT inlining, and postprocessing of the output to > eliminate time spent in methods that are known to be "mostly blocking" > e.g. read(), select(). I can probably send you my Python analysis script > if you want to go down that route. I would appriaciate it - still looking for a reliable way to profile... How about java 1.5x ? Guido
Ok, some more try, here is what java -Xrunhprof:cpu=samples,depth=40 tells me (application kept busy by user input, over all it run about 30 seconds): 1 59.12% 59.12% 5891 556 sun.awt.motif.MToolkit.run 2 2.25% 61.36% 224 2142 java.net.SocketInputStream.socketRead0 3 1.71% 63.07% 170 2302 java.net.SocketInputStream.socketRead0 4 0.57% 63.64% 57 2292 java.net.SocketInputStream.socketRead0 5 0.55% 64.19% 55 1813 java.net.SocketOutputStream.socketWrite0 6 0.50% 64.70% 50 2036 java.net.SocketInputStream.socketRead0 7 0.48% 65.18% 48 2029 sun.awt.motif.MGlobalCursorManager.getCursorPos 8 0.40% 65.58% 40 549 sun.awt.motif.MToolkit.init 9 0.38% 65.96% 38 1566 sun.awt.font.NativeFontWrapper.getCharMetrics 10 0.30% 66.26% 30 1003 f.mb.setConnection 11 0.28% 66.54% 28 390 org.postgresql.jdbc2.Jdbc2Statement.createResultSet 12 0.22% 66.76% 22 1383 org.postgresql.jdbc2.Jdbc2Connection.getMetaData 13 0.22% 66.98% 22 2353 java.net.SocketInputStream.socketRead0 14 0.22% 67.21% 22 2290 java.net.SocketInputStream.socketRead0 15 0.21% 67.42% 21 362 org.postgresql.jdbc2.Jdbc2Connection.createStatement However, it took about 1..3 seconds after each keypress in the application until the data where displayed in the form (about 10..50 records with 10 columns average) - it feels so slow. And that although only the motif takes really some time.