Thread: tuning/profiling

tuning/profiling

From
Ken Geis
Date:
Hi.  I just did some investigation into improving the performance of
the JDBC driver.  I want the best performance for my application, and I
would love to be able to make PostgreSQL more competitive.

I wrote a simple benchmark that would be doing an action common in my
application (fetch ~11000 rows from a table with a date, four
numeric(9,4) columns, and an integer.)  I ran this through a profiler
and found some hot spots.  In the end, I was able to squeeze out at
least a 50% improvement with a small number of changes to the driver code.

Before I submit patches, I must run my changes against the test suite
and other benchmarks.  Then I will break my patches into independent
pieces.  Some of my changes may be controversial because they may be
less robust than the current code: I am making some assumptions about
the server protocol (for example, that date strings will never contain
Unicode digits).
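
To illustrate the kind of assumption described above, here is a minimal sketch. It is not the actual patch, which does not appear in this thread. It contrasts an ASCII-only digit parser with one based on Character.digit, which also accepts non-ASCII Unicode digits. The class and method names are hypothetical, and the fixed "YYYY-MM-DD" layout is an assumption made for the example.

```java
// Illustrative sketch only; not code from the PostgreSQL JDBC driver.
final class AsciiDateParser {

    /** Parses s[start, end) as a decimal number, accepting only '0'..'9'. */
    static int parseAsciiInt(String s, int start, int end) {
        int value = 0;
        for (int i = start; i < end; i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                throw new NumberFormatException("not an ASCII digit at index " + i);
            }
            value = value * 10 + (c - '0');
        }
        return value;
    }

    /** Same idea, but Character.digit also accepts digits from other scripts. */
    static int parseUnicodeInt(String s, int start, int end) {
        int value = 0;
        for (int i = start; i < end; i++) {
            int d = Character.digit(s.charAt(i), 10);
            if (d < 0) {
                throw new NumberFormatException("not a digit at index " + i);
            }
            value = value * 10 + d;
        }
        return value;
    }

    /** Splits an assumed "YYYY-MM-DD" string into {year, month, day}. */
    static int[] parseDate(String s) {
        if (s.length() != 10 || s.charAt(4) != '-' || s.charAt(7) != '-') {
            throw new IllegalArgumentException("expected YYYY-MM-DD: " + s);
        }
        return new int[] {
            parseAsciiInt(s, 0, 4),
            parseAsciiInt(s, 5, 7),
            parseAsciiInt(s, 8, 10)
        };
    }
}
```

The ASCII version rejects input such as Arabic-Indic digits that the Character.digit version would accept. That is the robustness trade-off in question.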

Does anyone have a benchmark they'd like me to run?  I could run
ECPerf, PolePosition, TPC-W, or Jdbcbench.


Ken


Re: tuning/profiling

From
"Kevin Grittner"
Date:
Ken,

If you're going to be running tests and benchmarks anyway, you
might want to consider replacing the synchronized collections
(Vector and Hashtable) with their unsynchronized counterparts
(ArrayList and HashMap).  While each synchronized block only
costs about one microsecond (in benchmarks I've run), there is
a potential hidden cost to them -- they may prevent some of the
more aggressive optimizations available to the JIT.
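
As a rough sketch of the swap suggested above: code written against the List and Map interfaces can switch from Vector and Hashtable to ArrayList and HashMap without changing its callers. The class and method names below are hypothetical and are not taken from the driver.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;
import java.util.Vector;

// Illustrative sketch only; not code from the PostgreSQL JDBC driver.
final class CollectionSwap {

    /** Appends 0..n-1 to whichever List implementation is passed in. */
    static List<Integer> fill(List<Integer> target, int n) {
        for (int i = 0; i < n; i++) {
            target.add(i);
        }
        return target;
    }

    /** Maps each name to its position, for any Map implementation. */
    static Map<String, Integer> index(Map<String, Integer> target, String[] names) {
        for (int i = 0; i < names.length; i++) {
            target.put(names[i], i);
        }
        return target;
    }

    public static void main(String[] args) {
        String[] columns = {"d", "n1", "n2", "n3", "n4", "i"};

        // Synchronized legacy collections.
        List<Integer> legacyList = fill(new Vector<Integer>(), 11000);
        Map<String, Integer> legacyMap = index(new Hashtable<String, Integer>(), columns);

        // Unsynchronized counterparts; callers see the same interface.
        List<Integer> plainList = fill(new ArrayList<Integer>(), 11000);
        Map<String, Integer> plainMap = index(new HashMap<String, Integer>(), columns);

        System.out.println("lists equal: " + legacyList.equals(plainList));
        System.out.println("maps equal:  " + legacyMap.equals(plainMap));
    }
}
```

One behavioural difference to check when swapping: Hashtable rejects null keys and values, while HashMap allows them.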

In case you haven't run into this issue, each thread can keep a
cached copy of any variable it uses.  It does not need to re-read
the variable from shared memory except when it references the
variable after the thread has entered any synchronized block
after the last read.  It does not need to write any modifications
to shared memory until it leaves a synchronized block.  The
profiler could show this access to shared memory as a cost
far from the synchronized block itself, and it could affect
variables that have nothing to do with the collections.
Synchronizing on any object affects all cached variables.
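
The visibility effect described above can be sketched with a small counter shared between threads. This is an illustrative example with hypothetical names, not driver code. The guarantee relied on here is the one the Java memory model gives to threads that lock the same monitor: writes made before an unlock are visible to a thread that later acquires that lock.

```java
// Illustrative sketch only; not code from the PostgreSQL JDBC driver.
final class SharedCounter {
    private int value; // guarded by the monitor of "this"

    /** Entering the lock re-reads shared state; leaving it publishes the write. */
    synchronized void increment() {
        value++;
    }

    synchronized int get() {
        return value;
    }

    /** Starts the given number of threads, each incrementing perThread times. */
    static int runWorkers(int threads, int perThread) {
        final SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }
}
```

Without the synchronized keyword on increment(), the increments could race and the final count could come out lower than threads * perThread.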

This is not hypothetical; I have seen these JIT optimizations in
production environments.

As far as I'm aware, the driver is not thread-safe anyway (and
is not required to be so by the JDBC spec) except for the
Statement.cancel() method.  One would have to pay special
attention to that.

Having said all that, you're more likely to get big gains by
addressing whatever the profiler highlighted.

-Kevin


Re: tuning/profiling

From
Dave Cramer
Date:
On 19-Oct-05, at 5:18 PM, Kevin Grittner wrote:

> Ken,
>
> If you're going to be running tests and benchmarks anyway, you
> might want to consider replacing the synchronized collections
> (Vector and Hashtable) with their unsynchronized counterparts
> (ArrayList and HashMap).  While each synchronized block only
> costs about one microsecond (in benchmarks I've run), there is
> a potential hidden cost to them -- they may prevent some of the
> more aggressive optimizations available to the JIT.
Vector and Hashtable are remnants from JDK 1.1, and we should probably
look at replacing them; however, see my comment below about thread safety.
>
> In case you haven't run into this issue, each thread can keep a
> cached copy of any variable it uses.  It does not need to re-read
> the variable from shared memory except when it references the
> variable after the thread has entered any synchronized block
> after the last read.  It does not need to write any modifications
> to shared memory until it leaves a synchronized block.  The
> profiler could show this access to shared memory as a cost
> far from where the synchronized block itself, and it could affect
> variables that have nothing to do with the collections.
> Synchronizing on any object affects all cached variables.
>
> This is not hypothetical; I have seen these JIT optimizations in
> production environments.
>
> As far as I'm aware, the driver is not thread-safe anyway (and
> is not required to be so by the JDBC spec) except for the
> Statement.cancel() method.  One would have to pay special
> attention to that.

Huh? Any web application would be sharing connections, and the
driver itself


Re: tuning/profiling

From
"Kevin Grittner"
Date:
If you're talking about connection pooling, the pool should have
synchronization around obtaining and returning connections, or
you would have serious problems.  Are you talking about any
other sharing of connections among multiple threads?
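
A minimal sketch of the pooling arrangement described above, assuming a generic resource type in place of a real java.sql.Connection. The class and method names are hypothetical. Only the hand-off in and out of the pool is synchronized, and each borrowed object is then used by one thread at a time.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only; real pools also validate, time out, and close resources.
final class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<T>(); // guarded by "this"

    SimplePool(Iterable<T> resources) {
        for (T resource : resources) {
            idle.push(resource);
        }
    }

    /** Hands out an idle resource, or null if none is available right now. */
    synchronized T tryBorrow() {
        return idle.isEmpty() ? null : idle.pop();
    }

    /** Returns a resource to the pool so another thread can borrow it. */
    synchronized void release(T resource) {
        idle.push(resource);
    }

    synchronized int idleCount() {
        return idle.size();
    }
}
```

With this shape, the objects held in the pool do not need to be thread-safe themselves, as long as callers stop using a resource once they have released it.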

-Kevin

>>> Dave Cramer <pg@fastcrypt.com>  >>>

Huh? Any web application would be sharing connections, and the
driver itself


Re: tuning/profiling

From
Dave Cramer
Date:
I'm not talking about sharing a connection between multiple threads.
However, there are sections of the driver that do share data,
specifically the metadata.

You may be right; we may not have to worry about some sections.

Dave

On 19-Oct-05, at 7:18 PM, Kevin Grittner wrote:

> If you're talking about connection pooling, the pool should have
> synchronization around obtaining and returning connections, or
> you would have serious problems.  Are you talking about any
> other sharing of connections among multiple threads?
>
> -Kevin