Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile
| From | Sergey Koposov |
|---|---|
| Subject | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile |
| Date | |
| Msg-id | alpine.LRH.2.02.1205242008390.14366@calx046.ast.cam.ac.uk |
| In reply to | Re: 9.2beta1, parallel queries, ReleasePredicateLocks, CheckForSerializableConflictIn in the oprofile (Robert Haas <robertmhaas@gmail.com>) |
| List | pgsql-hackers |
On Thu, 24 May 2012, Robert Haas wrote:

> As you can see, raw performance isn't much worse with the larger data
> sets, but scalability at high connection counts is severely degraded
> once the working set no longer fits in shared_buffers.

Actually, the problem persists even when I trim the dataset so that it fits within shared_buffers.

Here is the dump (0.5 GB in size, tested with shared_buffers=10G, work_mem=500Mb):
http://www.ast.cam.ac.uk/~koposov/files/dump.gz
I also attach the script.

For my toy dataset, the run time of a single thread goes from ~6.4 to 18 seconds (~3 times worse) in the multi-threaded run.

While running the script repeatedly on my main machine, for some reason I saw some variation in how much slower threaded execution is than a single thread. Now I see 25 seconds for the multi-threaded run vs the same ~6 seconds for a single thread.

The oprofile shows:

782355   21.5269  s_lock
  782355   100.000  s_lock [self]
-------------------------------------------------------------------------------
709801   19.5305  PinBuffer
  709801   100.000  PinBuffer [self]
-------------------------------------------------------------------------------
326457    8.9826  LWLockAcquire
  326457   100.000  LWLockAcquire [self]
-------------------------------------------------------------------------------
309437    8.5143  UnpinBuffer
  309437   100.000  UnpinBuffer [self]
-------------------------------------------------------------------------------
252972    6.9606  ReadBuffer_common
  252972   100.000  ReadBuffer_common [self]
-------------------------------------------------------------------------------
201558    5.5460  LockBuffer
  201558   100.000  LockBuffer [self]
-------------------------------------------------------------------------------

It is interesting that on another machine with much smaller shared memory (3G), less RAM (12G), fewer CPUs and PG 9.1 running, I was consistently getting ~7.2 vs 4.5 sec (multi vs single thread).

PS: Just in case it matters, the CPU on the main machine I'm testing on is a Xeon(R) CPU E7-4807 (the total number of real cores is 24).

*****************************************************
Sergey E. Koposov, PhD, Research Associate
Institute of Astronomy, University of Cambridge
Madingley road, CB3 0HA, Cambridge, UK
Tel: +44-1223-337-551 Web: http://www.ast.cam.ac.uk/~koposov/
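The timings and profile shares quoted in the message can be put side by side with a few lines of Python. This is only an illustrative sketch: the dictionary labels and variable names are made up here, the numbers are copied from the message, and the "~6" and "~6.4" second figures are taken at face value.

```python
# Timings quoted in the message, in seconds: (multi-threaded, single-threaded).
# The labels are hypothetical names chosen for this sketch.
timings = {
    "main machine, first run": (18.0, 6.4),
    "main machine, later run": (25.0, 6.0),
    "second machine, PG 9.1": (7.2, 4.5),
}

# Slowdown factor of the multi-threaded run relative to a single thread.
slowdowns = {name: multi / single for name, (multi, single) in timings.items()}

# Percentage of samples per symbol, as listed in the oprofile output.
profile = {
    "s_lock": 21.5269,
    "PinBuffer": 19.5305,
    "LWLockAcquire": 8.9826,
    "UnpinBuffer": 8.5143,
    "ReadBuffer_common": 6.9606,
    "LockBuffer": 5.5460,
}

# Combined share of the six listed symbols, and of buffer pin/unpin alone.
top_share = sum(profile.values())
pin_unpin_share = profile["PinBuffer"] + profile["UnpinBuffer"]

for name, factor in slowdowns.items():
    print(f"{name}: {factor:.1f}x slower")
print(f"listed symbols: {top_share:.1f}% of samples")
print(f"PinBuffer + UnpinBuffer: {pin_unpin_share:.1f}% of samples")
```

With these numbers the slowdown comes out at roughly 2.8x and 4.2x on the main machine versus 1.6x on the second machine, and the six listed symbols account for about 71% of all samples.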