Re: Scaling shared buffer eviction
From | Amit Kapila |
---|---|
Subject | Re: Scaling shared buffer eviction |
Date | |
Msg-id | CAA4eK1Je9ZBLHsfiavHD18GDwXUx21zFqPJgq_Dz_ZoA35nLpQ@mail.gmail.com |
In response to | Re: Scaling shared buffer eviction (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: Scaling shared buffer eviction (Andres Freund <andres@2ndquadrant.com>); Re: Scaling shared buffer eviction (Amit Kapila <amit.kapila16@gmail.com>) |
List | pgsql-hackers |
On Fri, Sep 26, 2014 at 7:04 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On another point, I think it would be a good idea to rebase the
> bgreclaimer patch over what I committed, so that we have a
> clean patch against master to test with.
Please find the rebased patch attached to this mail. I have also
collected some performance data and done some analysis based on
it.
Performance Data
----------------------------
IBM POWER-8 24 cores, 192 hardware threads
RAM = 492GB
max_connections =300
Database Locale =C
checkpoint_segments=256
checkpoint_timeout =15min
shared_buffers=8GB
scale factor = 5000
Client Count = number of concurrent sessions and threads (e.g. -c 8 -j 8)
Duration of each individual run = 5mins
The data below is the median of 3 runs.
patch_ver/client_count |     1 |      8 |     32 |     64 |    128 |    256 |
HEAD                   | 18884 | 118628 | 251093 | 216294 | 186625 | 177505 |
PATCH                  | 18743 | 122578 | 247243 | 205521 | 179712 | 175031 |
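To put the table above in relative terms, the following minimal sketch computes the percentage change of PATCH against HEAD for each client count. It is not part of the patch or of the original measurement scripts; the variable and function names are illustrative, and the numbers are copied from the table.

```python
# Median figures per client count, copied from the table above.
client_counts = [1, 8, 32, 64, 128, 256]
head = [18884, 118628, 251093, 216294, 186625, 177505]
patch = [18743, 122578, 247243, 205521, 179712, 175031]


def percent_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# Map each client count to the PATCH-vs-HEAD change in percent.
changes = {
    c: percent_change(h, p)
    for c, h, p in zip(client_counts, head, patch)
}

for c in client_counts:
    print(f"{c:>4} clients: {changes[c]:+.2f}%")
```

With these inputs, only the 8-client run comes out ahead for PATCH; the largest drop is at 64 clients, at roughly 5%.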
Here we can see that performance dips at higher client counts
(>=32), which was quite surprising to me; I was expecting it to
improve, because bgreclaimer reduces contention by making buffers
available on the free list. I analyzed the situation using perf and
found that, in the above configuration, there is contention around
the freelist spinlock with HEAD, and the patch removes it, yet
performance still goes down with the patch. On further analysis, I
observed that with the patch there is actually an increase in
contention around ProcArrayLock (a shared LWLock) via GetSnapshotData,
which sounds a bit odd, but that's what I see in the profiles. Based
on this analysis, a few ideas I would like to investigate further are:
a. As there is an increase in spinlock contention, I would like to
check with Andres's latest patch, which reduces contention around
shared LWLocks.
b. Reduce the instructions added by the patch in StrategyGetBuffer();
for example, instead of waking bgreclaimer at the low threshold, wake
it only when StrategyGetBuffer() falls back to the clock sweep.
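Idea (b) can be illustrated with a toy model. The sketch below is not PostgreSQL code and does not reflect the actual patch; the class, its fields, and the low-water value are all hypothetical. It only contrasts the two wake-up policies: checking a threshold on every allocation versus signalling the reclaimer only when the free list is empty and the caller has to fall back to the clock sweep. Refilling by the reclaimer is deliberately not modelled.

```python
from dataclasses import dataclass


@dataclass
class ToyFreeList:
    """Toy stand-in for the buffer free list (hypothetical, not PostgreSQL)."""
    free: int                  # buffers currently on the free list
    low_water: int             # hypothetical threshold for waking the reclaimer
    wakeups: int = 0           # times the reclaimer was signalled
    threshold_checks: int = 0  # extra hot-path comparisons performed

    def _take(self) -> str:
        """Take a buffer from the free list, else report a clock sweep."""
        if self.free > 0:
            self.free -= 1
            return "free list"
        return "clock sweep"

    def get_buffer_threshold(self) -> str:
        """Policy 1: compare against the low-water mark on every call."""
        self.threshold_checks += 1
        if self.free < self.low_water:
            self.wakeups += 1
        return self._take()

    def get_buffer_on_sweep(self) -> str:
        """Policy 2: signal only when falling back to the clock sweep."""
        source = self._take()
        if source == "clock sweep":
            self.wakeups += 1
        return source


# Drive both policies with the same allocation pattern.
a = ToyFreeList(free=10, low_water=4)
b = ToyFreeList(free=10, low_water=4)
for _ in range(12):
    a.get_buffer_threshold()
    b.get_buffer_on_sweep()

print(a.threshold_checks, a.wakeups)  # policy 1
print(b.threshold_checks, b.wakeups)  # policy 2
```

The trade-off this model shows is that the second policy does no extra per-allocation comparison, but it signals the reclaimer later, when the free list is already empty. Whether that is a net win on real hardware is exactly what would need measuring.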
Thoughts?
Below is the profile data for client counts 64 and 128:
Head - 64 client count
-------------------------------------
+ 8.93% postgres postgres [.] s_lock
7.83% swapper [unknown] [H] 0x00000000011bc5ac
+ 3.09% postgres postgres [.] GetSnapshotData
+ 3.06% postgres postgres [.] tas
+ 2.49% postgres postgres [.] AllocSetAlloc
+ 2.43% postgres postgres [.] hash_search_with_hash_value
+ 2.13% postgres postgres [.] _bt_compare
Detailed Data
------------------------
- 8.93% postgres postgres [.] s_lock
- s_lock
- 4.97% s_lock
- 1.63% StrategyGetBuffer
BufferAlloc
ReadBuffer_common
ReadBufferExtended
- ReadBuffer
- 1.63% ReleaseAndReadBuffer
- 0.93% index_fetch_heap
- index_getnext
- 0.93% IndexNext
ExecScanFetch
- 0.69% _bt_relandgetbuf
_bt_search
_bt_first
btgettuple
- index_getnext
- 0.69% IndexNext
ExecScanFetch
0
- 1.39% LWLockAcquireCommon
- LWLockAcquire
- 1.38% GetSnapshotData
- GetTransactionSnapshot
- 0.70% exec_bind_message
0
- 0.68% PortalStart
exec_bind_message
- 1.37% LWLockRelease
- 1.37% GetSnapshotData
- GetTransactionSnapshot
- 0.69% exec_bind_message
0
- 0.68% PortalStart
exec_bind_message
PostgresMain
0
- 1.07% StrategyGetBuffer
- 1.06% StrategyGetBuffer
BufferAlloc
ReadBuffer_common
ReadBufferExtended
- ReadBuffer
- 1.06% ReleaseAndReadBuffer
- 0.62% index_fetch_heap
index_getnext
- 0.95% LWLockAcquireCommon
- 0.95% LWLockAcquireCommon
- LWLockAcquire
- 0.90% GetSnapshotData
GetTransactionSnapshot
- 0.94% LWLockRelease
- 0.94% LWLockRelease
- 0.90% GetSnapshotData
GetTransactionSnapshot
7.83% swapper [unknown] [H] 0x00000000011bc5ac
- 3.09% postgres postgres [.] GetSnapshotData
- GetSnapshotData
- 3.06% GetSnapshotData
- 3.06% GetTransactionSnapshot
- 1.54% PortalStart
exec_bind_message
move_buffers_to_freelist_by_bgreclaimer_v1 - 64 Client count
----------------------------------------------------------------------------------------------
+ 11.52% postgres postgres [.] s_lock
7.57% swapper [unknown] [H] 0x00000000011d9034
+ 3.54% postgres postgres [.] tas
+ 3.02% postgres postgres [.] GetSnapshotData
+ 2.47% postgres postgres [.] hash_search_with_hash_value
+ 2.33% postgres postgres [.] AllocSetAlloc
+ 2.03% postgres postgres [.] _bt_compare
+ 1.89% postgres postgres [.] calc_bucket
Detailed Data
---------------------
- 11.52% postgres postgres [.] s_lock
- s_lock
- 6.57% s_lock
- 2.72% LWLockAcquireCommon
- LWLockAcquire
- 2.71% GetSnapshotData
- GetTransactionSnapshot
- 1.38% exec_bind_message
0
- 1.33% PortalStart
exec_bind_message
0
- 2.69% LWLockRelease
- 2.69% GetSnapshotData
- GetTransactionSnapshot
- 1.35% exec_bind_message
PostgresMain
0
- 1.34% PortalStart
exec_bind_message
0
- 1.65% LWLockAcquireCommon
- 1.65% LWLockAcquireCommon
- LWLockAcquire
- 1.59% GetSnapshotData
- GetTransactionSnapshot
- 0.80% exec_bind_message
PostgresMain
0
- 0.79% PortalStart
exec_bind_message
0
- 0.79% PortalStart
exec_bind_message
0
- 1.62% LWLockRelease
- 1.62% LWLockRelease
- 1.58% GetSnapshotData
- GetTransactionSnapshot
- 0.79% exec_bind_message
PostgresMain
0
- 0.79% PortalStart
exec_bind_message
PostgresMain
0
- 0.63% hash_search_with_hash_value
- 0.63% hash_search_with_hash_value
BufTableDelete
BufferAlloc
- 0.59% get_hash_entry
- 0.59% get_hash_entry
hash_search_with_hash_value
BufTableInsert
BufferAlloc
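As a rough cross-check of the two 64-client profiles above, the sketch below sums the s_lock call-chain percentages by the caller that leads into them. The figures are read off the detailed call graphs; the grouping into GetSnapshotData and StrategyGetBuffer chains and all names in the code are illustrative, and the totals are only approximate because perf lists the chains separately.

```python
# Percent-of-samples figures read off the s_lock call graphs above
# (64 clients). Each entry is one call chain that ends in s_lock.
head_64 = {
    "GetSnapshotData": [1.38, 1.37, 0.90, 0.90],
    "StrategyGetBuffer": [1.63, 1.06],
}
patch_64 = {
    "GetSnapshotData": [2.71, 2.69, 1.59, 1.58],
    "StrategyGetBuffer": [],  # no such chain is listed in the patched profile
}


def total(profile, caller):
    """Sum the listed chain percentages for one caller."""
    return round(sum(profile[caller]), 2)


for caller in ("GetSnapshotData", "StrategyGetBuffer"):
    print(f"{caller}: HEAD {total(head_64, caller):.2f}%, "
          f"PATCH {total(patch_64, caller):.2f}%")
```

Read this way, the s_lock time attributed to StrategyGetBuffer disappears with the patch, while the share reached through GetSnapshotData (ProcArrayLock acquire and release) nearly doubles, which matches the observation in the analysis.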
Head - 128 Client count
---------------------------------------
+ 18.39% postgres postgres [.] s_lock
6.72% swapper [unknown] [H] 0x00000000011bc390
+ 3.37% postgres postgres [.] GetSnapshotData
+ 2.11% postgres postgres [.] tas
+ 2.05% postgres postgres [.] tas
+ 1.82% postgres postgres [.] hash_search_with_hash_value
+ 1.77% postgres postgres [.] AllocSetAlloc
1.52% postgres [unknown] [H] 0x00000000012fdc00
+ 1.45% postgres postgres [.] tas
+ 1.42% postgres postgres [.] _bt_compare
- 18.39% postgres postgres [.] s_lock
- s_lock
- 12.35% s_lock
- 7.52% StrategyGetBuffer
BufferAlloc
ReadBuffer_common
- 1.86% LWLockAcquireCommon
- LWLockAcquire
- 1.83% GetSnapshotData
- GetTransactionSnapshot
- 0.95% exec_bind_message
0
- 0.88% PortalStart
exec_bind_message
0
- 1.78% LWLockRelease
- 1.76% GetSnapshotData
- GetTransactionSnapshot
- 0.91% exec_bind_message
- 0.86% PortalStart
exec_bind_message
0
- 0.60% get_hash_entry
hash_search_with_hash_value
BufTableInsert
BufferAlloc
- 0.58% hash_search_with_hash_value
- 0.58% BufTableDelete
BufferAlloc
- 3.18% StrategyGetBuffer
- 3.18% StrategyGetBuffer
BufferAlloc
0
- 0.88% LWLockAcquireCommon
- 0.87% LWLockAcquireCommon
- LWLockAcquire
- 0.81% GetSnapshotData
GetTransactionSnapshot
- 0.84% LWLockRelease
- 0.83% LWLockRelease
- 0.79% GetSnapshotData
GetTransactionSnapshot
- 0.55% hash_search_with_hash_value
- 0.55% hash_search_with_hash_value
BufTableDelete
BufferAlloc
ReadBuffer_common
ReadBufferExtended
- ReadBuffer
0.55% ReleaseAndReadBuffer
- 0.54% get_hash_entry
- 0.54% get_hash_entry
hash_search_with_hash_value
BufTableInsert
BufferAlloc
ReadBuffer_common
ReadBufferExtended
- ReadBuffer
0.54% ReleaseAndReadBuffer
move_buffers_to_freelist_by_bgreclaimer_v1 - 128 Client count
----------------------------------------------------------------------------------------------
+ 13.64% postgres postgres [.] s_lock
8.19% swapper [unknown] [H] 0x0000000000000c04
+ 3.62% postgres postgres [.] GetSnapshotData
+ 2.40% postgres postgres [.] calc_bucket
+ 2.38% postgres postgres [.] tas
+ 2.38% postgres postgres [.] hash_search_with_hash_value
2.02% postgres [unknown] [H] 0x0000000000000f80
+ 1.73% postgres postgres [.] AllocSetAlloc
+ 1.68% postgres postgres [.] tas
Detailed Data
-----------------------
- 13.64% postgres postgres [.] s_lock
- s_lock
- 8.76% s_lock
- 3.03% LWLockAcquireCommon
- LWLockAcquire
- 2.97% GetSnapshotData
- GetTransactionSnapshot
- 1.55% exec_bind_message
0
- 1.42% PortalStart
exec_bind_message
- 2.87% LWLockRelease
- 2.82% GetSnapshotData
- GetTransactionSnapshot
- 1.46% exec_bind_message
0
- 1.36% PortalStart
exec_bind_message
0
- 1.35% get_hash_entry
hash_search_with_hash_value
BufTableInsert
BufferAlloc
0
- 1.29% hash_search_with_hash_value
- 1.29% BufTableDelete
BufferAlloc
0
- 1.19% LWLockAcquireCommon
- 1.19% LWLockAcquireCommon
- LWLockAcquire
- 1.11% GetSnapshotData
- GetTransactionSnapshot
- 0.56% exec_bind_message
PostgresMain
0
- 0.55% PortalStart
exec_bind_message
0
- 1.15% LWLockRelease
- 1.15% LWLockRelease
- 1.08% GetSnapshotData
- GetTransactionSnapshot
- 0.55% exec_bind_message
0
- 0.53% PortalStart
exec_bind_message
0
- 1.12% hash_search_with_hash_value
- 1.12% hash_search_with_hash_value
BufTableDelete
BufferAlloc
0
- 1.10% get_hash_entry
- 1.10% get_hash_entry
hash_search_with_hash_value
BufTableInsert
BufferAlloc
0