Обсуждение: optimize atomic exchanges

Поиск
Список
Период
Сортировка

optimize atomic exchanges

От
Nathan Bossart
Дата:
On Fri, Nov 10, 2023 at 08:55:29PM -0600, Nathan Bossart wrote:
> On Fri, Nov 10, 2023 at 06:48:39PM -0800, Andres Freund wrote:
>> Yes. We should optimize pg_atomic_exchange_u32() one of these days - it can be
>> done *far* faster than a cmpxchg. When I was adding the atomic abstraction
>> there was concern with utilizing too many different atomic instructions. I
>> didn't really agree back then, but these days I really don't see a reason to
>> not use a few more intrinsics.
> 
> I might give this a try, if for no other reason than it'd force me to
> improve my mental model of this stuff.  :)

Here's a first draft.  I haven't attempted to add implementations for
PowerPC, and I only added the __atomic version for gcc since
__sync_lock_test_and_set() only supports setting the value to 1 on some
platforms.  Otherwise, I tried to add specialized atomic exchange
implementations wherever there existed other specialized atomic
implementations.

I haven't done any sort of performance testing on this yet.  Some
preliminary web searches suggest that there is unlikely to be much
difference between cmpxchg and xchg, but presumably there's some difference
between xchg and doing cmpxchg in a while loop (as is done in
atomics/generic.h today).  I'll report back once I've had a chance to do
some testing...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Вложения

Re: optimize atomic exchanges

От
Nathan Bossart
Дата:
On Wed, Nov 29, 2023 at 03:29:05PM -0600, Nathan Bossart wrote:
> I haven't done any sort of performance testing on this yet.  Some
> preliminary web searches suggest that there is unlikely to be much
> difference between cmpxchg and xchg, but presumably there's some difference
> between xchg and doing cmpxchg in a while loop (as is done in
> atomics/generic.h today).  I'll report back once I've had a chance to do
> some testing...

Some rudimentary tests show a >40% speedup with this patch on x86_64.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: optimize atomic exchanges

От
Andres Freund
Дата:
Hi,

On 2023-11-30 21:18:15 -0600, Nathan Bossart wrote:
> On Wed, Nov 29, 2023 at 03:29:05PM -0600, Nathan Bossart wrote:
> > I haven't done any sort of performance testing on this yet.  Some
> > preliminary web searches suggest that there is unlikely to be much
> > difference between cmpxchg and xchg, but presumably there's some difference
> > between xchg and doing cmpxchg in a while loop (as is done in
> > atomics/generic.h today).  I'll report back once I've had a chance to do
> > some testing...
> 
> Some rudimentary tests show a >40% speedup with this patch on x86_64.

On bigger machines, with contention, the wins are likely much higher. I see
two orders of magnitude higher throughput in a test program that I had around,
on a two socket cascade lake machine.  Of course it's also much less
powerfull...

Greetings,

Andres Freund



Re: optimize atomic exchanges

От
Nathan Bossart
Дата:
On Thu, Nov 30, 2023 at 07:56:27PM -0800, Andres Freund wrote:
> On 2023-11-30 21:18:15 -0600, Nathan Bossart wrote:
>> Some rudimentary tests show a >40% speedup with this patch on x86_64.
> 
> On bigger machines, with contention, the wins are likely much higher. I see
> two orders of magnitude higher throughput in a test program that I had around,
> on a two socket cascade lake machine.  Of course it's also much less
> powerfull...

Nice.  Thanks for trying it out.

One thing on my mind is whether we should bother with the inline assembly
versions.  It looks like gcc has had __atomic since 4.7.0 (2012), so I'm
not sure we gain much from them.  OTOH they are pretty simple and seem
unlikely to cause too much trouble.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: optimize atomic exchanges

От
Nathan Bossart
Дата:
On Thu, Nov 30, 2023 at 10:35:22PM -0600, Nathan Bossart wrote:
> One thing on my mind is whether we should bother with the inline assembly
> versions.  It looks like gcc has had __atomic since 4.7.0 (2012), so I'm
> not sure we gain much from them.  OTOH they are pretty simple and seem
> unlikely to cause too much trouble.

Barring objections or additional feedback, I think I'm inclined to press
forward with this one and commit it in the next week or two.  I'm currently
planning to keep the inline assembly, but I'm considering removing the
configuration checks for __atomic_exchange_n() if the availability of
__atomic_compare_exchange_n() seems like a reliable indicator of its
presence.  Thoughts?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: optimize atomic exchanges

От
Nathan Bossart
Дата:
On Mon, Dec 04, 2023 at 12:18:05PM -0600, Nathan Bossart wrote:
> Barring objections or additional feedback, I think I'm inclined to press
> forward with this one and commit it in the next week or two.  I'm currently
> planning to keep the inline assembly, but I'm considering removing the
> configuration checks for __atomic_exchange_n() if the availability of
> __atomic_compare_exchange_n() seems like a reliable indicator of its
> presence.  Thoughts?

Concretely, like this.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Вложения

Re: optimize atomic exchanges

От
Andres Freund
Дата:
Hi,

On 2023-12-04 15:08:57 -0600, Nathan Bossart wrote:
> On Mon, Dec 04, 2023 at 12:18:05PM -0600, Nathan Bossart wrote:
> > Barring objections or additional feedback, I think I'm inclined to press
> > forward with this one and commit it in the next week or two.  I'm currently
> > planning to keep the inline assembly, but I'm considering removing the
> > configuration checks for __atomic_exchange_n() if the availability of
> > __atomic_compare_exchange_n() seems like a reliable indicator of its
> > presence.  Thoughts?

I don't think we need the inline asm. Otherwise looks good.

Greetings,

Andres Freund



Re: optimize atomic exchanges

От
Nathan Bossart
Дата:
On Fri, Dec 15, 2023 at 04:56:27AM -0800, Andres Freund wrote:
> I don't think we need the inline asm. Otherwise looks good.

Committed with that change.  Thanks for reviewing!  I am going to watch the
buildfarm especially closely for this one.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com