Discussion: [PATCH] Add pg_lfind8_nonzero()
Hi hackers,
I'd like to add pg_lfind8_nonzero() to optimize some code like this:
```
for (i = 0; i < numberOfAttributes; i++)
{
    if (isnull[i])
    {
        hasnull = true;
        break;
    }
}
```
With pg_lfind8_nonzero(), we can write the code like this:
```
if (likely(numberOfAttributes > 0))
    hasnull = pg_lfind8_nonzero((uint8 *) isnull, numberOfAttributes);
```
pg_lfind8_nonzero() is faster because it can check 8 bool values at a time.
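To make the "8 bool values at a time" idea concrete, here is a minimal sketch of a word-at-a-time nonzero check. It is not the code from the patch: the name `sketch_lfind8_nonzero` is hypothetical, it uses plain C types instead of PostgreSQL's `uint8`, and the patch may well use a different approach (for example, SIMD helpers).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the proposed pg_lfind8_nonzero(); this is an
 * illustration only, not the implementation from the patch.
 *
 * Returns true if any of the first nelem bytes of base is nonzero.
 */
static bool
sketch_lfind8_nonzero(const uint8_t *base, size_t nelem)
{
    size_t      i = 0;

    /* Check 8 bytes per iteration by loading them into one 64-bit word. */
    for (; i + sizeof(uint64_t) <= nelem; i += sizeof(uint64_t))
    {
        uint64_t    chunk;

        /* memcpy sidesteps alignment and strict-aliasing concerns. */
        memcpy(&chunk, base + i, sizeof(chunk));
        if (chunk != 0)
            return true;
    }

    /* Check the remaining tail (fewer than 8 bytes) one byte at a time. */
    for (; i < nelem; i++)
    {
        if (base[i] != 0)
            return true;
    }

    return false;
}
```

In this sketch, an array shorter than 8 bytes is handled entirely by the byte-wise tail loop, so no speedup over the plain loop should be expected for such inputs.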
v1-0001 adds pg_lfind8_nonzero().
v1-0002 uses pg_lfind8_nonzero() in some places.
v1-0003 adds an extension "test_patch" for testing only; the test looks like this:
```
for (i = 0; i < 1000; i++)
{
    for (int j = 0; j < natts; j++)
    {
        if (isnull[j])
        {
            hasnull = true;
            break;
        }
    }
}
=======================
for (i = 0; i < 1000; i++)
{
    if (likely(natts > 0))
        hasnull = pg_lfind8_nonzero((uint8 *) isnull, natts);
}
```
create extension test_patch;
# 4 is the natts
select test_head(4);
select test_patch(4);
Test result:
natts: 4 head: 1984ns patch: 2094ns
natts: 8 head: 3196ns patch: 641ns
natts: 16 head: 4589ns patch: 752ns
natts: 32 head: 8036ns patch: 1152ns
natts: 64 head: 19367ns patch: 2455ns
natts: 128 head: 33445ns patch: 4018ns
Looking forward to your comments!
--
Regards,
ChangAo Chen
Hi,
On 2025-12-14 21:33:00 +0800, cca5507 wrote:
> Hi hackers,
>
> I'd like to add pg_lfind8_nonzero() to optimize some code like this:
>
> ```
> for (i = 0; i < numberOfAttributes; i++)
> {
>     if (isnull[i])
>     {
>         hasnull = true;
>         break;
>     }
> }
> ```
Is code like this ever actually relevant for performance? In cases where the
compiler can't also optimize the code? Unless it is a legitimate bottleneck,
adding more complicated code to optimize the case doesn't make sense.
> create extension test_patch;
> # 4 is the natts
> select test_head(4);
> select test_patch(4);
>
> Test result:
> natts: 4 head: 1984ns patch: 2094ns
> natts: 8 head: 3196ns patch: 641ns
> natts: 16 head: 4589ns patch: 752ns
> natts: 32 head: 8036ns patch: 1152ns
> natts: 64 head: 19367ns patch: 2455ns
> natts: 128 head: 33445ns patch: 4018ns
>
> Looking forward to your comments!
>
The benchmark really should show performance benefits in at least somewhat
realistic cases (i.e. real users of the functions, even if the workload is a
bit more extreme than it'd be most of the time).
Greetings,
Andres
Hi,
Thank you for your reply!
> Is code like this ever actually relevant for performance? In cases where the
> compiler can't also optimize the code? Unless it is a legitimate bottleneck,
> adding more complicated code to optimize the case doesn't make sense.
I just think the patch isn't that complicated.
> The benchmark really should show performance benefits in at least somewhat
> realistic cases (i.e. real users of the functions, even if the workload is a
> bit more extreme than it'd be most of the time).
Yeah, this makes sense to me. Maybe I need to think more on it.
--
Regards,
ChangAo Chen