Re: Change GUC hashtable to use simplehash?

From John Naylor
Subject Re: Change GUC hashtable to use simplehash?
Date
Msg-id CANWCAZYP_BW2cCJkFYppVgC=4y7PJDZ7zc7SNWXyEuYhc0YMmA@mail.gmail.com
In response to Re: Change GUC hashtable to use simplehash?  (John Naylor <johncnaylorls@gmail.com>)
Responses Re: Change GUC hashtable to use simplehash?  (jian he <jian.universality@gmail.com>)
List pgsql-hackers
On Wed, Dec 20, 2023 at 1:48 PM John Naylor <johncnaylorls@gmail.com> wrote:
>
> On Wed, Dec 20, 2023 at 3:23 AM Jeff Davis <pgsql@j-davis.com> wrote:
> >
> > The reason I looked here is that the inner while statement (to find the
> > chunk size) looked out of place and possibly slow, and there's a
> > bitwise trick we can use instead.
>
> There are other bit tricks we can use. In v11-0005, just for fun, I
> translated a couple more into C from
>
> https://github.com/openbsd/src/blob/master/lib/libc/arch/amd64/string/strlen.S

I wanted to see if this gets us anything, so I ran a couple of microbenchmarks.

0001-0003 are the same as earlier.
0004 takes Jeff's idea and adds in an optimization from NetBSD's
strlen (I said OpenBSD earlier, but it goes back further). I added
stub code to simulate big-endian when requested at compile time, but a
later patch removes it. Since it benched well, I made the extra effort
to generalize it for other callers. After adding to the hash state, it
returns the length so the caller can pass it to the finalizer.
0005 is the benchmark (not for commit) -- I took the parser keyword
list and added enough padding to make every string aligned when the
whole thing is copied to an alloc'd area.
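
To illustrate the word-at-a-time lookahead described for 0004, here is a minimal sketch, not the code from the patch. The names haszero64 and aligned_cstr_len are hypothetical, and the sketch only finds the length; the patch also folds each word into the hash state along the way. It assumes the string starts on an 8-byte boundary, so each 8-byte read stays within one aligned word (it may still read bytes past the terminator, which tools such as sanitizers can object to).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if and only if at least one byte of v is zero. */
static inline uint64_t
haszero64(uint64_t v)
{
    return (v - UINT64_C(0x0101010101010101)) & ~v &
        UINT64_C(0x8080808080808080);
}

/*
 * Hypothetical helper: length of a NUL-terminated string that starts on
 * an 8-byte boundary. Whole words are skipped until one contains a zero
 * byte; the exact position is then found bytewise within that word.
 */
static size_t
aligned_cstr_len(const char *str)
{
    const char *p = str;
    uint64_t    chunk;

    /* Precondition of this sketch: aligned start. */
    assert(((uintptr_t) str & 7) == 0);

    for (;;)
    {
        memcpy(&chunk, p, sizeof(chunk));
        if (haszero64(chunk))
            break;              /* terminator is somewhere in this word */
        /* A real hash function would mix "chunk" into its state here. */
        p += sizeof(chunk);
    }

    /* At most 7 bytewise steps remain. */
    while (*p != '\0')
        p++;

    return (size_t) (p - str);
}
```

A caller would pass the returned length on to the hash finalizer, as described above.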

Each of the bench_*.sql files named below just runs the
similarly named function, all with the same argument, e.g. "select *
from bench_pgstat_hash_fh(100000);", so they are not attached.

Strings:

-- strlen + hash_bytes
pgbench -n -T 20 -f bench_hash_bytes.sql -M prepared | grep latency
latency average = 1036.732 ms

-- word-at-a-time hashing, with bytewise lookahead
pgbench -n -T 20 -f bench_cstr_unaligned.sql -M prepared | grep latency
latency average = 664.632 ms

-- word-at-a-time for both hashing and lookahead (Jeff's aligned
-- coding plus a technique from NetBSD strlen)
pgbench -n -T 20 -f bench_cstr_aligned.sql -M prepared | grep latency
latency average = 436.701 ms

So, the fully optimized aligned case is worth it if it's convenient.

0006 adds a byteswap for big-endian so we can reuse little endian
coding for the lookahead.
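
As a rough sketch of why a byteswap helps here (my reading of the idea, not the patch itself): the zero-byte mask is only exact at its lowest set bit, because borrows can produce false positives in more significant bytes. If the first byte in memory is the least significant byte of the word, a count-trailing-zeros on the mask gives the right position. The names load_word_le and first_zero_byte are hypothetical, and the builtins assume GCC or Clang.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: load 8 bytes so that the first byte in memory
 * ends up as the least significant byte, regardless of host byte order.
 */
static inline uint64_t
load_word_le(const void *p)
{
    uint64_t    w;

    memcpy(&w, p, sizeof(w));
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    w = __builtin_bswap64(w);   /* reuse the little-endian logic below */
#endif
    return w;
}

/*
 * Index (0..7, in memory order) of the first zero byte in the 8 bytes
 * at p, or 8 if there is none.
 */
static inline int
first_zero_byte(const void *p)
{
    uint64_t    w = load_word_le(p);
    uint64_t    mask = (w - UINT64_C(0x0101010101010101)) & ~w &
        UINT64_C(0x8080808080808080);

    if (mask == 0)
        return 8;

    /* The lowest set bit is bit 7 of the first zero byte. */
    return __builtin_ctzll(mask) / 8;
}
```

The byte sequence 0x01 0x00 0x01 shows the false positive: the borrow from the zero byte also sets the mask bit for the following 0x01 byte, which is harmless only because the lowest set bit is the one consulted.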

0007 - I also wanted to put numbers to 0003 (pgstat hash). While the
motivation for that was cleanup, I had a hunch it would shave cycles
and take up less binary space. It does on both counts:

-- 3x murmur + hash_combine
pgbench -n -T 20 -f bench_pgstat_orig.sql -M prepared | grep latency
latency average = 333.540 ms

-- fasthash32 (simple call, no state setup or finalizer needed for a single value)
pgbench -n -T 20 -f bench_pgstat_fh.sql -M prepared | grep latency
latency average = 277.591 ms

0008 - We can optimize the tail load when it's 4 bytes, saving
loads, shifts, and ORs. My compiler can't figure this out for the
pgstat hash, with its fixed 4-byte tail. It's pretty simple and should
help other cases:

pgbench -n -T 20 -f bench_pgstat_fh.sql -M prepared | grep latency
latency average = 226.113 ms
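
For illustration only, the 4-byte tail idea might look roughly like the sketch below. This is not the code in 0008; load_tail_bytewise and load_tail are hypothetical names, and the byte packing order is an assumption chosen so that both paths agree. The generic path needs one load, shift, and OR per byte, while the 4-byte path is a single 32-bit load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic tail: one load, one shift, and one OR per remaining byte. */
static uint64_t
load_tail_bytewise(const unsigned char *k, size_t len)
{
    uint64_t    v = 0;

    assert(len < 8);
    for (size_t i = 0; i < len; i++)
        v |= (uint64_t) k[i] << (8 * i);
    return v;
}

/* Same result, but a 4-byte tail is fetched with a single 32-bit load. */
static uint64_t
load_tail(const unsigned char *k, size_t len)
{
    if (len == 4)
    {
        uint32_t    lower_four;

        memcpy(&lower_four, k, sizeof(lower_four));
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        /* Keep the value identical to the bytewise path (GCC/Clang). */
        lower_four = __builtin_bswap32(lower_four);
#endif
        return lower_four;
    }
    return load_tail_bytewise(k, len);
}
```

With a fixed-size memcpy, compilers typically emit a plain 32-bit load, which is the saving over the per-byte sequence.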
