Re: Change GUC hashtable to use simplehash?

Поиск
Список
Период
Сортировка
От John Naylor
Тема Re: Change GUC hashtable to use simplehash?
Дата
Msg-id CANWCAZYGzFrLHWoWUQ3-d4WfjECXa=02ya_htGbfz_M_qNtA=w@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Change GUC hashtable to use simplehash?  (Jeff Davis <pgsql@j-davis.com>)
Ответы Re: Change GUC hashtable to use simplehash?  (John Naylor <johncnaylorls@gmail.com>)
Re: Change GUC hashtable to use simplehash?  (John Naylor <johncnaylorls@gmail.com>)
Список pgsql-hackers
On Sun, Dec 10, 2023 at 2:18 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Sat, 2023-12-09 at 18:52 +0700, John Naylor wrote:
> > > I tested using the new hash function APIs for my search path cache,
> > > and
> > > there's a significant speedup for cases not benefiting from
> > > a86c61c9ee.
> > > It's enough that we almost don't need a86c61c9ee. So a definite +1
> > > to
> > > the new APIs.
> >
> > Do you have a new test?
>
> Still using the same basic test here:
>
> https://www.postgresql.org/message-id/04c8592dbd694e4114a3ed87139a7a04e4363030.camel%40j-davis.com
>
> What I did is:
>
>    a. add your v5 patches
>    b. disable optimization in a86c61c9ee
>    c. add attached patch to use new hash APIs
>
> I got a slowdown between (a) and (b), and then (c) closed the gap about
> halfway. It started to get close to test noise at that point -- I could
> get some better numbers out of it if it's helpful.

I tried my variant of the same test [1] (but only 20 seconds per run),
which uses pgbench to take the average of a few dozen runs, and
doesn't use table I/O (when doing that, it's best to pre-warm the
buffers to reduce noise).

pgbench -n -T 20 -f bench.sql -M prepared
(done three times and take the median, with turbo off)

* master at 457428d9e99b6b from Dec 4:
latency average = 571.413 ms

* v8 (bytewise hash):
latency average = 588.942 ms

This regression is a bit surprising, since there is no strlen call,
and it uses roleid as a seed without a round of mixing (not sure if we
should do that, but just trying to verify results).

* v8 with chunked interface:
latency average = 555.688 ms

This starts to improve things for me.

* v8 with chunked, and return lower 32 bits of full 64-bit hash:
latency average = 556.324 ms

This is within the noise level. There doesn't seem to be much downside
of using a couple cycles for fasthash's 32-bit reduction.

* revert back to master from Dec 4 and then cherry pick a86c61c9ee
(save last entry of SearchPathCache)
latency average = 545.747 ms

So chunked incremental hashing gets within ~2% of that, which is nice.
It seems we should use that when removing strlen, when convenient.

Updated next steps:
* Investigate whether/how to incorporate final length into the
calculation when we don't have the length up front.
* Add some desperately needed explanatory comments.
* Use this in some existing cases where it makes sense.
* Get back to GUC hash and dynahash.

[1] https://www.postgresql.org/message-id/CANWCAZY7Cr-GdUhrCLoR4%2BJGLChTb0pQxq9ZPi1KTLs%2B_KDFqg%40mail.gmail.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Dilip Kumar
Дата:
Сообщение: Re: Adding facility for injection points (or probe points?) for more advanced tests
Следующее
От: "Drouvot, Bertrand"
Дата:
Сообщение: Add isCatalogRel in rmgrdesc