Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?

Поиск
Список
Период
Сортировка
От Erik Aronesty
Тема Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?
Дата
Msg-id CAJowKg+BbnR4Wrhse__9RzJaaaYqCUHSjA9eqK3XR7ECub_SRw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?  (Erwin Brandstetter <brsaweda@gmail.com>)
Список pgsql-general
You can always tweak fnv for whatever bite-size or bit size you want.   Sometimes I know a little information about my data shape and make a custom fnv that only looks at the first half for the last half of a string, etc.

On Wed, Dec 11, 2019, 1:02 PM Erwin Brandstetter <brsaweda@gmail.com> wrote:
Thanks for the suggestion. Seems like a good assumption and I have been using hashtext() in the past. But I am uncertain whether it is the best option.

Guess Tom's warning in https://www.postgresql.org/message-id/9434.1568839177@sss.pgh.pa.us about portability only refers to hashtextextended() and friends not being there in Postgres 10 or older.

But why are none of these functions documented? Does the project still not ...

> want people to rely on them continuing to have exactly the current behavior.

I am not complaining, maybe just nobody did the work. But it's also mentioned in this old thread, that hastext() changed in the past. Is all of that outdated and we are welcome to use those functions for indexing?

Filtering with amprocnum = 2 gets functions producing bigint in Postgres 11 or later.  Not sure about the exact meaning of amprocnum, manual says "Support function number".

Remaining problem either way: no hash function returning bigint for Postgres 10.

Regards
Erwin

On Tue, Dec 10, 2019 at 11:13 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2019-12-10 at 22:11 +0100, Erwin Brandstetter wrote:
> I am looking for stable hash functions producing 8-byte or 4-byte hashes from long text values in Postgres 10 or later.
>
> [...]
>
> There is an old post from 2012 by Tom Lane suggesting that hashtext() and friends are not for users:
>
> https://www.postgresql.org/message-id/24463.1329854466%40sss.pgh.pa.us

Changing a hash function would corrupt hash indexes, wouldn't it?

So I'd expect these functions to be pretty stable:

SELECT amp.amproc
FROM pg_amproc AS amp
   JOIN pg_opfamily AS opf ON amp.amprocfamily = opf.oid
   JOIN pg_am ON opf.opfmethod = pg_am.oid
WHERE pg_am.amname = 'hash'
  AND amp.amprocnum = 1;

Or at least there would have to be a fat warning in the release notes
to reindex hash indexes.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

В списке pgsql-general по дате отправления:

Предыдущее
От: George Neuner
Дата:
Сообщение: Re: Fast, stable, portable hash function producing 4-byte or 8-byte values?
Следующее
От: Ron
Дата:
Сообщение: Re: Is there any tool that provides Physical Backup plus PITR for asingle database ( Not the whole PG instance ) ?