Re: Do we want a hashset type?

From jian he
Subject Re: Do we want a hashset type?
Msg-id CACJufxE=XCn950YfxDhY_0cu=15znnYejMJ5_EpCLzh1OJqbTw@mail.gmail.com
In response to Re: Do we want a hashset type?  ("Joel Jacobson" <joel@compiler.org>)
Responses Re: Do we want a hashset type?  (jian he <jian.universality@gmail.com>)
Re: Do we want a hashset type?  ("Joel Jacobson" <joel@compiler.org>)
List pgsql-hackers


On Thu, Jun 15, 2023 at 5:04 AM Joel Jacobson <joel@compiler.org> wrote:
On Wed, Jun 14, 2023, at 15:16, Tomas Vondra wrote:
> On 6/14/23 14:57, Joel Jacobson wrote:
>> Would it be feasible to teach the planner to utilize the internal hash table of
>> hashset directly? In the case of arrays, the hash table construction is an
...
> It's definitely something I'd leave out of v0, personally.

OK, thanks for guidance, I'll stay away from it.

I've been doing some preparatory work on this todo item:

> 3) support for other types (now it only works with int32)

I've renamed the type from "hashset" to "int4hashset",
and the SQL-functions are now prefixed with "int4"
when necessary. The overloaded functions with
int4hashset as input parameters don't need to be prefixed,
e.g. hashset_add(int4hashset, int).

Other changes since last update (4e60615):

* Support creation of empty hashset using '{}'::hashset
* Introduced a new function hashset_capacity() to return the current capacity
  of a hashset.
* Refactored hashset initialization:
  - Replaced hashset_init(int) with int4hashset() to initialize an empty hashset
    with zero capacity.
  - Added int4hashset_with_capacity(int) to initialize a hashset with
    a specified capacity.
* Improved README.md and testing

As a next step, I'm planning on adding int8 support.

Looks and sounds good?

/Joel

I am not sure the following results are correct: in the first record, x
already contains 11111 while hashset_count still reports 10.
with cte as (
    select hashset(x) as x
            ,hashset_capacity(hashset(x))
            ,hashset_count(hashset(x))
    from generate_series(1,10) g(x))
select *
        ,'|' as delim
        , hashset_add(x,11111::int)
        ,hashset_capacity(hashset_add(x,11111::int))
        ,hashset_count(hashset_add(x,11111::int))
from    cte \gx


results:  
-[ RECORD 1 ]----+-----------------------------
x                | {8,1,10,3,9,4,6,2,11111,5,7}
hashset_capacity | 64
hashset_count    | 10
delim            | |
hashset_add      | {8,1,10,3,9,4,6,2,11111,5,7}
hashset_capacity | 64
hashset_count    | 11


but:
with cte as (select '{1,2}'::int4hashset as x)
select x, hashset_add(x, 3::int) from cte;

returns
   x   | hashset_add
-------+-------------
 {1,2} | {3,1,2}
(1 row)

The result of the last, simpler query seems more sensible to me.
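
One possible reading of the first result, which this message does not confirm, is that hashset_add changes its argument in place rather than returning a modified copy. The sketch below is only a toy Python model of that difference; ToyHashset, add_in_place and add_copy are hypothetical names and are not part of the extension.

```python
# Toy model only: a Python stand-in for the hashset, not the extension's C code.
class ToyHashset:
    def __init__(self, items=()):
        # Insertion order stands in for the hash table's bucket order.
        self.items = list(items)

    def count(self):
        return len(self.items)


def add_in_place(s, value):
    """Hypothetical behaviour: mutate the argument and return the same object."""
    if value not in s.items:
        s.items.append(value)
    return s


def add_copy(s, value):
    """Hypothetical alternative: leave the argument untouched, return a new set."""
    result = ToyHashset(s.items)
    if value not in result.items:
        result.items.append(value)
    return result


# In-place variant: the count is taken before the add, x is displayed after it.
x = ToyHashset(range(1, 11))
count_before = x.count()
added = add_in_place(x, 11111)
print(count_before, 11111 in x.items, added.count())    # 10 True 11

# Copying variant: x keeps its original contents.
y = ToyHashset(range(1, 11))
added_copy = add_copy(y, 11111)
print(y.count(), 11111 in y.items, added_copy.count())  # 10 False 11
```

The in-place variant reproduces the pattern in the first record (a count of 10 next to a displayed set that already holds 11111), while the copying variant matches the second query, where x stays {1,2}.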

