Re: Do we want a hashset type?

From: Tomas Vondra
Subject: Re: Do we want a hashset type?
Date:
Msg-id: 257197c1-4d28-a8c1-c311-510cbc862e03@enterprisedb.com
In response to: Re: Do we want a hashset type?  ("Joel Jacobson" <joel@compiler.org>)
Responses: Re: Do we want a hashset type?  ("Joel Jacobson" <joel@compiler.org>)
List: pgsql-hackers

On 6/1/23 09:14, Joel Jacobson wrote:
> On Thu, Jun 1, 2023, at 09:02, Joel Jacobson wrote:
>> Here is an example using a real anonymised social network.
> 
> I realised the "found" column is not necessary in this particular example,
> since we only care about the friends at the exact depth level. Simplified query:
> 
> CREATE OR REPLACE VIEW friends_of_friends AS
> WITH RECURSIVE friends_of_friends AS (
>     SELECT 
>         ARRAY[5867::bigint] AS current,
>         0 AS depth
>     UNION ALL
>     SELECT
>         new_current,
>         friends_of_friends.depth + 1
>     FROM
>         friends_of_friends
>     CROSS JOIN LATERAL (
>         SELECT
>             array_agg(DISTINCT edges.to_node) AS new_current
>         FROM
>             edges
>         WHERE
>             from_node = ANY(friends_of_friends.current)
>     ) q
>     WHERE
>         friends_of_friends.depth < 3
> )
> SELECT
>     depth,
>     coalesce(array_length(current, 1), 0)
> FROM
>     friends_of_friends
> WHERE
>     depth = 3;
> ;
> 
> -- PostgreSQL 15.2:
> 
> EXPLAIN ANALYZE SELECT * FROM friends_of_friends;
>                                                                             QUERY PLAN
>
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  CTE Scan on friends_of_friends  (cost=2687.88..2688.58 rows=1 width=8) (actual time=2076.362..2076.454 rows=1 loops=1)
>    Filter: (depth = 3)
>    Rows Removed by Filter: 3
>    CTE friends_of_friends
>      ->  Recursive Union  (cost=0.00..2687.88 rows=31 width=36) (actual time=0.008..2075.073 rows=4 loops=1)
>            ->  Result  (cost=0.00..0.01 rows=1 width=36) (actual time=0.002..0.002 rows=1 loops=1)
>            ->  Subquery Scan on "*SELECT* 2"  (cost=89.44..268.75 rows=3 width=36) (actual time=518.613..518.622 rows=1 loops=4)
>                  ->  Nested Loop  (cost=89.44..268.64 rows=3 width=36) (actual time=515.523..515.523 rows=1 loops=4)
>                        ->  WorkTable Scan on friends_of_friends friends_of_friends_1  (cost=0.00..0.22 rows=3 width=36) (actual time=0.001..0.001 rows=1 loops=4)
>                              Filter: (depth < 3)
>                              Rows Removed by Filter: 0
>                        ->  Aggregate  (cost=89.44..89.45 rows=1 width=32) (actual time=687.356..687.356 rows=1 loops=3)
>                              ->  Index Only Scan using edges_pkey on edges  (cost=0.56..83.96 rows=2191 width=4) (actual time=0.139..290.996 rows=3486910 loops=3)
>                                    Index Cond: (from_node = ANY (friends_of_friends_1.current))
>                                    Heap Fetches: 0
>  Planning Time: 0.557 ms
>  Execution Time: 2076.990 ms
> (17 rows)
> 

I've been thinking about this a bit on the way back from pgcon. Per the
CPU profile it seems most of the work is actually spent on qsort,
calculating the array_agg(distinct) bit.

We could replace this part by a hash-based set aggregate, which would be
faster, but I doubt it would yield a massive improvement that'd change
the fundamental cost.

I forgot I did something like that a couple of years back, implementing a
count_distinct() aggregate that was meant as a faster alternative to
count(distinct). I then mostly abandoned it because the built-in
sort-based approach improved significantly - it was still slower in some
cases, but the gap got small enough.

Anyway, I hacked together a trivial set backed by an open addressing
hash table:

  https://github.com/tvondra/hashset

It's super-basic, providing just a bare minimum of functions, but
hopefully good enough for experiments (a small usage sketch follows the
list below):

- hashset data type - hash table in varlena
- hashset_init - create new hashset
- hashset_add / hashset_contains - add value / check membership
- hashset_merge - merge two hashsets
- hashset_count - count elements
- hashset_to_array - return the elements as an array
- hashset(int) aggregate
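
To give an idea how these fit together, something like this should work
(just a sketch - the capacity argument to hashset_init and the int
element type are assumptions on my part, not something documented here):

    -- build a set, add a value, check membership
    SELECT hashset_contains(hashset_add(hashset_init(64), 123), 123);
    -- expected: true

    -- aggregate form: collapse duplicates, then count distinct elements
    SELECT hashset_count(hashset(x))
      FROM (VALUES (1), (2), (2), (3)) AS t(x);
    -- expected: 3

    -- merge two sets and turn the result back into an array
    SELECT hashset_to_array(
               hashset_merge(hashset_add(hashset_init(64), 1),
                             hashset_add(hashset_init(64), 2)));
    -- expected: {1,2}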

This allows rewriting the recursive query like this:

    WITH RECURSIVE friends_of_friends AS (
        SELECT
            ARRAY[5867::bigint] AS current,
            0 AS depth
        UNION ALL
        SELECT
            new_current,
            friends_of_friends.depth + 1
        FROM
            friends_of_friends
        CROSS JOIN LATERAL (
            SELECT
                hashset_to_array(hashset(edges.to_node)) AS new_current
            FROM
                edges
            WHERE
                from_node = ANY(friends_of_friends.current)
        ) q
        WHERE
            friends_of_friends.depth < 3
    )
    SELECT
        depth,
        coalesce(array_length(current, 1), 0)
    FROM
        friends_of_friends
    WHERE
        depth = 3;

On my laptop this cuts the timing roughly in half - which is nice, but as
I said, I don't think it's a fundamental speedup. The aggregate could
also be parallelized, but I don't think that changes much.

Furthermore, this has a number of drawbacks too - e.g. it can't spill
data to disk, which might be an issue on more complex queries / larger
data sets.

FWIW I wonder how representative this query is, considering it returns
~1M node IDs, i.e. about 10% of the whole set of node IDs. Seems a bit
on the high side.

I've also experimented with doing this from a plpgsql procedure (that's
what the non-aggregate functions are for). I saw this mostly as a way
to do things that would be hard to do in the recursive CTE, but it has a
lot of additional execution overhead due to plpgsql. Maybe if we had some
smart trick to calculate adjacent nodes we could have a SRF written in C
to get rid of the overhead.
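
To illustrate, a rough sketch of that procedural approach might look like
this - it tracks the already-visited nodes in a hashset, which is the
kind of thing that's awkward in the recursive CTE. (The int element type,
the hashset_init capacity argument, and the edges(from_node, to_node)
layout are all assumptions here, not a polished implementation.)

    DO $$
    DECLARE
        -- nodes we have already expanded (node ids assumed to fit in int)
        visited  hashset := hashset_add(hashset_init(1024), 5867);
        frontier int[]   := ARRAY[5867];
        nxt      int[];
        depth    int := 0;
    BEGIN
        WHILE depth < 3 LOOP
            -- expand one level, skipping nodes we have already seen
            SELECT hashset_to_array(hashset(e.to_node::int))
              INTO nxt
              FROM edges e
             WHERE e.from_node = ANY(frontier)
               AND NOT hashset_contains(visited, e.to_node::int);

            frontier := coalesce(nxt, '{}');

            -- remember the new nodes for the next iteration
            IF cardinality(frontier) > 0 THEN
                SELECT hashset_merge(visited, hashset(x))
                  INTO visited
                  FROM unnest(frontier) AS x;
            END IF;

            depth := depth + 1;
        END LOOP;

        RAISE NOTICE 'nodes at depth %: %',
            depth, coalesce(array_length(frontier, 1), 0);
    END $$;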

Anyway, this leads me to the question of what graph databases do for
these queries, assuming they are faster at answering them (and by how
much?). I'm not very familiar with that stuff, but surely they do
something smart - precalculating the data, some special indexing,
duplicating the data in multiple places?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


