Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs

From: Tom Lane
Subject: Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs
Date:
Msg-id: 2999167.1711476963@sss.pgh.pa.us
In reply to: Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs  (Nathan Bossart <nathandbossart@gmail.com>)
Responses: Re: Slow GRANT ROLE on PostgreSQL 16 with thousands of ROLEs  (Nathan Bossart <nathandbossart@gmail.com>)
List: pgsql-hackers
Nathan Bossart <nathandbossart@gmail.com> writes:
> I spent some time trying to get some ballpark figures but have thus far
> been unsuccessful.  Even if I was able to get good numbers, I'm not sure
> how much they'd help us, as we'll still need to decide how much overhead we
> are willing to take in comparison to the linear search.  I don't think
> ~1000 is an unreasonable starting point, as it seems generally more likely
> that you will have many more roles to process at that point than if the
> threshold was, say, 100.  And if the threshold is too high (e.g., 10,000),
> this optimization will only kick in for the most extreme cases, so we'd
> likely be leaving a lot on the table.  But, I will be the first to admit
> that my reasoning here is pretty unscientific, and I'm open to suggestions
> for how to make it less so.

I did a little experimentation using the attached quick-hack C
function, and came to the conclusion that setting up the bloom filter
costs more or less as much as inserting 1000 or so OIDs the dumb way.
So we definitely want a threshold that's not much less than that.
For example, with ROLES_LIST_BLOOM_THRESHOLD = 100 I saw:

regression=# select drive_bloom(100, 10, 100000);
 drive_bloom 
-------------
 
(1 row)

Time: 319.931 ms
regression=# select drive_bloom(101, 10, 100000);
 drive_bloom 
-------------
 
(1 row)

Time: 319.385 ms
regression=# select drive_bloom(102, 10, 100000);
 drive_bloom 
-------------
 
(1 row)

Time: 9904.786 ms (00:09.905)

That's a pretty big jump in context.  With the threshold set to 1024,

regression=# select drive_bloom(1024, 10, 100000);
 drive_bloom 
-------------
 
(1 row)

Time: 14597.510 ms (00:14.598)
regression=# select drive_bloom(1025, 10, 100000);
 drive_bloom 
-------------
 
(1 row)

Time: 14589.197 ms (00:14.589)
regression=# select drive_bloom(1026, 10, 100000);
 drive_bloom 
-------------
 
(1 row)

Time: 25947.000 ms (00:25.947)
regression=# select drive_bloom(1027, 10, 100000);
 drive_bloom 
-------------
 
(1 row)

Time: 25399.718 ms (00:25.400)
regression=# select drive_bloom(2048, 10, 100000);
 drive_bloom 
-------------
 
(1 row)

Time: 33809.536 ms (00:33.810)

So I'm now content with choosing a threshold of 1000 or 1024 or so.

As for the bloom filter size, I see that bloom_create does

    bitset_bytes = Min(bloom_work_mem * UINT64CONST(1024), total_elems * 2);
    bitset_bytes = Max(1024 * 1024, bitset_bytes);

which means that any total_elems input less than 512K is disregarded
altogether.  So I'm not sold on your "ROLES_LIST_BLOOM_THRESHOLD * 10"
value.  Maybe it doesn't matter though.

I do not like, even a little bit, your use of a static variable to
hold the bloom filter pointer.  That code will misbehave horribly
if we throw an error partway through the role-accumulation loop;
the next call will try to carry on using the old filter, which would
be wrong even if it still existed which it likely won't.  It's not
that much worse notationally to keep it as a local variable, as I
did in the attached.

            regards, tom lane

/*

create function drive_bloom(num_oids int, dup_freq int, count int) returns void
strict volatile language c as '/path/to/drive_bloom.so';

\timing on

select drive_bloom(100, 0, 100000);

 */

#include "postgres.h"

#include "fmgr.h"
#include "lib/bloomfilter.h"
#include "miscadmin.h"
#include "tcop/tcopprot.h"
#include "utils/builtins.h"
#include "utils/memutils.h"

PG_MODULE_MAGIC;

#define ROLES_LIST_BLOOM_THRESHOLD 1024
#define ROLES_LIST_BLOOM_SIZE (1024 * 1024)


static inline List *
roles_list_append(List *roles_list, Oid role, bloom_filter **roles_list_bf)
{
    unsigned char *roleptr = (unsigned char *) &role;

    /*
     * If there is a previously-created Bloom filter, use it to determine
     * whether the role is missing from the list.  Otherwise, do an ordinary
     * linear search through the existing role list.
     */
    if ((*roles_list_bf &&
         bloom_lacks_element(*roles_list_bf, roleptr, sizeof(Oid))) ||
        !list_member_oid(roles_list, role))
    {
        /*
         * If the list is large, we take on the overhead of creating and
         * populating a Bloom filter to speed up future calls to this
         * function.
         */
        if (!*roles_list_bf &&
            list_length(roles_list) > ROLES_LIST_BLOOM_THRESHOLD)
        {
            *roles_list_bf = bloom_create(ROLES_LIST_BLOOM_SIZE,
                                          work_mem, 0);
            foreach_oid(roleid, roles_list)
                bloom_add_element(*roles_list_bf,
                                  (unsigned char *) &roleid,
                                  sizeof(Oid));
        }

        /*
         * Finally, add the role to the list and the Bloom filter, if it
         * exists.
         */
        roles_list = lappend_oid(roles_list, role);
        if (*roles_list_bf)
            bloom_add_element(*roles_list_bf, roleptr, sizeof(Oid));
    }

    return roles_list;
}

/*
 * drive_bloom(num_oids int, dup_freq int, count int) returns void
 *
 * num_oids: number of OIDs to de-duplicate
 * dup_freq: if > 0, every dup_freq'th OID is duplicated
 * count: overall repetition count; choose large enough to get reliable timing
 */
PG_FUNCTION_INFO_V1(drive_bloom);
Datum
drive_bloom(PG_FUNCTION_ARGS)
{
    int32        num_oids = PG_GETARG_INT32(0);
    int32        dup_freq = PG_GETARG_INT32(1);
    int32        count = PG_GETARG_INT32(2);
    MemoryContext mycontext;

    mycontext = AllocSetContextCreate(CurrentMemoryContext,
                                      "drive_bloom work cxt",
                                      ALLOCSET_DEFAULT_SIZES);

    while (count-- > 0)
    {
        List *roles_list = NIL;
        Oid nextrole = 1;
        bloom_filter *roles_list_bf = NULL;

        MemoryContext oldcontext;

        oldcontext = MemoryContextSwitchTo(mycontext);

        for (int i = 0; i < num_oids; i++)
        {
            roles_list = roles_list_append(roles_list, nextrole,
                                           &roles_list_bf);
            if (dup_freq > 0 && i % dup_freq == 0)
                roles_list = roles_list_append(roles_list, nextrole,
                                               &roles_list_bf);
            nextrole++;
        }

        if (roles_list_bf)
            bloom_free(roles_list_bf);

        MemoryContextSwitchTo(oldcontext);

        MemoryContextReset(mycontext);

        CHECK_FOR_INTERRUPTS();
    }

    PG_RETURN_VOID();
}
