Discussion: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)


BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Tomas Vondra
Hi,

While experimenting with BRIN indexes after a couple of FOSDEM
discussions, I ran into the existing limitation that BRIN indexes don't
handle array scan keys. So BRIN indexes can be used for conditions like

    WHERE a IN (1,2,3,4,5)

but we essentially treat the values as individual scan keys, and for
each one we scan the BRIN index and build/update the bitmap. For large
indexes that may be fairly expensive - the cost is proportional to the
number of values, so if building the bitmap for 1 value takes 10ms, for
100 values it'll take ~1000ms.

It's not hard to construct cases like this (e.g. indexes with small
pages_per_range values). Of course, if the query does a lot of other
expensive stuff, this cost may be insignificant.

I'm not sure how often people run queries with such conditions. But I've
been experimenting with features that'd build such paths, so I took a
stab at a PoC, which can significantly reduce the time needed to build
the bitmaps. And there are a couple more interesting opportunities.


0001 - Support SK_SEARCHARRAY in BRIN minmax
--------------------------------------------
The 0001 part does a "naive" SK_SEARCHARRAY implementation for minmax.
It simply sets amsearcharray=true and then tweaks the consistent
function to handle both the scalar and array scan keys.

This is obviously rather inefficient, because the array is searched
linearly. So yes, we no longer walk the index repeatedly, but we have to
compare each range to (almost) all the values.


0002 - Sort the array in brinrescan() and do binsearch
------------------------------------------------------
There's a simple way to optimize the naive approach: sort the array and
then binary-search it. If the array is sorted, we can search for the
first value >= minvalue and check whether that value is consistent (i.e.
whether it's <= maxvalue).
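
To illustrate the idea, here is a standalone toy model in plain C (my
sketch, not patch code - it uses int64 values instead of Datums and
ignores NULLs, collations and the per-type comparison procs): checking
whether a sorted array of search values hits a [minval, maxval] range is
a single binary search.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /*
     * Does the sorted array vals[0..nvals-1] contain any value in
     * [minval, maxval]?  Find the first value >= minval, then check
     * whether it's <= maxval.
     */
    static bool
    range_matches(int64_t minval, int64_t maxval,
                  const int64_t *vals, int nvals)
    {
        int     lo = 0,
                hi = nvals;

        while (lo < hi)
        {
            int     mid = lo + (hi - lo) / 2;

            if (vals[mid] < minval)
                lo = mid + 1;
            else
                hi = mid;
        }

        return (lo < nvals && vals[lo] <= maxval);
    }

    int
    main(void)
    {
        int64_t vals[] = {3, 17, 42, 99};   /* must be sorted */

        printf("%d\n", range_matches(10, 20, vals, 4));  /* 1 - 17 matches */
        printf("%d\n", range_matches(43, 98, vals, 4));  /* 0 - falls in a gap */
        return 0;
    }

So each range costs O(log n) comparisons instead of O(n), which is where
the speedup comes from.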

In my experiments this cuts the time needed to build the bitmap for an
array to pretty much the same as for a single value.

I think this is similar to the preprocessing of scan keys in b-tree, so
brinrescan() is a natural way to do the sort. The problem however is
where to store the result.

Ideally, we'd store it in BrinOpaque (just like BTScanOpaque in btree),
but the problem is we don't pass that to the consistent functions. Those
only get the ScanKeys and stuff extracted from BrinOpaque.

We might add a parameter to the "consistent" function, but that
conflicts with not wanting to break existing extensions implementing
their own BRIN indexes. We allow the opclasses to define "consistent"
with either 4 or 5 arguments. Adding an argument would mean 5 or 6
arguments, but for backwards compatibility we'd still need to support
the existing opclasses, and 5 would be ambiguous :-/

In hindsight, I would probably not choose to support both 4 and 5
arguments again. It makes the code harder for us to maintain just to
make life easier for extensions, and I'm not aware of any out-of-core
BRIN opclasses anyway. So I'd probably just change the API - it's pretty
easy to update existing extensions.

This patch however does a much simpler thing - it just replaces the
array in the SK_SEARCHARRAY scan key with a sorted one. That works for
minmax, but not for bloom/inclusion, because those are not based on
sorting. And the ArrayType is not great for minmax either, because it
means we need to deconstruct it again and again, for each range. It'd be
much better to deconstruct the array once.
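
For illustration only (again not the patch, just the general shape): the
"deconstruct once" part could look roughly like this, assuming an int4
array with no NULL elements so a plain qsort comparator is enough - a
real opclass would use the type's comparison support procedure instead.

    #include "postgres.h"

    #include "access/skey.h"
    #include "utils/array.h"
    #include "utils/lsyscache.h"

    static int
    cmp_int32_datum(const void *a, const void *b)
    {
        int32   va = DatumGetInt32(*(const Datum *) a);
        int32   vb = DatumGetInt32(*(const Datum *) b);

        return (va > vb) - (va < vb);
    }

    /* deconstruct and sort the SK_SEARCHARRAY argument, once per rescan */
    static Datum *
    sort_array_key(ScanKey key, int *nelems)
    {
        ArrayType  *arr = DatumGetArrayTypeP(key->sk_argument);
        int16       typlen;
        bool        typbyval;
        char        typalign;
        Datum      *elems;
        bool       *nulls;

        get_typlenbyvalalign(ARR_ELEMTYPE(arr), &typlen, &typbyval, &typalign);
        deconstruct_array(arr, ARR_ELEMTYPE(arr), typlen, typbyval, typalign,
                          &elems, &nulls, nelems);

        qsort(elems, *nelems, sizeof(Datum), cmp_int32_datum);
        return elems;
    }

Where the resulting sorted array should live (the scan key, BrinOpaque,
something else) is exactly the open question above.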

I'll get back to this ...


0003 - Support SK_SEARCHARRAY in BRIN inclusion
-----------------------------------------------
Trivial modification to support array scan keys, can't benefit from
sorting the array.


0004 - Support SK_SEARCHARRAY in BRIN bloom
-------------------------------------------
Trivial modification to support array scan keys, can't benefit from
sorted array either.

But we might "preprocess" the keys in a different way - bloom needs to
calculate two hashes per key, and at the moment it happens again and
again for each range. So if you have 1M ranges and an SK_SEARCHARRAY query
with 100 values, we'll do 100M calls to PROCNUM_HASH and 200M calls to
hash_uint32_extended(). And our hash functions are pretty expensive,
certainly compared to the fast functions often used for bloom filters.

So the preprocessing might actually calculate the hash functions once,
and then only reuse those in the "consistent" function.
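
As a standalone toy model of that caching (plain C, my sketch - the hash
below is just a stand-in for the opclass PROCNUM_HASH and
hash_uint32_extended(), and the filter sizes are made up): the two
derived hashes per value are computed once up front, and each per-range
check is then only cheap arithmetic on the cached values.

    #include <stdbool.h>
    #include <stdint.h>

    #define FILTER_BITS 1024        /* toy per-range filter size */
    #define NPROBES     7           /* bits set/tested per value */

    typedef struct CachedHashes { uint32_t h1, h2; } CachedHashes;

    static uint32_t
    toy_hash(uint32_t v, uint32_t seed)     /* stand-in for the real hash */
    {
        v ^= seed;
        v *= 0x9e3779b1u;
        v ^= v >> 16;
        return v;
    }

    /* done once per scan key value, before scanning any ranges */
    static CachedHashes
    cache_hashes(uint32_t value)
    {
        CachedHashes c = {toy_hash(value, 17), toy_hash(value, 41)};

        return c;
    }

    /* done for each range: double-hashing probes, no hash calls at all */
    static bool
    filter_might_contain(const uint8_t *filter, CachedHashes c)
    {
        for (int i = 0; i < NPROBES; i++)
        {
            uint32_t bit = (c.h1 + (uint32_t) i * c.h2) % FILTER_BITS;

            if (!(filter[bit / 8] & (1 << (bit % 8))))
                return false;
        }
        return true;
    }

    static void
    filter_add(uint8_t *filter, CachedHashes c)
    {
        for (int i = 0; i < NPROBES; i++)
        {
            uint32_t bit = (c.h1 + (uint32_t) i * c.h2) % FILTER_BITS;

            filter[bit / 8] |= (1 << (bit % 8));
        }
    }

    int
    main(void)
    {
        uint8_t      filter[FILTER_BITS / 8] = {0};
        CachedHashes c = cache_hashes(82);

        filter_add(filter, c);
        return filter_might_contain(filter, c) ? 0 : 1;     /* 0 = "maybe" */
    }

The real patch of course goes through the opclass hash procedure and the
existing bloom filter code; the point is just that the per-value hashing
moves out of the per-range loop.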

0005 is a dirty PoC illustrating the benefit of caching the hashes.

Unfortunately, this complicates things, because it means:

* The scan key preprocessing is not universal for all BRIN opclasses -
  instead, each BRIN opclass may have an optional BRIN_PROCNUM_PREPROCESS
  procedure which preprocesses the keys the way the opclass needs.

* We can't simply replace the array in the scan key with the sorted array
  the way minmax does, because the data type is not the same (the hashes
  are uint64).


When I started to write this e-mail I thought there's pretty much just
one way to move this forward:

1) Add a BRIN_PROCNUM_PREPROCESS to BRIN, doing the preprocessing (if
   not defined, the keys are not preprocessed).

2) Store the preprocessed keys in BrinOpaque.

3) Modify the BRIN API to allow passing the preprocessed keys.

As mentioned earlier, I'm not sure how difficult it would be to maintain
backwards compatibility, considering the number of arguments of the
consistent function would be ambiguous.

Maybe the existence of BRIN_PROCNUM_PREPROCESS would be enough to decide
this - if it's not defined, no keys are preprocessed (and the opclass
would not support SK_SEARCHARRAY).


But now I realize maybe we can do without adding parameters to the
"consistent" function. We might stash the "preprocessed" scan keys into
BrinOpaque, and pass them to the consistent function instead of the
"original" scan keys (at least when the opclass defines
BRIN_PROCNUM_PREPROCESS). In fact, I see ScanKey even allows AM-specific
flags, so maybe it'd be useful to mark the preprocessed keys that way.
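
Very roughly, something like this in brinrescan() (a hand-wavy sketch,
not patch code - BRIN_PROCNUM_PREPROCESS, the bo_preprocKeys field and
the calling convention of the preprocess procedure are all made up here,
and "opaque"/"bdesc" stand for the scan's BrinOpaque state and its
BrinDesc):

    /* in brinrescan(), after the scan keys have been copied into the scan */
    for (int i = 0; i < scan->numberOfKeys; i++)
    {
        ScanKey         key = &scan->keyData[i];
        RegProcedure    procid;
        FmgrInfo       *preprocFn;

        /* hypothetical optional support procedure */
        procid = index_getprocid(scan->indexRelation, key->sk_attno,
                                 BRIN_PROCNUM_PREPROCESS);
        if (!RegProcedureIsValid(procid))
            continue;           /* opclass does no preprocessing */

        preprocFn = index_getprocinfo(scan->indexRelation, key->sk_attno,
                                      BRIN_PROCNUM_PREPROCESS);

        /*
         * Let the opclass build a "preprocessed" scan key (sorted array,
         * array of hashes, ...) and stash it in the opaque state, so the
         * consistent function gets it instead of the original key.
         */
        opaque->bo_preprocKeys[i] =
            (ScanKey) DatumGetPointer(FunctionCall2(preprocFn,
                                                    PointerGetDatum(bdesc),
                                                    PointerGetDatum(key)));
    }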


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments

Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Tomas Vondra
Hi,

Attached is a patch series adopting the idea of preprocessing the scan
keys in brinrescan(). It turns out to work pretty nicely, and it allows
different opclasses to do different things:

- minmax / minmax-multi: sort the array values (leave scalars alone)
- inclusion: no preprocessing
- bloom: precalculate hash values

The _consistent functions are modified to leverage the preprocessed
keys. I wonder if they should check the existence of the (optional)
procedure, and fall back to the non-optimized search if it's not defined.

That would allow opclasses (e.g. from extensions) to keep using the
built-in consistent function without tweaking the definition to also
have the preprocess function. But that seems like a rather minor issue,
especially because the number of external opclasses is tiny and updating
the definition to also reference the preprocess function is trivial. I
don't think it's worth the extra code complexity.

0001 and 0002 are minor code cleanup in the opclasses introduced in PG
13. There are a couple of places assigning boolean values to Datum
variables, and some misleading comments.

0003 is a minor refactoring making the Bloom filter size calculation
easier to reuse.

0004 introduces the optional "preprocess" opclass procedure, and calls
it for keys from brinrescan().

0005-0008 add the preprocess procedure to the various BRIN types, and
adjust the consistent procedures accordingly.


Attached is a python script I used to measure this. It builds a table
with 10M rows, with sequential but slightly randomized values (a value
may move within 1% of the table), and minmax/bloom indexes. The table is
~500MB, and the indexes use pages_per_range=1 (tiny ranges, but this
simulates a large table with regular page ranges).

And then the script queries the table with different numbers of random
values in the "IN (...)" clause, and measures the query duration (in ms).

The results look like this:

                           int            text
  index   values   master  patched    master  patched     int   text
  ------------------------------------------------------------------
  minmax       1        7        7        27       25    100%    92%
              10       66       15       277       70     23%    25%
              20      132       16       558       85     12%    15%
              50      331       21      1398      102      7%     7%
             100      663       29      2787      118      4%     4%
             500     3312       81     13964      198      2%     1%
  ------------------------------------------------------------------
  bloom        1       30       27        23       18     92%    77%
              10      302      208       231       35     69%    15%
              20      585      381       463       54     65%    12%
              50     1299      761      1159      111     59%    10%
             100     2194     1099      2312      204     50%     9%
             500     6850     1228     11559      918     18%     8%
  ------------------------------------------------------------------

With minmax, consider for example queries with 20 values, which used to
take ~130ms, but with the patch this drops to 16ms (~12%). And the
improvement is even more significant for larger numbers of values. For
text data the results are pretty comparable.

With bloom indexes, the improvements are proportional to how expensive
the hash function is (for the data type). For int the hash is fairly
cheap, so the improvement is rather moderate (but visible). For text,
the improvements are way more significant - for 10 values the duration
is reduced by a whopping 85%.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments

Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Tomas Vondra
cfbot identified a couple of issues in the patches:

1) not handling NULLs correctly (or rather at all). There was a FIXME,
so I took this as a sign it's time to finally address that.

2) minmax-multi did not fully adopt the preprocessed values in the
second part of the _consistent function

The patches also add a bunch of regression tests to improve coverage.


While adding those, I ran into an interesting behavior with BRIN bloom
indexes. If you have such an index on a bigint column, then this won't
use the index:

  SELECT * FROM t WHERE b = 82;

unless you cast the constant to bigint like this:

  SELECT * FROM t WHERE b = 82::bigint;

I vaguely remember dealing with this while working on the bloom indexes,
and concluding it was OK. But what's interesting is that with multiple
values in the IN clause it works, and this will use the index:

  SELECT * FROM t WHERE b IN (82, 83);

That's a bit surprising.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments

Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Heikki Linnakangas
I had a quick look at just the preliminary cleanup patches:

> 0001-BRIN-bloom-cleanup-20230218.patch

Looks good to me

> 0002-BRIN-minmax-multi-cleanup-20230218.patch

Looks good, although it would feel more natural to me to do it the other 
way round, and define 'matches' as 'bool matches', and use DatumGetBool.

Not new with this patch, but I find the 'matches' and 'matching' 
variables a bit strange. Wouldn't it be simpler to have just one variable?

> 0003-Introduce-bloom_filter_size-20230218.patch

Looks good

> 0004-Add-minmax-multi-inequality-tests-20230218.patch

Looks good

> +SELECT i/5 + mod(911 * i + 483, 25),
> +       i/10 + mod(751 * i + 221, 41)

Peculiar formulas. Was there a particular reason for these values?

- Heikki




Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Tomas Vondra
On 2/24/23 22:07, Heikki Linnakangas wrote:
> I had a quick look at just the preliminary cleanup patches:
> 
>> 0001-BRIN-bloom-cleanup-20230218.patch
> 
> Looks good to me
> 
>> 0002-BRIN-minmax-multi-cleanup-20230218.patch
> 
> Looks good, although it would feel more natural to me to do it the other
> way round, and define 'matches' as 'bool matches', and use DatumGetBool.
> 

Yeah, probably. I was trying to only do the minimal change because of
(maybe) backpatching this.

> Not new with this patch, but I find the 'matches' and 'matching'
> variables a bit strange. Wouldn't it be simpler to have just one variable?
> 

True. I don't recall why we did it this way.

>> 0003-Introduce-bloom_filter_size-20230218.patch
> 
> Looks good
> 
>> 0004-Add-minmax-multi-inequality-tests-20230218.patch
> 
> Looks good
> 
>> +SELECT i/5 + mod(911 * i + 483, 25),
>> +       i/10 + mod(751 * i + 221, 41)
> 
> Peculiar formulas. Was there a particular reason for these values?
> 

No, not really. I simply wanted random-looking data, but reproducible
and deterministic, and a linear congruential generator is a simple way
to do that. I just picked a couple of co-prime numbers, and that's it.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Tomas Vondra
Here's a rebased version of this patch series, no other changes.

On 2/25/23 12:45, Tomas Vondra wrote:
> On 2/24/23 22:07, Heikki Linnakangas wrote:
>> I had a quick look at just the preliminary cleanup patches:
>>
>>> 0001-BRIN-bloom-cleanup-20230218.patch
>>
>> Looks good to me
>>
>>> 0002-BRIN-minmax-multi-cleanup-20230218.patch
>>
>> Looks good, although it would feel more natural to me to do it the other
>> way round, and define 'matches' as 'bool matches', and use DatumGetBool.
>>
> 
> Yeah, probably. I was trying to only do the minimal change because of
> (maybe) backpatching this.
> 

I haven't changed this.

Heikki, do you think these cleanup parts should be backpatched? If yes,
do you still think they should be reworked to do it the other way round,
or left the way I did it, to minimize the change?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments

Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Tomas Vondra
Here's an updated version of the patch series.

I've polished and pushed the first three patches with cleanup, tests to
improve test coverage and so on. I chose not to backpatch those - I
planned to do that to make future backpatches simpler, but the changes
ended up less disruptive than expected.

The remaining patches are just about adding SK_SEARCHARRAY to BRIN.

0001 - adds the optional preprocess procedure, calls it from brinrescan

0002 to 0005 - adds the support to the existing BRIN opclasses

The main open question I have is what exactly it means that the
procedure is optional. In particular, should it be supported to have a
BRIN opclass without the "preprocess" procedure but using the other
built-in support procedures?

For example, imagine you have a custom BRIN opclass in an extension (for
a custom data type or something). It does not need to implement any
support procedures - it can simply reference the existing built-in ones.
Of course, such an opclass won't get the "preprocess" procedure
automatically.

Should we support such opclasses, or should we force the extension to be
updated by adding a preprocess procedure? I'd say "optional" means we
should support them (otherwise it wouldn't really be optional).

The reason why this matters is that "amsearcharray" is an AM-level flag,
but the support procedures are defined by the opclass. So the consistent
function needs to handle SK_SEARCHARRAY keys both with and without
preprocessing.

That's mostly what I did for all existing BRIN opclasses (it's a bit
confusing that "opclass" may refer either to the "generic" minmax code or
to the opclass defined for a particular data type). All the opclasses now
handle three cases:

1) scalar keys (just like before, with amsearcharray=false)

2) array keys with preprocessing (sorted array, array of hashes, ...)

3) array keys without preprocessing (for compatibility with old
   opclasses missing the optional preprocess procedure)

The current code is a bit ugly, because it duplicates a bunch of code -
option (3) mostly does (1) in a loop. I'm confident this can be reduced
by refactoring and reusing some of the "shared" code.

The question is if my interpretation of what "optional" procedure means
is reasonable. Thoughts?

The other thing is how to test this "compatibility" code. I assume we
want to have the procedure for all built-in opclasses, so those won't
exercise it. I did test it by temporarily removing the procedure from a
couple of pg_amproc.dat entries. I guess creating a custom opclass in the
regression tests is the right solution.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments

Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Heikki Linnakangas
On 02/07/2023 19:09, Tomas Vondra wrote:
> Here's an updated version of the patch series.
> 
> I've polished and pushed the first three patches with cleanup, tests to
> improve test coverage and so on. I chose not to backpatch those - I
> planned to do that to make future backpatches simpler, but the changes
> ended up less disruptive than expected.
> 
> The remaining patches are just about adding SK_SEARCHARRAY to BRIN.
> 
> 0001 - adds the optional preprocess procedure, calls it from brinrescan
> 
> 0002 to 0005 - adds the support to the existing BRIN opclasses

Could you implement this completely in the consistent-function, by 
caching the sorted array in fn_extra, without adding the new preprocess 
procedure? On first call, when fn_extra == NULL, sort the array and 
stash it in fn_extra.
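
Roughly the usual fn_extra caching pattern, i.e. something like this
(untested sketch, heavily abbreviated - SortedArrayCache,
build_sorted_cache() and check_ranges() are placeholders):

    Datum
    brin_minmax_consistent(PG_FUNCTION_ARGS)
    {
        BrinDesc   *bdesc = (BrinDesc *) PG_GETARG_POINTER(0);
        BrinValues *column = (BrinValues *) PG_GETARG_POINTER(1);
        ScanKey     key = (ScanKey) PG_GETARG_POINTER(2);
        SortedArrayCache *cache = fcinfo->flinfo->fn_extra;

        if (cache == NULL)
        {
            /* first call in this scan: sort the array once, remember it */
            MemoryContext oldcxt =
                MemoryContextSwitchTo(fcinfo->flinfo->fn_mcxt);

            cache = build_sorted_cache(bdesc, key);     /* placeholder */
            fcinfo->flinfo->fn_extra = cache;
            MemoryContextSwitchTo(oldcxt);
        }

        /* binary-search the cached array against the range's min/max */
        PG_RETURN_BOOL(check_ranges(cache, column));    /* placeholder */
    }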

I don't think that works, because fn_extra isn't reset when the scan 
keys change on rescan. We could reset it, and document that you can use 
fn_extra for per-scankey caching. There's some precedent for not
resetting it though, see commit d22a09dc70f. But we could provide an 
opaque per-scankey scratch space like that somewhere else. In BrinDesc, 
perhaps.

The new preprocess support function feels a bit too inflexible to me. 
True, you can store whatever you want in the ScanKey that it returns, 
but since that's the case, why not just make it void *? It seems that
the constraint here was that you need to pass a ScanKey to the 
consistent function, because the consistent function's signature is what 
it is. But we can change the signature, if we introduce a new support 
amproc number for it.

> The main open question I have is what exactly does it mean that the
> procedure is optional. In particular, should it be supported to have a
> BRIN opclass without the "preprocess" procedure but using the other
> built-in support procedures?
>
> For example, imagine you have a custom BRIN opclass in an extension (for
> a custom data type or something). This does not need to implement any
> procedures, it can just call the existing built-in ones. Of course, this
> won't get the "preprocess" procedure automatically.
> 
> Should we support such opclasses or should we force the extension to be
> updated by adding a preprocess procedure? I'd say "optional" means we
> should support (otherwise it'd not really optional).
> 
> The reason why this matters is that "amsearcharray" is AM-level flag,
> but the support procedure is defined by the opclass. So the consistent
> function needs to handle SK_SEARCHARRAY keys both with and without
> preprocessing.
> 
> That's mostly what I did for all existing BRIN opclasses (it's a bit
> confusing that opclass may refer to both the "generic" minmax or the
> opclass defined for a particular data type). All the opclasses now
> handle three cases:
> 
> 1) scalar keys (just like before, with amsearcharray=fase)
> 
> 2) array keys with preprocessing (sorted array, array of hashes, ...)
> 
> 3) array keys without preprocessing (for compatibility with old
>     opclasses missing the optional preprocess procedure)
> 
> The current code is a bit ugly, because it duplicates a bunch of code,
> because the option (3) mostly does (1) in a loop. I'm confident this can
> be reduced by refactoring and reusing some of the "shared" code.
> 
> The question is if my interpretation of what "optional" procedure means
> is reasonable. Thoughts?
> 
> The other thing is how to test this "compatibility" code. I assume we
> want to have the procedure for all built-in opclasses, so that won't
> exercise it. I did test it by temporarily removing the procedure from a
> couple pg_amproc.dat entries. I guess creating a custom opclass in the
> regression test is the right solution.

It would be unpleasant to force all BRIN opclasses to immediately
implement the searcharray logic. If we don't want to do that, we need to
implement generic SK_SEARCHARRAY handling in the BRIN AM itself. That
would be pretty easy, right? Just call the regular consistent function
for every element in the array.
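
Something along these lines in brin.c (hand-wavy sketch - assume the
array has already been deconstructed into elem_values[0..nelems-1], and
bdesc/bval/consistentFn are what bringetbitmap() already has at hand):

    /* generic SK_SEARCHARRAY fallback: probe the range once per element */
    bool        matches = false;

    for (int i = 0; i < nelems; i++)
    {
        ScanKeyData elemkey = *key;     /* scalar copy of the SAOP key */

        elemkey.sk_flags &= ~SK_SEARCHARRAY;
        elemkey.sk_argument = elem_values[i];

        if (DatumGetBool(FunctionCall3Coll(consistentFn,
                                           key->sk_collation,
                                           PointerGetDatum(bdesc),
                                           PointerGetDatum(bval),
                                           PointerGetDatum(&elemkey))))
        {
            matches = true;
            break;                      /* one match is enough */
        }
    }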

If an opclass wants to provide a faster/better implementation, it can 
provide a new consistent support procedure that supports that. Let's 
assign a new amproc number for new-style consistent function, which must 
support SK_SEARCHARRAY, and pass it some scratch space where it can 
cache whatever per-scankey data. Because it gets a new amproc number, we 
can define the arguments as we wish. We can pass a pointer to the 
per-scankey scratch space as a new argument, for example.

We did this backwards-compatibility dance with the 3/4-argument variants 
of the current consistent functions. But I think assigning a whole new 
procedure number is better than looking at the number of arguments.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Tomas Vondra
On 7/8/23 23:57, Heikki Linnakangas wrote:
> On 02/07/2023 19:09, Tomas Vondra wrote:
>> Here's an updated version of the patch series.
>>
>> I've polished and pushed the first three patches with cleanup, tests to
>> improve test coverage and so on. I chose not to backpatch those - I
>> planned to do that to make future backpatches simpler, but the changes
>> ended up less disruptive than expected.
>>
>> The remaining patches are just about adding SK_SEARCHARRAY to BRIN.
>>
>> 0001 - adds the optional preprocess procedure, calls it from brinrescan
>>
>> 0002 to 0005 - adds the support to the existing BRIN opclasses
> 
> Could you implement this completely in the consistent-function, by
> caching the sorted array in fn_extra, without adding the new preprocess
> procedure? On first call, when fn_extra == NULL, sort the array and
> stash it in fn_extra.
> 
> I don't think that works, because fn_extra isn't reset when the scan
> keys change on rescan. We could reset it, and document that you can use
> fn_extra for per-scankey caching. There's some precedence for not
> resetting it though, see commit d22a09dc70f. But we could provide an
> opaque per-scankey scratch space like that somewhere else. In BrinDesc,
> perhaps.
> 

Hmm, yeah. BrinDesc seems like a good place for such scratch space ...

And it seems to alleviate most of the compatibility issues, because it'd
make the preprocessing a responsibility of the consistent function,
instead of doing it in a separate optional procedure (and having to deal
with cases when it does not exist). It would not even need a separate
procnum.

> The new preprocess support function feels a bit too inflexible to me.
> True, you can store whatever you want in the ScanKey that it returns,
> but since that's the case, why not just make it void * ?It seems that
> the constraint here was that you need to pass a ScanKey to the
> consistent function, because the consistent function's signature is what
> it is. But we can change the signature, if we introduce a new support
> amproc number for it.
> 

Not sure I follow - what should be made (void *)? Oh, you mean not
passing the preprocessed array as a scan key at all, and instead passing
it as a new (void *) parameter to the (new) consistent function?

Yeah, I was trying to stick to the existing signature of the consistent
function, to minimize the necessary changes.

>> The main open question I have is what exactly does it mean that the
>> procedure is optional. In particular, should it be supported to have a
>> BRIN opclass without the "preprocess" procedure but using the other
>> built-in support procedures?
>>
>> For example, imagine you have a custom BRIN opclass in an extension (for
>> a custom data type or something). This does not need to implement any
>> procedures, it can just call the existing built-in ones. Of course, this
>> won't get the "preprocess" procedure automatically.
>>
>> Should we support such opclasses or should we force the extension to be
>> updated by adding a preprocess procedure? I'd say "optional" means we
>> should support (otherwise it'd not really optional).
>>
>> The reason why this matters is that "amsearcharray" is AM-level flag,
>> but the support procedure is defined by the opclass. So the consistent
>> function needs to handle SK_SEARCHARRAY keys both with and without
>> preprocessing.
>>
>> That's mostly what I did for all existing BRIN opclasses (it's a bit
>> confusing that opclass may refer to both the "generic" minmax or the
>> opclass defined for a particular data type). All the opclasses now
>> handle three cases:
>>
>> 1) scalar keys (just like before, with amsearcharray=fase)
>>
>> 2) array keys with preprocessing (sorted array, array of hashes, ...)
>>
>> 3) array keys without preprocessing (for compatibility with old
>>     opclasses missing the optional preprocess procedure)
>>
>> The current code is a bit ugly, because it duplicates a bunch of code,
>> because the option (3) mostly does (1) in a loop. I'm confident this can
>> be reduced by refactoring and reusing some of the "shared" code.
>>
>> The question is if my interpretation of what "optional" procedure means
>> is reasonable. Thoughts?
>>
>> The other thing is how to test this "compatibility" code. I assume we
>> want to have the procedure for all built-in opclasses, so that won't
>> exercise it. I did test it by temporarily removing the procedure from a
>> couple pg_amproc.dat entries. I guess creating a custom opclass in the
>> regression test is the right solution.
> 
> It would be unpleasant to force all BRIN opclasses to immediately
> implement the searcharray-logic. If we don't want to do that, we need to
> implement generic SK_SEARCHARRAY handling in BRIN AM itself. That would
> be pretty easy, right? Just call the regular consistent function for
> every element in the array.
> 

True, although the question is how many out-of-core opclasses there are.
My impression is the number is pretty close to 0, in which case we're
making ourselves jump through all kinds of hoops and making the code
more complex, with almost no benefit in the end.

> If an opclass wants to provide a faster/better implementation, it can
> provide a new consistent support procedure that supports that. Let's
> assign a new amproc number for new-style consistent function, which must
> support SK_SEARCHARRAY, and pass it some scratch space where it can
> cache whatever per-scankey data. Because it gets a new amproc number, we
> can define the arguments as we wish. We can pass a pointer to the
> per-scankey scratch space as a new argument, for example.
> 
> We did this backwards-compatibility dance with the 3/4-argument variants
> of the current consistent functions. But I think assigning a whole new
> procedure number is better than looking at the number of arguments.
> 

I actually somewhat hate the 3/4-argument dance, and I'm opposed to
doing that sort of thing again. First, I'm not quite convinced it's
worth the effort to jump through this hoop (I recall this being one of
the headaches when fixing the BRIN NULL handling). Second, it can only
be done once - imagine we now need to add a new optional parameter.
Presumably, we'd need to keep the existing 3/4 variants, and add new 4/5
variants. At which point 4 is ambiguous.

Yes, my previous message was mostly about backwards compatibility, and
this may seem a bit like an argument against it. But that message was
more of a question: "If we do this, is it actually backwards compatible
the way we want/need?"

Anyway, I think the BrinDesc scratch space is a neat idea, I'll try
doing it that way and report back in a couple days.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Heikki Linnakangas
On 09/07/2023 19:16, Tomas Vondra wrote:
> On 7/8/23 23:57, Heikki Linnakangas wrote:
>> The new preprocess support function feels a bit too inflexible to me.
>> True, you can store whatever you want in the ScanKey that it returns,
>> but since that's the case, why not just make it void * ? It seems that
>> the constraint here was that you need to pass a ScanKey to the
>> consistent function, because the consistent function's signature is what
>> it is. But we can change the signature, if we introduce a new support
>> amproc number for it.
> 
> Now sure I follow - what should be made (void *)? Oh, you mean not
> passing the preprocessed array as a scan key at all, and instead passing
> it as a new (void*) parameter to the (new) consistent function?

Right.

>> It would be unpleasant to force all BRIN opclasses to immediately
>> implement the searcharray-logic. If we don't want to do that, we need to
>> implement generic SK_SEARCHARRAY handling in BRIN AM itself. That would
>> be pretty easy, right? Just call the regular consistent function for
>> every element in the array.
> 
> True, although the question is how many out-of-core opclasses are there.
> My impression is the number is pretty close to 0, in which case we're
> making ourselves to jump through all kinds of hoops, making the code
> more complex, with almost no benefit in the end.

Perhaps. How many of the opclasses can do something smart with 
SEARCHARRAY? If the answer is "all" or "almost all", then it seems 
reasonable to just require them all to handle it. If the answer is 
"some", then it would still be nice to provide a naive default 
implementation in the AM itself. Otherwise all the opclasses need to 
include a bunch of boilerplate to support SEARCHARRAY.

>> If an opclass wants to provide a faster/better implementation, it can
>> provide a new consistent support procedure that supports that. Let's
>> assign a new amproc number for new-style consistent function, which must
>> support SK_SEARCHARRAY, and pass it some scratch space where it can
>> cache whatever per-scankey data. Because it gets a new amproc number, we
>> can define the arguments as we wish. We can pass a pointer to the
>> per-scankey scratch space as a new argument, for example.
>>
>> We did this backwards-compatibility dance with the 3/4-argument variants
>> of the current consistent functions. But I think assigning a whole new
>> procedure number is better than looking at the number of arguments.
> 
> I actually somewhat hate the 3/4-argument dance, and I'm opposed to
> doing that sort of thing again. First, I'm not quite convinced it's
> worth the effort to jump through this hoop (I recall this being one of
> the headaches when fixing the BRIN NULL handling). Second, it can only
> be done once - imagine we now need to add a new optional parameter.
> Presumably, we'd need to keep the existing 3/4 variants, and add new 4/5
> variants. At which point 4 is ambiguous.

My point is that we should assign a new amproc number to distinguish the 
new variant, instead of looking at the number of arguments. That way 
it's not ambiguous, and you can define whatever arguments you want for 
the new variant.

Yet another idea is to introduce a new amproc for a consistent function 
that *only* handles the SEARCHARRAY case, and keep the old consistent 
function as it is for the scalars. So every opclass would need to 
implement the current consistent function, just like today. But if an 
opclass wants to support SEARCHARRAY, it could optionally also provide
a "consistent_array" function.

> Yes, my previous message was mostly about backwards compatibility, and
> this may seem a bit like an argument against it. But that message was
> more a question "If we do this, is it actually backwards compatible the
> way we want/need?")
> 
> Anyway, I think the BrinDesc scratch space is a neat idea, I'll try
> doing it that way and report back in a couple days.

Cool. In 0005-Support-SK_SEARCHARRAY-in-BRIN-bloom-20230702.patch, you 
used the preprocess function to pre-calculate the scankey's hash, even 
for scalars. You could use the scratch space in BrinDesc for that, 
before doing anything with SEARCHARRAYs.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Tomas Vondra
On 7/9/23 20:05, Heikki Linnakangas wrote:
> On 09/07/2023 19:16, Tomas Vondra wrote:
>> On 7/8/23 23:57, Heikki Linnakangas wrote:
>>> The new preprocess support function feels a bit too inflexible to me.
>>> True, you can store whatever you want in the ScanKey that it returns,
>>> but since that's the case, why not just make it void * ? It seems that
>>> the constraint here was that you need to pass a ScanKey to the
>>> consistent function, because the consistent function's signature is what
>>> it is. But we can change the signature, if we introduce a new support
>>> amproc number for it.
>>
>> Now sure I follow - what should be made (void *)? Oh, you mean not
>> passing the preprocessed array as a scan key at all, and instead passing
>> it as a new (void*) parameter to the (new) consistent function?
> 
> Right.
> 
>>> It would be unpleasant to force all BRIN opclasses to immediately
>>> implement the searcharray-logic. If we don't want to do that, we need to
>>> implement generic SK_SEARCHARRAY handling in BRIN AM itself. That would
>>> be pretty easy, right? Just call the regular consistent function for
>>> every element in the array.
>>
>> True, although the question is how many out-of-core opclasses are there.
>> My impression is the number is pretty close to 0, in which case we're
>> making ourselves to jump through all kinds of hoops, making the code
>> more complex, with almost no benefit in the end.
> 
> Perhaps. How many of the opclasses can do something smart with
> SEARCHARRAY? If the answer is "all" or "almost all", then it seems
> reasonable to just require them all to handle it. If the answer is
> "some", then it would still be nice to provide a naive default
> implementation in the AM itself. Otherwise all the opclasses need to
> include a bunch of boilerplate to support SEARCHARRAY.
> 

For the built-in opclasses, I think all of them can do something smart.

For external ones, it's hard to say - my guess is they could do something
interesting too.


>>> If an opclass wants to provide a faster/better implementation, it can
>>> provide a new consistent support procedure that supports that. Let's
>>> assign a new amproc number for new-style consistent function, which must
>>> support SK_SEARCHARRAY, and pass it some scratch space where it can
>>> cache whatever per-scankey data. Because it gets a new amproc number, we
>>> can define the arguments as we wish. We can pass a pointer to the
>>> per-scankey scratch space as a new argument, for example.
>>>
>>> We did this backwards-compatibility dance with the 3/4-argument variants
>>> of the current consistent functions. But I think assigning a whole new
>>> procedure number is better than looking at the number of arguments.
>>
>> I actually somewhat hate the 3/4-argument dance, and I'm opposed to
>> doing that sort of thing again. First, I'm not quite convinced it's
>> worth the effort to jump through this hoop (I recall this being one of
>> the headaches when fixing the BRIN NULL handling). Second, it can only
>> be done once - imagine we now need to add a new optional parameter.
>> Presumably, we'd need to keep the existing 3/4 variants, and add new 4/5
>> variants. At which point 4 is ambiguous.
> 
> My point is that we should assign a new amproc number to distinguish the
> new variant, instead of looking at the number of arguments. That way
> it's not ambiguous, and you can define whatever arguments you want for
> the new variant.
> 

Right, I agree.

> Yet another idea is to introduce a new amproc for a consistent function
> that *only* handles the SEARCHARRAY case, and keep the old consistent
> function as it is for the scalars. So every opclass would need to
> implement the current consistent function, just like today. But if an
> opclass wants to support SEARCHARRAY, it could optionally also provide
> an "consistent_array" function.
> 

Hmm, we probably need to do something like this anyway. Because even
with the scratch space in BrinDesc, we still need to track whether the
opclass can process arrays or not. And if it can't, we need to emulate it
in brin.c.


>> Yes, my previous message was mostly about backwards compatibility, and
>> this may seem a bit like an argument against it. But that message was
>> more a question "If we do this, is it actually backwards compatible the
>> way we want/need?")
>>
>> Anyway, I think the BrinDesc scratch space is a neat idea, I'll try
>> doing it that way and report back in a couple days.
> 
> Cool. In 0005-Support-SK_SEARCHARRAY-in-BRIN-bloom-20230702.patch, you
> used the preprocess function to pre-calculate the scankey's hash, even
> for scalars. You could use the scratch space in BrinDesc for that,
> before doing anything with SEARCHARRAYs.
> 

Yeah, that's a good idea.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Tomas Vondra
On 7/9/23 23:44, Tomas Vondra wrote:
> ...
>>> Yes, my previous message was mostly about backwards compatibility, and
>>> this may seem a bit like an argument against it. But that message was
>>> more a question "If we do this, is it actually backwards compatible the
>>> way we want/need?")
>>>
>>> Anyway, I think the BrinDesc scratch space is a neat idea, I'll try
>>> doing it that way and report back in a couple days.
>>
>> Cool. In 0005-Support-SK_SEARCHARRAY-in-BRIN-bloom-20230702.patch, you
>> used the preprocess function to pre-calculate the scankey's hash, even
>> for scalars. You could use the scratch space in BrinDesc for that,
>> before doing anything with SEARCHARRAYs.
>>
> 
> Yeah, that's a good idea.
> 

I started looking at this (the scratch space in BrinDesc), and it's not
as straightforward as I hoped. The trouble is that BrinDesc is "per
attribute" but the scratch space needs to be "per scankey" (because we'd
like to sort the values from the scankey array).

With the "new" consistent functions (that get all scan keys at once)
this probably is not an issue, because we know which scan key we're
processing and so we can map it to the scratch space. But with the old
consistent function that's not the case. Maybe we should support this
only with the "new" consistent function variant?

This would however conflict with the idea to have a separate consistent
function for arrays, which "splits" the scankeys into multiple groups
again. There could be multiple SAOP scan keys, and then what?

I wonder if the scratch space should be in the ScanKey instead?


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: vignesh C
On Fri, 14 Jul 2023 at 20:17, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 7/9/23 23:44, Tomas Vondra wrote:
> > ...
> >>> Yes, my previous message was mostly about backwards compatibility, and
> >>> this may seem a bit like an argument against it. But that message was
> >>> more a question "If we do this, is it actually backwards compatible the
> >>> way we want/need?")
> >>>
> >>> Anyway, I think the BrinDesc scratch space is a neat idea, I'll try
> >>> doing it that way and report back in a couple days.
> >>
> >> Cool. In 0005-Support-SK_SEARCHARRAY-in-BRIN-bloom-20230702.patch, you
> >> used the preprocess function to pre-calculate the scankey's hash, even
> >> for scalars. You could use the scratch space in BrinDesc for that,
> >> before doing anything with SEARCHARRAYs.
> >>
> >
> > Yeah, that's a good idea.
> >
>
> I started looking at this (the scratch space in BrinDesc), and it's not
> as straightforward. The trouble is BrinDesc is "per attribute" but the
> scratch space is "per scankey" (because we'd like to sort values from
> the scankey array).
>
> With the "new" consistent functions (that get all scan keys at once)
> this probably is not an issue, because we know which scan key we're
> processing and so we can map it to the scratch space. But with the old
> consistent function that's not the case. Maybe we should support this
> only with the "new" consistent function variant?
>
> This would however conflict with the idea to have a separate consistent
> function for arrays, which "splits" the scankeys into multiple groups
> again. There could be multiple SAOP scan keys, and then what?
>
> I wonder if the scratch space should be in the ScanKey instead?

Are we planning to post an updated patch for this? If the interest has
gone down and there are no plans to handle this, I'm thinking of
returning this commitfest entry in the current commitfest; it can be
reopened when there is more interest.

Regards,
Vignesh



Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: Tomas Vondra
On 1/14/24 12:18, vignesh C wrote:
> On Fri, 14 Jul 2023 at 20:17, Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 7/9/23 23:44, Tomas Vondra wrote:
>>> ...
>>>>> Yes, my previous message was mostly about backwards compatibility, and
>>>>> this may seem a bit like an argument against it. But that message was
>>>>> more a question "If we do this, is it actually backwards compatible the
>>>>> way we want/need?")
>>>>>
>>>>> Anyway, I think the BrinDesc scratch space is a neat idea, I'll try
>>>>> doing it that way and report back in a couple days.
>>>>
>>>> Cool. In 0005-Support-SK_SEARCHARRAY-in-BRIN-bloom-20230702.patch, you
>>>> used the preprocess function to pre-calculate the scankey's hash, even
>>>> for scalars. You could use the scratch space in BrinDesc for that,
>>>> before doing anything with SEARCHARRAYs.
>>>>
>>>
>>> Yeah, that's a good idea.
>>>
>>
>> I started looking at this (the scratch space in BrinDesc), and it's not
>> as straightforward. The trouble is BrinDesc is "per attribute" but the
>> scratch space is "per scankey" (because we'd like to sort values from
>> the scankey array).
>>
>> With the "new" consistent functions (that get all scan keys at once)
>> this probably is not an issue, because we know which scan key we're
>> processing and so we can map it to the scratch space. But with the old
>> consistent function that's not the case. Maybe we should support this
>> only with the "new" consistent function variant?
>>
>> This would however conflict with the idea to have a separate consistent
>> function for arrays, which "splits" the scankeys into multiple groups
>> again. There could be multiple SAOP scan keys, and then what?
>>
>> I wonder if the scratch space should be in the ScanKey instead?
> 
> Are we planning to post an updated patch for this? If the interest has
> gone down and if there are no plans to handle this I'm thinking of
> returning this commitfest entry in this commitfest and can be opened
> when there is more interest.
> 

I still think the patch is a good idea and plan to get back to it, but
probably not in this CF. Given that the last update is from July, it's
fair to bump it - either RWF or just move it to the next CF. Up to you.

As for the patch, I wonder if Heikki has some idea what to do about the
scratch space? I got stuck on thinking about how to do this with the two
types of consistent functions we support/allow.

To articulate the issue more clearly - the scratch space is "per index",
but we need scratch space "per scan key". That's fine - we can simply
have pointers to multiple scratch spaces, I think.

But we have two ways to do consistent functions - the "old" one gets the
scan keys one at a time, the "new" one gets all of them at once. For the
"new" one it's not a problem - it's simple to identify the right scratch
space. But for the "old" one, how would that happen? The consistent
function has no idea which scan key it's operating on, or how to
identify the correct scratch space.

I can think of two ways to deal with this:

1) Only allow SK_SEARCHARRAY for indexes supporting new-style consistent
functions (but I'm not sure how, considering amsearcharray is set way
before we know what the opclass does, or whether it implements the old
or new consistent function).

2) Allow SK_SEARCHARRAY even with the old consistent function, but do
some dance in bringetbitmap() to set the scratch space accordingly before
the call.

Now that I read Heikki's messages again, I see he suggested assigning a
new procnum to a consistent function supporting SK_SEARCHARRAY, which
seems to be very close to (1). Except that we'd now have 3 ways to
define a consistent function, and that sounds a bit ... too much?

Anyway, thinking about (1), I'm still not sure what to do about existing
opclasses - it'd be nice to have some backwards compatible solution,
without breaking everything and forcing everyone to implement all the
new stuff. Which is kinda why we already have two ways to do consistent
functions. Presumably we'd have to implement some "default" handling by
translating the SK_SEARCHARRAY key into simple equality keys ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: BRIN indexes vs. SK_SEARCHARRAY (and preprocessing scan keys)

From: vignesh C
On Mon, 15 Jan 2024 at 04:45, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 1/14/24 12:18, vignesh C wrote:
> > On Fri, 14 Jul 2023 at 20:17, Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >>
> >> On 7/9/23 23:44, Tomas Vondra wrote:
> >>> ...
> >>>>> Yes, my previous message was mostly about backwards compatibility, and
> >>>>> this may seem a bit like an argument against it. But that message was
> >>>>> more a question "If we do this, is it actually backwards compatible the
> >>>>> way we want/need?")
> >>>>>
> >>>>> Anyway, I think the BrinDesc scratch space is a neat idea, I'll try
> >>>>> doing it that way and report back in a couple days.
> >>>>
> >>>> Cool. In 0005-Support-SK_SEARCHARRAY-in-BRIN-bloom-20230702.patch, you
> >>>> used the preprocess function to pre-calculate the scankey's hash, even
> >>>> for scalars. You could use the scratch space in BrinDesc for that,
> >>>> before doing anything with SEARCHARRAYs.
> >>>>
> >>>
> >>> Yeah, that's a good idea.
> >>>
> >>
> >> I started looking at this (the scratch space in BrinDesc), and it's not
> >> as straightforward. The trouble is BrinDesc is "per attribute" but the
> >> scratch space is "per scankey" (because we'd like to sort values from
> >> the scankey array).
> >>
> >> With the "new" consistent functions (that get all scan keys at once)
> >> this probably is not an issue, because we know which scan key we're
> >> processing and so we can map it to the scratch space. But with the old
> >> consistent function that's not the case. Maybe we should support this
> >> only with the "new" consistent function variant?
> >>
> >> This would however conflict with the idea to have a separate consistent
> >> function for arrays, which "splits" the scankeys into multiple groups
> >> again. There could be multiple SAOP scan keys, and then what?
> >>
> >> I wonder if the scratch space should be in the ScanKey instead?
> >
> > Are we planning to post an updated patch for this? If the interest has
> > gone down and if there are no plans to handle this I'm thinking of
> > returning this commitfest entry in this commitfest and can be opened
> > when there is more interest.
> >
>
> I still think the patch is a good idea and plan to get back to it, but
> probably not in this CF. Given that the last update if from July, it's
> fair to bump it - either RWF or just move to the next CF. Up to you.

I have changed the status to RWF, feel free to update the commitfest
after handling the comments.

Regards,
Vignesh