Обсуждение: [PATCH] Add tests for Bitmapset

Поиск
Список
Период
Сортировка

[PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
Hello,

I noticed that there are no tests for Bitmapset in src/test/modules as
is the case for other similar things like radixtree, rbtree, etc. so I
created one.  I realize that Bitmapset is already "tested" by all the
other code that uses it, but I was able to find one minor oversight[1]
in that code with these new tests.

I hope I've covered all the bases, but if you have thoughts on other
ways to test Bitmapset I'll happily add them to the patch.

best.

-greg

[1] https://www.postgresql.org/message-id/2000A717-1FFE-4031-827B-9330FB2E9065%40getmailspring.com

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Nathan Bossart
Дата:
On Fri, Aug 15, 2025 at 11:39:23AM -0400, Greg Burd wrote:
> I noticed that there are no tests for Bitmapset in src/test/modules as
> is the case for other similar things like radixtree, rbtree, etc. so I
> created one.  I realize that Bitmapset is already "tested" by all the
> other code that uses it, but I was able to find one minor oversight[1]
> in that code with these new tests.
> 
> I hope I've covered all the bases, but if you have thoughts on other
> ways to test Bitmapset I'll happily add them to the patch.

Adding some tests here seems like a good idea.  I might look into some ways
to trim it down a bit, but that'd just be minor editorialization.  One
other thing to consider is adding randomness to the tests (see
test_radixtree and test_binaryheap for examples).

-- 
nathan



Re: [PATCH] Add tests for Bitmapset

От
"Burd, Greg"
Дата:
> On Sep 4, 2025, at 10:00 PM, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Fri, Aug 15, 2025 at 11:39:23AM -0400, Greg Burd wrote:
>> I noticed that there are no tests for Bitmapset in src/test/modules as
>> is the case for other similar things like radixtree, rbtree, etc. so I
>> created one.  I realize that Bitmapset is already "tested" by all the
>> other code that uses it, but I was able to find one minor oversight[1]
>> in that code with these new tests.
>>
>> I hope I've covered all the bases, but if you have thoughts on other
>> ways to test Bitmapset I'll happily add them to the patch.
>
> Adding some tests here seems like a good idea.  I might look into some ways
> to trim it down a bit, but that'd just be minor editorialization.  One
> other thing to consider is adding randomness to the tests (see
> test_radixtree and test_binaryheap for examples).

Nathan,

Thank you for your interest in this patch, I appreciate that your time is
limited and highly valuable to the community.  This patch isn't "earth
shattering", but I think it's valuable to have test coverage even in cases
where the code being tested is already very well exercised.

I looked at both radix tree and binary heap and how they use random sets when
testing.  Binary heap uses it to create different random sets of numbers to
use across multiple tests while radix tree has a single function that focuses
on randomized data.  I decided not to add randomization into the tests of
Bitmapset simply because I like avoiding non-deterministic behavior.  But in
tests I guess that can be helpful finding future unknown corner cases.  I'm
on the fence as to the value, your call.  :)

Let me know if you'd like that or not.

>
> --
> Nathan

best, and thanks again for the attention,

-greg




Re: [PATCH] Add tests for Bitmapset

От
Nathan Bossart
Дата:
On Fri, Sep 05, 2025 at 10:48:21AM -0400, Burd, Greg wrote:
> I looked at both radix tree and binary heap and how they use random sets when
> testing.  Binary heap uses it to create different random sets of numbers to
> use across multiple tests while radix tree has a single function that focuses
> on randomized data.  I decided not to add randomization into the tests of
> Bitmapset simply because I like avoiding non-deterministic behavior.  But in
> tests I guess that can be helpful finding future unknown corner cases.  I'm
> on the fence as to the value, your call.  :)

I'm not too concerned about it.  We've lived without a dedicated test suite
for Bitmapset for a very long time, so any amount of test coverage is an
improvement.  Like you said, adding some randomization might be helpful for
finding weird bugs we wouldn't have thought to test.  And, given the many,
many machines that run the tests, IMHO it'd only help build even more
confidence in the code.  If my suggestion inspires you to update the patch,
great, but I'm fine with proceeding with what you already wrote, too.

-- 
nathan



Re: [PATCH] Add tests for Bitmapset

От
"Burd, Greg"
Дата:
> On Sep 5, 2025, at 2:43 PM, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Fri, Sep 05, 2025 at 10:48:21AM -0400, Burd, Greg wrote:
>> I looked at both radix tree and binary heap and how they use random sets when
>> testing.  Binary heap uses it to create different random sets of numbers to
>> use across multiple tests while radix tree has a single function that focuses
>> on randomized data.  I decided not to add randomization into the tests of
>> Bitmapset simply because I like avoiding non-deterministic behavior.  But in
>> tests I guess that can be helpful finding future unknown corner cases.  I'm
>> on the fence as to the value, your call.  :)
>
> I'm not too concerned about it.  We've lived without a dedicated test suite
> for Bitmapset for a very long time, so any amount of test coverage is an
> improvement.  Like you said, adding some randomization might be helpful for
> finding weird bugs we wouldn't have thought to test.  And, given the many,
> many machines that run the tests, IMHO it'd only help build even more
> confidence in the code.  If my suggestion inspires you to update the patch,
> great, but I'm fine with proceeding with what you already wrote, too.

Nathan, thanks for considering the patch.  Honestly, I'm fine with it as is.
We can revisit later if needed.  This does what I'd intended, test and document
in code the API and implementation making future changes to that more
transparent.

Thanks!

-greg

--
> nathan




Re: [PATCH] Add tests for Bitmapset

От
Masahiko Sawada
Дата:
On Mon, Sep 8, 2025 at 11:21 AM Burd, Greg <greg@burd.me> wrote:
>
>
> > On Sep 5, 2025, at 2:43 PM, Nathan Bossart <nathandbossart@gmail.com> wrote:
> >
> > On Fri, Sep 05, 2025 at 10:48:21AM -0400, Burd, Greg wrote:
> >> I looked at both radix tree and binary heap and how they use random sets when
> >> testing.  Binary heap uses it to create different random sets of numbers to
> >> use across multiple tests while radix tree has a single function that focuses
> >> on randomized data.  I decided not to add randomization into the tests of
> >> Bitmapset simply because I like avoiding non-deterministic behavior.  But in
> >> tests I guess that can be helpful finding future unknown corner cases.  I'm
> >> on the fence as to the value, your call.  :)
> >
> > I'm not too concerned about it.  We've lived without a dedicated test suite
> > for Bitmapset for a very long time, so any amount of test coverage is an
> > improvement.  Like you said, adding some randomization might be helpful for
> > finding weird bugs we wouldn't have thought to test.  And, given the many,
> > many machines that run the tests, IMHO it'd only help build even more
> > confidence in the code.  If my suggestion inspires you to update the patch,
> > great, but I'm fine with proceeding with what you already wrote, too.
>
> Nathan, thanks for considering the patch.  Honestly, I'm fine with it as is.
> We can revisit later if needed.  This does what I'd intended, test and document
> in code the API and implementation making future changes to that more
> transparent.

I appreciate your work on this. While I agree that adding more tests
to  bitmapset.c is a good idea, I'm concerned about the minimal
improvement in test coverage despite the addition of new test cases
(only three lines of code are newly covered). Apart from adding some
randomness to the tests we've discussed, given that we're implementing
a dedicated test module for bitmapset.c, I would expect to see a more
increase in test coverage.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: [PATCH] Add tests for Bitmapset

От
"Greg Burd"
Дата:
On Thu, Sep 11, 2025, at 2:48 AM, Masahiko Sawada wrote:
> On Mon, Sep 8, 2025 at 11:21 AM Burd, Greg <greg@burd.me> wrote:
>>
>>
>> > On Sep 5, 2025, at 2:43 PM, Nathan Bossart <nathandbossart@gmail.com> wrote:
>> >
>> > On Fri, Sep 05, 2025 at 10:48:21AM -0400, Burd, Greg wrote:
>> >> I looked at both radix tree and binary heap and how they use random sets when
>> >> testing.  Binary heap uses it to create different random sets of numbers to
>> >> use across multiple tests while radix tree has a single function that focuses
>> >> on randomized data.  I decided not to add randomization into the tests of
>> >> Bitmapset simply because I like avoiding non-deterministic behavior.  But in
>> >> tests I guess that can be helpful finding future unknown corner cases.  I'm
>> >> on the fence as to the value, your call.  :)
>> >
>> > I'm not too concerned about it.  We've lived without a dedicated test suite
>> > for Bitmapset for a very long time, so any amount of test coverage is an
>> > improvement.  Like you said, adding some randomization might be helpful for
>> > finding weird bugs we wouldn't have thought to test.  And, given the many,
>> > many machines that run the tests, IMHO it'd only help build even more
>> > confidence in the code.  If my suggestion inspires you to update the patch,
>> > great, but I'm fine with proceeding with what you already wrote, too.
>>
>> Nathan, thanks for considering the patch.  Honestly, I'm fine with it as is.
>> We can revisit later if needed.  This does what I'd intended, test and document
>> in code the API and implementation making future changes to that more
>> transparent.
>
> I appreciate your work on this. While I agree that adding more tests
> to  bitmapset.c is a good idea, I'm concerned about the minimal
> improvement in test coverage despite the addition of new test cases
> (only three lines of code are newly covered). Apart from adding some
> randomness to the tests we've discussed, given that we're implementing
> a dedicated test module for bitmapset.c, I would expect to see a more
> increase in test coverage.

Good point and thanks for taking the time to reply.  The only thing it identified
was a small issue fixed in b463288 so I'd not expect coverage to increase much.

I'll take a stab at adding some randomness to the tests and check the delta in
coverage.  Thanks for the nudge.  :)

>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com

Just for reference I started this not to increase coverage, which is a good
goal just not the one I had.  I was reviewing the API and considering some
changes based on other work I've done.  Now that I see how deeply baked in
this code is I think that's unlikely.  Maybe something else distinct for
bitmaps over 64-bit space at some point will be useful.  I wrote this code
just to capture the API in test form.

best,

-greg



Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Thu, Sep 11, 2025 at 06:56:07AM -0400, Greg Burd wrote:
> Just for reference I started this not to increase coverage, which is a good
> goal just not the one I had.  I was reviewing the API and considering some
> changes based on other work I've done.  Now that I see how deeply baked in
> this code is I think that's unlikely.  Maybe something else distinct for
> bitmaps over 64-bit space at some point will be useful.  I wrote this code
> just to capture the API in test form.

How much does this measure in terms of numbers produced by
coverage-html (see [1])?  The paths taken don't always matter as it
can also be important to check combinations of code paths that are
taken by other tests when checking after edge cases, but that would
give an idea of gain vs extra runtime.  Not objecting to your patch,
just being curious as I am not seeing any numbers posted on this
thread.

[1]: https://www.postgresql.org/docs/devel/regress-coverage.html
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
"Greg Burd"
Дата:

On Sep 11, 2025, at 9:36 PM, Michael Paquier <michael@paquier.xyz> wrote:

On Thu, Sep 11, 2025 at 06:56:07AM -0400, Greg Burd wrote:
Just for reference I started this not to increase coverage, which is a good
goal just not the one I had.  I was reviewing the API and considering some
changes based on other work I've done.  Now that I see how deeply baked in
this code is I think that's unlikely.  Maybe something else distinct for
bitmaps over 64-bit space at some point will be useful.  I wrote this code
just to capture the API in test form.

How much does this measure in terms of numbers produced by
coverage-html (see [1])?  The paths taken don't always matter as it
can also be important to check combinations of code paths that are
taken by other tests when checking after edge cases, but that would
give an idea of gain vs extra runtime.  Not objecting to your patch,
just being curious as I am not seeing any numbers posted on this
thread.

[1]: https://www.postgresql.org/docs/devel/regress-coverage.html
--
Michael


Sawada-san, Michael,

Thank you both for the push to measure.  Before the patch as it stands now the
coverage for src/backend/nodes/bitmapset.c is 63.5% and after it is 66.5%.  Not
an amazing difference, but something.  I guess I expected this to be higher given
the degree to which this datatype is used.

I'll review the gaps in coverage and update the tests.  I'll look for a way to add
meaningful randomization.

-greg

Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 13 2025, at 10:23 am, Greg Burd <greg@burd.me> wrote:
>
> Sawada-san, Michael,
>
> Thank you both for the push to measure.  Before the patch as it stands
> now the
> coverage for src/backend/nodes/bitmapset.c is 63.5% and after it is
> 66.5%.  Not
> an amazing difference, but something.  I guess I expected this to be
> higher given
> the degree to which this datatype is used.
>
> I'll review the gaps in coverage and update the tests.  I'll look for
> a way to add meaningful randomization.
>
> -greg

Hello hackers,

Here's the progress I've made in coverage for testing Bitmapset.

coverage:      HEAD        v1        v2
  lines......: 87.1        87.7      91.3
  functions..: 100         100       100
  branches...: 63.5        66.5      77.9

I've also added a function that performs a variety of operations using
random data.

For reference radixtree has:

coverage:      HEAD
  lines......: 98.3
  functions..: 97.2
  branches...: 89.4

best,

-greg

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Robert Haas
Дата:
On Mon, Sep 15, 2025 at 2:00 PM Greg Burd <greg@burd.me> wrote:
> For reference radixtree has:
>
> coverage:      HEAD
>   lines......: 98.3
>   functions..: 97.2
>   branches...: 89.4

+ /* Test negative member in bms_make_singleton */
+ error_caught = false;
+ PG_TRY();
+ {
+ bms_make_singleton(-1);
+ }
+ PG_CATCH();
+ {
+ error_caught = true;
+ FlushErrorState();
+ }
+ PG_END_TRY();
+ EXPECT_TRUE(error_caught);

This is an anti-pattern for PostgreSQL code. You can't just flush an
error without aborting a transaction or subtransaction to recover.
Even if it could be shown that this were harmless here, I think it's a
terrible idea to have code like this in the tree, as it encourages
people to do exactly the wrong thing.

But backing up a step, this also doesn't really seem like the right
way to test the error conditions. It deliberately throws away the
error message. All this verifies is that you caught an error. If you
let the error escape to the client you could have the expected output
test that you got the expected message.

I think it would be a better idea to structure this as a set of
SQL-callable functions and move a bunch of the logic into SQL.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:

On Sep 15 2025, at 2:54 pm, Robert Haas <robertmhaas@gmail.com> wrote:

> On Mon, Sep 15, 2025 at 2:00 PM Greg Burd <greg@burd.me> wrote:
>> For reference radixtree has:
>>
>> coverage:      HEAD
>>   lines......: 98.3
>>   functions..: 97.2
>>   branches...: 89.4
>
> + /* Test negative member in bms_make_singleton */
> + error_caught = false;
> + PG_TRY();
> + {
> + bms_make_singleton(-1);
> + }
> + PG_CATCH();
> + {
> + error_caught = true;
> + FlushErrorState();
> + }
> + PG_END_TRY();
> + EXPECT_TRUE(error_caught);
>
> This is an anti-pattern for PostgreSQL code. You can't just flush an
> error without aborting a transaction or subtransaction to recover.
> Even if it could be shown that this were harmless here, I think it's a
> terrible idea to have code like this in the tree, as it encourages
> people to do exactly the wrong thing.

Fair enough, I'll rework it.

> But backing up a step, this also doesn't really seem like the right
> way to test the error conditions. It deliberately throws away the
> error message. All this verifies is that you caught an error. If you
> let the error escape to the client you could have the expected output
> test that you got the expected message.
>
> I think it would be a better idea to structure this as a set of
> SQL-callable functions and move a bunch of the logic into SQL.

I'll give that approach a try, thanks for the suggestion.

-greg

> --
> Robert Haas
> EDB: http://www.enterprisedb.com



Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 15 2025, at 3:03 pm, Greg Burd <greg@burd.me> wrote:


> On Sep 15 2025, at 2:54 pm, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> On Mon, Sep 15, 2025 at 2:00 PM Greg Burd <greg@burd.me> wrote:
>>> For reference radixtree has:
>>>
>>> coverage:      HEAD
>>>   lines......: 98.3
>>>   functions..: 97.2
>>>   branches...: 89.4
>>
>> + /* Test negative member in bms_make_singleton */
>> + error_caught = false;
>> + PG_TRY();
>> + {
>> + bms_make_singleton(-1);
>> + }
>> + PG_CATCH();
>> + {
>> + error_caught = true;
>> + FlushErrorState();
>> + }
>> + PG_END_TRY();
>> + EXPECT_TRUE(error_caught);
>>
>> This is an anti-pattern for PostgreSQL code. You can't just flush an
>> error without aborting a transaction or subtransaction to recover.
>> Even if it could be shown that this were harmless here, I think it's a
>> terrible idea to have code like this in the tree, as it encourages
>> people to do exactly the wrong thing.
>
> Fair enough, I'll rework it.
>
>> But backing up a step, this also doesn't really seem like the right
>> way to test the error conditions. It deliberately throws away the
>> error message. All this verifies is that you caught an error. If you
>> let the error escape to the client you could have the expected output
>> test that you got the expected message.
>>
>> I think it would be a better idea to structure this as a set of
>> SQL-callable functions and move a bunch of the logic into SQL.
>
> I'll give that approach a try, thanks for the suggestion.
>
> -greg

Robert,

Reworked as indicated, thanks for pointing out the anti-pattern I'd
missed.

best.

-greg

>> --
>> Robert Haas
>> EDB: http://www.enterprisedb.com

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Mon, Sep 15, 2025 at 03:56:30PM -0400, Greg Burd wrote:
> Reworked as indicated, thanks for pointing out the anti-pattern I'd
> missed.

Hmm.  Should we try to get something closer in shape to what we do for
the SLRU tests, with one SQL function mapping to each C function we
are testing?  This counts for bms_is_member(), bms_num_members(), add,
del, mul, hash and make_singleton.

+test_bms_del_member_negative(PG_FUNCTION_ARGS)
+{
+    /* This should throw an error for negative member */
+    bms_del_member(NULL, -20);
+    PG_RETURN_VOID();

For example, this case could use a Bitmapset and a number (or an array
of numbers for repeated operations), with NULL being one option for
the input Bitmapset.  This makes the tests a bit more representative
of what they do at SQL level, hence there would be no need to
double-check a given line where a failure happens.  The Bitmapset
could then be passed around as bytea blobs using \gset, for example.
This should not be a problem as long as they are palloc'd in the
TopMemoryContext.  The SQL output for true/false/NULL/non-NULL
generated as output of the test could then be used instead of the
EXPECT_*() macros reported then in the elogs.

Perhaps my view of things is too cute and artistic, but when these
test suites grow over time and differ across branches, having to dig
into a specific line of a specific branch to get what a problem is
about is more error-prone.  Particularly annoying when calling one
single function that does a lot of various actions, like the proposed
test_bitmapset().

Side note: `git diff --check` is reporting a bunch of indents with
spaces around the EXPECT macros.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Robert Haas
Дата:
On Tue, Sep 16, 2025 at 2:04 AM Michael Paquier <michael@paquier.xyz> wrote:
> one SQL function mapping to each C function we are testing?

Yes, I think we should do this, if possible.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 16 2025, at 8:02 am, Robert Haas <robertmhaas@gmail.com> wrote:

> On Tue, Sep 16, 2025 at 2:04 AM Michael Paquier <michael@paquier.xyz> wrote:
>> one SQL function mapping to each C function we are testing?
>
> Yes, I think we should do this, if possible.

Michael, Robert,

Thanks for your time reviewing the proposed code and for providing
feedback. This patch started life as simply copy of the test_radixtree
module which uses a single test function triggered in SQL, hence my
approach.  I guess I should have copied the test_slru module instead! :)

I see the value in the idea of splitting it up into separate functions
and I'll give that a try.

Michael, apologies for the white space issues.  I'll clean that up, and
thank you for the `git diff --check` tip, I'll add that to my toolbox/routine.

> --
> Robert Haas
> EDB: http://www.enterprisedb.com

best.

-greg




Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
I've re-written the set of tests as suggested (I hope!).  I've not
re-run the coverage report (yet) to ensure we're at least as well
covered as v3, but attached is v4 for your inspection (amusement?). 
I've tried to author the SQL tests such that failures clearly indicate
what's at gone wrong.

This exercise turned into a lot more LOC than I'd expected, I hope it
provides some value to the community for testing this important data
structure and helps to codify the API clearly.

best.

-greg

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Masahiko Sawada
Дата:
On Tue, Sep 16, 2025 at 12:04 PM Greg Burd <greg@burd.me> wrote:
>
> I've re-written the set of tests as suggested (I hope!).  I've not
> re-run the coverage report (yet) to ensure we're at least as well
> covered as v3, but attached is v4 for your inspection (amusement?).
> I've tried to author the SQL tests such that failures clearly indicate
> what's at gone wrong.
>
> This exercise turned into a lot more LOC than I'd expected, I hope it
> provides some value to the community for testing this important data
> structure and helps to codify the API clearly.

Thank you for updating the patch. It seems cfbot caught a regression
test error[1] in a 32-bit build.

Regards,

[1] https://cirrus-ci.com/task/5290864655728640

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Tue, Sep 16, 2025 at 03:00:40PM -0700, Masahiko Sawada wrote:
> Thank you for updating the patch. It seems cfbot caught a regression
> test error[1] in a 32-bit build.

- bitmap_hash [1,3,5] |   49870778
+ bitmap_hash [1,3,5] | 1509752520

This one is able the hash value computation being not portable across
architectures.  I would just change these to be non-NULL, I guess.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Tue, Sep 16, 2025 at 03:04:39PM -0400, Greg Burd wrote:
> I've re-written the set of tests as suggested (I hope!).  I've not
> re-run the coverage report (yet) to ensure we're at least as well
> covered as v3, but attached is v4 for your inspection (amusement?).
> I've tried to author the SQL tests such that failures clearly indicate
> what's at gone wrong.

Thanks for the update, that's easier to follow, even if it's more
code overall.  It looks like all the 29 APIs are present, plus the
extras for the array mapping routines required for the tests with the
list APIs.

+static void
+elog_bitmapset(int elevel, const char *label, const Bitmapset *bms)

This routine, that's able to translate a bytea into a readable form,
is also something that I would have added as an extra function, then
used it in some of the tests where I would care about the bitmap in
output generated after a given operation, removing the DEBUG parts
that I guess have also a risk of rotting over time.  That would make
the tests for simple operations that generate a bms as a result in
easier to read, because we would call less SQL calls.  For example,
take the "add_member idempotent" case, that calls twice
test_bms_make_singleton() to compare the output byteas, we could just
do a test_bms_add_member(test_bms_make_singleton(10), 10) and show the
output?

We could perhaps use the CTE style for the basic calls, to reduce the
test_bms_make_singleton() calls.  What you are doing feels OK as well,
just a thought in passing.

FWIW, I tend to write the regression test queries so as these are
limited to 72-80 characters per line, to fit even in a small
terminals.  That's just an old-school take, even if it means more
lines in the SQL input file..

Perhaps it would be worth documenting what's the value of
test_random_operations?  It is a mean of testing add, del and
is_member with a modulo 1000 of the member number.  With the other
functions that offer deterministic tests, I am not sure what's the
extra value gained in it.

+        bms1_data = PG_GETARG_BYTEA_PP(0);
+        if (VARSIZE_ANY_EXHDR(bms1_data) > 0)
+        {
+            bms1 = (Bitmapset *) palloc(VARSIZE_ANY_EXHDR(bms1_data));
+            memcpy(bms1, VARDATA_ANY(bms1_data), VARSIZE_ANY_EXHDR(bms1_data));
+        }

This pattern is repeated 9 times, if my fingers got it right.  Perhaps
hide it in a macro, or invent an extra static inline routine that does
the translation?

+DO $$
+BEGIN
+    BEGIN
+    PERFORM test_bms_singleton_member(test_bms_from_array(ARRAY[1,2]));
+    RAISE NOTICE 'ERROR: singleton_member([1,2]) should have failed!';

Do we need all these DO blocks to catch failures?  These cases bump on
elog(ERROR) in the internals of bitmapset.c, which are not something
the end-users would expect, but in a test module that's fine to
trigger.  The function call would just fail, which is fine.

The queries with CTEs and UNION ALL are elegant, nice.

> This exercise turned into a lot more LOC than I'd expected, I hope it
> provides some value to the community for testing this important data
> structure and helps to codify the API clearly.

It does, at least that's my opinion.  So thanks for caring about this
level of testing.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 16 2025, at 8:35 pm, Michael Paquier <michael@paquier.xyz> wrote:

> On Tue, Sep 16, 2025 at 03:00:40PM -0700, Masahiko Sawada wrote:
>> Thank you for updating the patch. It seems cfbot caught a regression
>> test error[1] in a 32-bit build.
> 
> - bitmap_hash [1,3,5] |   49870778
> + bitmap_hash [1,3,5] | 1509752520
> 
> This one is able the hash value computation being not portable across
> architectures.  I would just change these to be non-NULL, I guess.
> --
> Michael

Thanks Michael, Sawada-san, yep hashing is arch dependent (go figure!),
but thankfully it is deterministic. Rather that test for non-NULL I've
come up with this approach:

-- Architecture-aware hash testing
WITH arch_info AS (
    SELECT
        CASE
            WHEN pg_column_size(1::bigint) = 8 THEN '64bit'
            ELSE '32bit'
        END as architecture
),
expected_values AS (
    SELECT
        architecture,
        CASE architecture
            WHEN '64bit' THEN 0
            WHEN '32bit' THEN 0
        END as hash_null,
        CASE architecture
            WHEN '64bit' THEN 49870778
            WHEN '32bit' THEN 1509752520
        END as hash_135,
        CASE architecture
            WHEN '64bit' THEN -303921606
            WHEN '32bit' THEN 0  -- TBD
        END as hash_246
    FROM arch_info
)
SELECT 'expected hash NULL' as test,
       test_bitmap_hash(NULL) = hash_null as result
FROM expected_values
UNION ALL
SELECT 'expected hash [1,3,5]' as test,
       test_bitmap_hash(test_bms_from_array(ARRAY[1,3,5])) = hash_135 as result
FROM expected_values
UNION ALL
SELECT 'expected hash [2,4,6]' as test,
       test_bitmap_hash(test_bms_from_array(ARRAY[2,4,6])) = hash_246 as result
FROM expected_values;

However I'm not sure what the value is for testing hash functions except
for coverage and I'm fairly certain that function is well exercised. 
That said, I think that will work.  I'll let cfbot tell me the 32bit
hash value on the next run.

thanks again for the helpful insights,

-greg




Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 16 2025, at 11:25 pm, Michael Paquier <michael@paquier.xyz> wrote:

> On Tue, Sep 16, 2025 at 03:04:39PM -0400, Greg Burd wrote:
>> I've re-written the set of tests as suggested (I hope!).  I've not
>> re-run the coverage report (yet) to ensure we're at least as well
>> covered as v3, but attached is v4 for your inspection (amusement?). 
>> I've tried to author the SQL tests such that failures clearly indicate
>> what's at gone wrong.
> 
> Thanks for the update, that's easier to follow, even if it's more
> code overall.  It looks like all the 29 APIs are present, plus the
> extras for the array mapping routines required for the tests with the
> list APIs.

Thank you for the continued detailed feedback, I think we're zeroing in
on a solid patch.

> +static void
> +elog_bitmapset(int elevel, const char *label, const Bitmapset *bms)
> 
> This routine, that's able to translate a bytea into a readable form,
> is also something that I would have added as an extra function, then
> used it in some of the tests where I would care about the bitmap in
> output generated after a given operation, removing the DEBUG parts
> that I guess have also a risk of rotting over time.  That would make
> the tests for simple operations that generate a bms as a result in
> easier to read, because we would call less SQL calls.  For example,
> take the "add_member idempotent" case, that calls twice
> test_bms_make_singleton() to compare the output byteas, we could just
> do a test_bms_add_member(test_bms_make_singleton(10), 10) and show the
> output?

I added a function bms_values() that prints out the values contained in
a bitmapset and changed a few tests to use it.  I removed the
elog_bitmapset() as it isn't being used and now that it's easy to output
the set in SQL why keep it for debugging?

> We could perhaps use the CTE style for the basic calls, to reduce the
> test_bms_make_singleton() calls.  What you are doing feels OK as well,
> just a thought in passing.

I changed a few tests I saw with the pattern you mentioned. If you identify
additional tests you feel should be re-written please let me know.

> FWIW, I tend to write the regression test queries so as these are
> limited to 72-80 characters per line, to fit even in a small
> terminals.  That's just an old-school take, even if it means more
> lines in the SQL input file.

I have a vt-100 sitting behind me, I get it.  I tend to try to line-wrap
at 78.  I tried a few times to re-format the SQL accordingly and I found
that it became harder to understand and less readable IMO. I'd prefer to
keep it as is, unless there is a hard requirement in which case I'll
reformat it.

> Perhaps it would be worth documenting what's the value of
> test_random_operations?  It is a mean of testing add, del and
> is_member with a modulo 1000 of the member number.  With the other
> functions that offer deterministic tests, I am not sure what's the
> extra value gained in it.

That was my thinking and why I didn't have a test that used random data
to start with (v1).  I'd rather just remove it, but I'm open to ideas. 
I could replace it with the earlier version of the function.

> +        bms1_data = PG_GETARG_BYTEA_PP(0);
> +        if (VARSIZE_ANY_EXHDR(bms1_data) > 0)
> +        {
> +            bms1 = (Bitmapset *) palloc(VARSIZE_ANY_EXHDR(bms1_data));
> +            memcpy(bms1, VARDATA_ANY(bms1_data), VARSIZE_ANY_EXHDR(bms1_data));
> +        }
> 
> This pattern is repeated 9 times, if my fingers got it right.  Perhaps
> hide it in a macro, or invent an extra static inline routine that does
> the translation?

That was an oversight on my part, all of those have been reduced to use
the same pattern with encode/decode bms() functions.

> +DO $$
> +BEGIN
> +    BEGIN
> +    PERFORM test_bms_singleton_member(test_bms_from_array(ARRAY[1,2]));
> +    RAISE NOTICE 'ERROR: singleton_member([1,2]) should have failed!';
> 
> Do we need all these DO blocks to catch failures?  These cases bump on
> elog(ERROR) in the internals of bitmapset.c, which are not something
> the end-users would expect, but in a test module that's fine to
> trigger.  The function call would just fail, which is fine.

Sure, that's easy enough.

> The queries with CTEs and UNION ALL are elegant, nice.

Thanks.

>> This exercise turned into a lot more LOC than I'd expected, I hope it
>> provides some value to the community for testing this important data
>> structure and helps to codify the API clearly.
> 
> It does, at least that's my opinion.  So thanks for caring about this
> level of testing.

Again, thank YOU(!) for caring enough to review the work.

> --
> Michael

Don't forget, we expect the cfbot/CI to fail on 32bit systems and tell
us the hash value we need to record in the next version of the patch. 
Or, if we don't think testing the hash functions is worth it (and I'm on
the fence on this one) then I can just remove it.

best.

-greg

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Robert Haas
Дата:
On Wed, Sep 17, 2025 at 9:18 AM Greg Burd <greg@burd.me> wrote:
> > +static void
> > +elog_bitmapset(int elevel, const char *label, const Bitmapset *bms)
> >
>
> I added a function bms_values() that prints out the values contained in
> a bitmapset and changed a few tests to use it.  I removed the
> elog_bitmapset() as it isn't being used and now that it's easy to output
> the set in SQL why keep it for debugging?

How about using outBitmapset() which already exists?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 17 2025, at 9:55 am, Robert Haas <robertmhaas@gmail.com> wrote:

> On Wed, Sep 17, 2025 at 9:18 AM Greg Burd <greg@burd.me> wrote:
>> > +static void
>> > +elog_bitmapset(int elevel, const char *label, const Bitmapset *bms)
>> >
>>
>> I added a function bms_values() that prints out the values contained in
>> a bitmapset and changed a few tests to use it.  I removed the
>> elog_bitmapset() as it isn't being used and now that it's easy to output
>> the set in SQL why keep it for debugging?
>
> How about using outBitmapset() which already exists?

Robert,

Thanks for chiming in, not sure how I missed that one. :)

> --
> Robert Haas
> EDB: http://www.enterprisedb.com

I've updated the hash value for 32bit systems and changed the
bms_values() function to use outBitmapset() via bmsToString().

best.

-greg
Вложения

Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Wed, Sep 17, 2025 at 10:53:26AM -0400, Greg Burd wrote:
> I've updated the hash value for 32bit systems and changed the
> bms_values() function to use outBitmapset() via bmsToString().

Oh, right, I also forgot that we have node functions for that.

And actually, could it be better to use nodeRead() to translate the
inputs?  I've just remembered that we have some custom code in there
with other lists, which is documented for the purpose of extensions
with extensible nodes.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
Hello hackers, :)

Thanks for all the feedback and time spent reviewing this patch.  I
switched out the encode/decode functions to use the nodeToString() and
stringToNode() functions and change all the SQL testing function
signatures to TEXT from BYTEA.  This exercises more code and that's good
for testing purposes.  I took out the code that checks the hash values,
I can't think of a reason that should cause a future failure should that
change and output different values.  I re-integrated the earlier random
testing function along with the newer code.  I'm not sure this buys us
much if anything.

coverage:      HEAD      v7
lines......:   87.1    90.5   +3.4%
functions..:  100.0   100.0   +0.0%
branches...:   63.5    75.9  +12.4%

best.

-greg

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Thu, Sep 18, 2025 at 02:50:45PM -0400, Greg Burd wrote:
> Thanks for all the feedback and time spent reviewing this patch.  I
> switched out the encode/decode functions to use the nodeToString() and
> stringToNode() functions and change all the SQL testing function
> signatures to TEXT from BYTEA.  This exercises more code and that's good
> for testing purposes.  I took out the code that checks the hash values,
> I can't think of a reason that should cause a future failure should that
> change and output different values.  I re-integrated the earlier random
> testing function along with the newer code.  I'm not sure this buys us
> much if anything.
>
> coverage:      HEAD      v7
> lines......:   87.1    90.5   +3.4%
> functions..:  100.0   100.0   +0.0%
> branches...:   63.5    75.9  +12.4%

Nathan has given me his blessing, so as I could look at this patch and
put my hands on it.  I have spotted one bug, and things that could be
done a bit better.  Please see below.

As far as I am concerned there is a mistake with bms_overlap_list().
This API checks a Bitmapset with a list of integers, but you have
coded things so as it checks for a list of Bitmapset.  I think that
test_bms_overlap_list() should take in input a int4 array, then the
code in charge of the conversion from the array to the list needs to
be tweaked a little bit.  With the current code, we could get random
failures with negative members included in a Bitmapset, depending on
what's on the stack.

I am not really convinced about the value brought by bms_values() now
that the output is generated for all the SQL functions using the
"decode" and "encode" routines (which may be actually less confusing
if renamed as "bms_to_text" and "text_to_bms", but that's a personal
take), because this results in doing an extra round-trip between text
and Bitmapset.  The tests don't rely much on NULL as input to
translate that as "(b)".  bmsToString() is just a direct call to what
the decode and encode routines do.

Same remark about test_bms_from_array(), that mimicks nodeRead(), in
charge of doing a int4[] -> Bitmapset conversion.  Wouldn't it be
simpler to just use "(b 1 2 3)" in input and let the decode/encode
routines do the job?

Let's replace the test descriptions in the queries that we know have
an individual result by comments.  For example all the "Range
operations".  These descriptions are useful in the UNION queries,
where we want to rely on a set of Bitmapsets for multiple operations,
to be able to know the description of the result.  For individual
queries, this bloats the output.

The numbering in the tests are not that useful to have.  If the test
suite is updated, that would mean more updates required if we add a
new block in the middle.

+SELECT 'overlapping ranges' as test,
+       bms_values(test_bms_union(
+           test_bms_add_range(NULL, 50, 150),
+           test_bms_add_range(NULL, 100, 200)

The output generated by this query leads to a bitmapset with 200
members.  It would be simpler and more readable to reduce these
ranges, or is there a reason for these to be large?  Same remark for
"difference subtraction edge case" with its 50 members.

+SELECT 'add_range large range' as test, bms_values(test_bms_add_range(NULL, 0, 1000)) as result;

This one also is funky with its output of 1000 elements.  Any reason
this is justified in terms of coverage?  If yes, we could use some
substr() calls to get the end and the beginning of the output
generated, perhaps to reduce the output of the whole, or the length,
or something like that relevant enough to validate the output.  There
is a second test a bit down that uses a range of [1000,1100], as well.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 19 2025, at 2:36 am, Michael Paquier <michael@paquier.xyz> wrote:

> On Thu, Sep 18, 2025 at 02:50:45PM -0400, Greg Burd wrote:
>> Thanks for all the feedback and time spent reviewing this patch.  I
>> switched out the encode/decode functions to use the nodeToString() and
>> stringToNode() functions and change all the SQL testing function
>> signatures to TEXT from BYTEA.  This exercises more code and that's good
>> for testing purposes.  I took out the code that checks the hash values,
>> I can't think of a reason that should cause a future failure should that
>> change and output different values.  I re-integrated the earlier random
>> testing function along with the newer code.  I'm not sure this buys us
>> much if anything.
>> 
>> coverage:      HEAD      v7
>> lines......:   87.1    90.5   +3.4%
>> functions..:  100.0   100.0   +0.0%
>> branches...:   63.5    75.9  +12.4%
> 
> Nathan has given me his blessing, so as I could look at this patch and
> put my hands on it.  I have spotted one bug, and things that could be
> done a bit better.  Please see below.

Hey Michael,

Thank you for taking the initiative and reviewing the code.

> As far as I am concerned there is a mistake with bms_overlap_list().
> This API checks a Bitmapset with a list of integers, but you have
> coded things so as it checks for a list of Bitmapset.  I think that
> test_bms_overlap_list() should take in input a int4 array, then the
> code in charge of the conversion from the array to the list needs to
> be tweaked a little bit.  With the current code, we could get random
> failures with negative members included in a Bitmapset, depending on
> what's on the stack.

Apologies, you're right.  I'm not sure how I went astray on that, but
it's fixed now.

> I am not really convinced about the value brought by bms_values() now
> that the output is generated for all the SQL functions using the
> "decode" and "encode" routines (which may be actually less confusing
> if renamed as "bms_to_text" and "text_to_bms", but that's a personal
> take), because this results in doing an extra round-trip between text
> and Bitmapset.

Agreed, renamed and removed.  New macros are:

/* Encode/Decode to/from TEXT and Bitmapset */
#define BITMAPSET_TO_TEXT(bms) (text *) CStringGetTextDatum(nodeToString((bms)))
#define TEXT_TO_BITMAPSET(str) (Bitmapset *)
stringToNode(TextDatumGetCString((Datum) (str)))

> The tests don't rely much on NULL as input to
> translate that as "(b)".  bmsToString() is just a direct call to what
> the decode and encode routines do.

Yes, agreed.

> Same remark about test_bms_from_array(), that mimicks nodeRead(), in
> charge of doing a int4[] -> Bitmapset conversion.  Wouldn't it be
> simpler to just use "(b 1 2 3)" in input and let the decode/encode
> routines do the job?

That makes a lot of sense now that things are all string to/from the
Bitmapset node representation.  Done.

> Let's replace the test descriptions in the queries that we know have
> an individual result by comments.  For example all the "Range
> operations".  These descriptions are useful in the UNION queries,
> where we want to rely on a set of Bitmapsets for multiple operations,
> to be able to know the description of the result.  For individual
> queries, this bloats the output.

Okay, done.

> The numbering in the tests are not that useful to have.  If the test
> suite is updated, that would mean more updates required if we add a
> new block in the middle.

Agreed, removed.

> +SELECT 'overlapping ranges' as test,
> +       bms_values(test_bms_union(
> +           test_bms_add_range(NULL, 50, 150),
> +           test_bms_add_range(NULL, 100, 200)
> 
> The output generated by this query leads to a bitmapset with 200
> members.  It would be simpler and more readable to reduce these
> ranges, or is there a reason for these to be large?  Same remark for
> "difference subtraction edge case" with its 50 members.

Some of these were targeted at corner cases to increase coverage.  I'll
review and consider ways to either a) eliminate them, or b) shorten the output.

> +SELECT 'add_range large range' as test,
> bms_values(test_bms_add_range(NULL, 0, 1000)) as result;
> 
> This one also is funky with its output of 1000 elements.  Any reason
> this is justified in terms of coverage?  If yes, we could use some
> substr() calls to get the end and the beginning of the output
> generated, perhaps to reduce the output of the whole, or the length,
> or something like that relevant enough to validate the output.  There
> is a second test a bit down that uses a range of [1000,1100], as well.

I used length() and recorded the expected output in the test.  That
seems to a) reduce noise in the output and b) remain deterministic(-ish).

> --
> Michael

I think I've incorporated your feedback which did indeed make the patch
more readable/crisp and maintainable, thank you.

coverage:    HEAD    v7    v8
lines......:  87.1  90.5  92.2
functions..: 100.0 100.0 100.0
branches...:  63.5  75.9  76.2

best.

-greg

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Fri, Sep 19, 2025 at 02:48:43PM -0400, Greg Burd wrote:
> I think I've incorporated your feedback which did indeed make the patch
> more readable/crisp and maintainable, thank you.
>
> coverage:    HEAD    v7    v8
> lines......:  87.1  90.5  92.2

bitmap_match() had some duplicated tests.

bms_replace_members() was not covered, and a function was added to the
module.  I get up to 93.4% for the lines, once added.

+PG_FUNCTION_INFO_V1(test_bms_from_array);
+PG_FUNCTION_INFO_V1(test_bms_to_array);
These two were still declared, are not required.

There were a couple of functions that acted as dead code in the
module, covered in the tree.  As we are trying to cover the gap I
think that we should just do the whole thing.  Here are the functions:
- is_empty
- subset_compare
- get_singleton_member
- nonempty_difference
- member_index
- join.  The test C function had a mistake, because either input may
be modified or updated by bms_join().  The code was freeing the right
input, leading to errors in the node output function when tested.

Mixing the CTEs and the UNION ALL would have been nice if the queries
in the UNIONs relied on more than one input value, but none of them
did that, so I have removed this part and made the tests leaner.  That
feels easier for the eye.

Testing on equality for the hash functions should be OK, so I have
kept these, including the NULL/0 cases.

A few incorrect things related to the style, fixed on the way, like an
incorrect copyright notice in meson.build, the top of test_bitmapset.c
was largely incorrect.  It seems like these were copy-pasted from
elsewhere.

Another thing was testing with -DREALLOCATE_BITMAPSETS, which I've
done.  A couple of code paths related to the test C functions don't
test for NULL input, which is still OK as these don't impact the
internal tests or the backend coverage, so I've left these out.

The result I had was good enough, so applied.  The CI was OK, the
buildfarm may have a different opinion.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 22 2025, at 3:57 am, Michael Paquier <michael@paquier.xyz> wrote:

> On Fri, Sep 19, 2025 at 02:48:43PM -0400, Greg Burd wrote:
>> I think I've incorporated your feedback which did indeed make the patch
>> more readable/crisp and maintainable, thank you.
>> 
>> coverage:    HEAD    v7    v8
>> lines......:  87.1  90.5  92.2

Michael,

Thank you for your time and help.

> bitmap_match() had some duplicated tests.
> 
> bms_replace_members() was not covered, and a function was added to the
> module.  I get up to 93.4% for the lines, once added.
> 
> +PG_FUNCTION_INFO_V1(test_bms_from_array);
> +PG_FUNCTION_INFO_V1(test_bms_to_array);
> These two were still declared, are not required.
> 
> There were a couple of functions that acted as dead code in the
> module, covered in the tree.  As we are trying to cover the gap I
> think that we should just do the whole thing.  Here are the functions: 
> - is_empty
> - subset_compare
> - get_singleton_member
> - nonempty_difference
> - member_index
> - join.  The test C function had a mistake, because either input may
> be modified or updated by bms_join().  The code was freeing the right
> input, leading to errors in the node output function when tested.

All great ideas, apologies for the oversights.

> Mixing the CTEs and the UNION ALL would have been nice if the queries
> in the UNIONs relied on more than one input value, but none of them
> did that, so I have removed this part and made the tests leaner.  That
> feels easier for the eye.

Agreed, thank you.

> Testing on equality for the hash functions should be OK, so I have
> kept these, including the NULL/0 cases.

Sounds good to me.

> A few incorrect things related to the style, fixed on the way, like an
> incorrect copyright notice in meson.build, the top of test_bitmapset.c
> was largely incorrect.  It seems like these were copy-pasted from
> elsewhere.

Yes, copied from the radixtree test module.

> Another thing was testing with -DREALLOCATE_BITMAPSETS, which I've
> done.

That seems like a good idea, I thought about doing that as well.

> A couple of code paths related to the test C functions don't
> test for NULL input, which is still OK as these don't impact the
> internal tests or the backend coverage, so I've left these out.

Fair.

> The result I had was good enough, so applied.  The CI was OK, the
> buildfarm may have a different opinion.
> --
> Michael

Thanks for the help getting this done, I really appreciate it.  So far
the buildfarm seems okay. :)

best.

-greg



Re: [PATCH] Add tests for Bitmapset

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> The result I had was good enough, so applied.  The CI was OK, the
> buildfarm may have a different opinion.

This patch seems to be rather full of arbitrary casts to or
from Datum, which is no longer okay.  You need to be using
the appropriate conversion macros, such as PointerGetDatum.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2025-09-23%2002%3A12%3A36

            regards, tom lane



Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Tue, Sep 23, 2025 at 01:43:47AM -0400, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> The result I had was good enough, so applied.  The CI was OK, the
>> buildfarm may have a different opinion.
>
> This patch seems to be rather full of arbitrary casts to or
> from Datum, which is no longer okay.  You need to be using
> the appropriate conversion macros, such as PointerGetDatum.
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2025-09-23%2002%3A12%3A36

Right, this can ve reproduced with a -m32 added to gcc.

I don't see a need for a Datum manipulation in these conversion
macros, as we already allocate the results to and from "text"
before/after using the GETARG or RETURN macros.  Using directly
text_to_cstring() and cstring_to_text() takes care of the warnings, as
well.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> On Tue, Sep 23, 2025 at 01:43:47AM -0400, Tom Lane wrote:
>> This patch seems to be rather full of arbitrary casts to or
>> from Datum, which is no longer okay.  You need to be using
>> the appropriate conversion macros, such as PointerGetDatum.

> Right, this can ve reproduced with a -m32 added to gcc.

Curiously, I don't see these warnings on 32-bit FreeBSD
(with clang version 19.1.7).  But I tested your patch on
mamba's host and confirmed that it makes that gcc happy.

> I don't see a need for a Datum manipulation in these conversion
> macros, as we already allocate the results to and from "text"
> before/after using the GETARG or RETURN macros.  Using directly
> text_to_cstring() and cstring_to_text() takes care of the warnings, as
> well.

Yeah, this is better, but I think the new macros could use a bit
more parenthesis-twiddling:

+#define BITMAPSET_TO_TEXT(bms) cstring_to_text(nodeToString((bms)))

The extra parens around "bms" are surely unnecessary.  (If they were
necessary, it could only be because nodeToString was a macro that
failed to parenthesize its argument where needed.  But that would
be a bug to fix in nodeToString, not here.)

+#define TEXT_TO_BITMAPSET(str) (Bitmapset *) stringToNode(text_to_cstring(str))

Here, on the other hand, I think an extra outer set of parens
would be a good idea, ie

+#define TEXT_TO_BITMAPSET(str) ((Bitmapset *) stringToNode(text_to_cstring(str)))

It may be that the C cast binds tightly enough that this cannot
be a problem, but I don't think assuming that is project style.

            regards, tom lane



Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 23 2025, at 12:46 pm, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Michael Paquier <michael@paquier.xyz> writes:
>> On Tue, Sep 23, 2025 at 01:43:47AM -0400, Tom Lane wrote:
>>> This patch seems to be rather full of arbitrary casts to or
>>> from Datum, which is no longer okay.  You need to be using
>>> the appropriate conversion macros, such as PointerGetDatum.

Tom, thanks for pointing out the oversight.  Apologies, I should have
tried that out in my local tests.

>> Right, this can ve reproduced with a -m32 added to gcc.
> 
> Curiously, I don't see these warnings on 32-bit FreeBSD
> (with clang version 19.1.7).  But I tested your patch on
> mamba's host and confirmed that it makes that gcc happy.
> 
>> I don't see a need for a Datum manipulation in these conversion
>> macros, as we already allocate the results to and from "text"
>> before/after using the GETARG or RETURN macros.  Using directly
>> text_to_cstring() and cstring_to_text() takes care of the warnings, as
>> well.
> 
> Yeah, this is better, but I think the new macros could use a bit
> more parenthesis-twiddling:
> 
> +#define BITMAPSET_TO_TEXT(bms) cstring_to_text(nodeToString((bms)))

Agreed, that was my mistake.

> The extra parens around "bms" are surely unnecessary.  (If they were
> necessary, it could only be because nodeToString was a macro that
> failed to parenthesize its argument where needed.  But that would
> be a bug to fix in nodeToString, not here.)
> 
> +#define TEXT_TO_BITMAPSET(str) (Bitmapset *) stringToNode(text_to_cstring(str))
> 
> Here, on the other hand, I think an extra outer set of parens
> would be a good idea, ie
> 
> +#define TEXT_TO_BITMAPSET(str) ((Bitmapset *) stringToNode(text_to_cstring(str)))

Sounds reasonable to me.

> It may be that the C cast binds tightly enough that this cannot
> be a problem, but I don't think assuming that is project style.
> 
>             regards, tom lane

patch attached, best.

-greg

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Tue, Sep 23, 2025 at 02:26:19PM -0400, Greg Burd wrote:
> patch attached, best.

All that seems to work corrently here, so done.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 23 2025, at 7:23 pm, Michael Paquier <michael@paquier.xyz> wrote:

> On Tue, Sep 23, 2025 at 02:26:19PM -0400, Greg Burd wrote:
>> patch attached, best.
> 
> All that seems to work corrently here, so done.
> --
> Michael

Thanks Michael, Tom, for the help getting this into shape and in the tree.

best.

-greg



Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Wed, Sep 24, 2025 at 07:39:59AM -0400, Greg Burd wrote:
> Thanks Michael, Tom, for the help getting this into shape and in the tree.

By the way, Greg, do you think that we should aim for a state where we
are closer to completion?  We have the module up to the point where we
are in pretty good shape, with most things and the infrastructure done
but it can be improved a bit more, as well.

Based on the information provided by the coverage report at
https://coverage.postgresql.org/src/backend/nodes/bitmapset.c.gcov.html,
we still have the following things:
- bms_equal for different word counts
- bms_union, bms_nonempty_difference, bms_is_subset and bms_intersect
with shorter word counts.
- bms_different with different word counts
- A couple more cases with bms_subset_compare
- bms_member_index and word counts
- bms_overlap_list with negative number in input list.
- bms_singleton_member ERROR with empty input.
- bms_get_singleton_member with NULL input
- bms_del_member with word counts
- bms_replace_members and repalloc case
- bms_add_range, bms_join and bms_del_members, more word count cases
- bms_prev_member and the prevbit business

There is not much we can do with the random function, still we could
do something about the NULL paths in the internal functions:
https://coverage.postgresql.org/src/test/modules/test_bitmapset/test_bitmapset.c.gcov.html

The coverage of the latter matters less than the coverage of the
former, of course.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 24 2025, at 7:28 pm, Michael Paquier <michael@paquier.xyz> wrote:

> On Wed, Sep 24, 2025 at 07:39:59AM -0400, Greg Burd wrote:
>> Thanks Michael, Tom, for the help getting this into shape and in the tree.
> 
> By the way, Greg, do you think that we should aim for a state where we
> are closer to completion?

Hi Michael,

We've come this far, why not go all the way? :)

> We have the module up to the point where we
> are in pretty good shape, with most things and the infrastructure done
> but it can be improved a bit more, as well.

Agreed.

> Based on the information provided by the coverage report at
> https://coverage.postgresql.org/src/backend/nodes/bitmapset.c.gcov.html,
> we still have the following things:
> - bms_equal for different word counts
> - bms_union, bms_nonempty_difference, bms_is_subset and bms_intersect
> with shorter word counts.
> - bms_different with different word counts
> - A couple more cases with bms_subset_compare
> - bms_member_index and word counts
> - bms_overlap_list with negative number in input list.
> - bms_singleton_member ERROR with empty input.
> - bms_get_singleton_member with NULL input
> - bms_del_member with word counts
> - bms_replace_members and repalloc case
> - bms_add_range, bms_join and bms_del_members, more word count cases
> - bms_prev_member and the prevbit business

Thanks for the detailed analysis, I tried to address each of these in
the patch.

> There is not much we can do with the random function, still we could
> do something about the NULL paths in the internal functions:
> https://coverage.postgresql.org/src/test/modules/test_bitmapset/test_bitmapset.c.gcov.html

Agreed.

> The coverage of the latter matters less than the coverage of the
> former, of course.

Agreed.

> --
> Michael


test_bitmapset.c
lines......: 98.9%
functions..: 100%
branches...: 82.6%

Here I removed a bit of code I thought was unnecessary and added a few
test cases.  The coverage is better overall, not much to say about it.


bitmapset.c
lines......: 99.8%
functions..: 100%
branches...: 89.4%

A few branches were really tricky, my (least) favorite was the
bms_subset_compare().  To get to the first BMS_DIFFERENT return you have
to have a test case where the shorter bitmap is long enough to force two
or more iterations of the loop and in the first iteration the first word
in a needs to be a subset of b and then for the second word the opposite
where the second word of b is a subset of a and yet there can be no
overlapping bits in the two sets and the same number of words in both sets.

test_bms_subset_compare('(b 1 64 65)', '(b 1 2 64)') -> BMS_DIFFERENT

Technically the '(b 64 200)', '(b 1 201)' test case is also
BMS_DIFFERENT and so redundant, but I was so happy to find the test case
for this branch I kept them both.

best.

-greg

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Sat, Sep 27, 2025 at 10:03:14AM -0400, Greg Burd wrote:
>
> On Sep 24 2025, at 7:28 pm, Michael Paquier <michael@paquier.xyz> wrote:
>
> > On Wed, Sep 24, 2025 at 07:39:59AM -0400, Greg Burd wrote:
> >> Thanks Michael, Tom, for the help getting this into shape and in the tree.
> >
> > By the way, Greg, do you think that we should aim for a state where we
> > are closer to completion?
>
> Hi Michael,
>
> We've come this far, why not go all the way? :)
>
> > We have the module up to the point where we
> > are in pretty good shape, with most things and the infrastructure done
> > but it can be improved a bit more, as well.
>
> Agreed.
>
> > Based on the information provided by the coverage report at
> > https://coverage.postgresql.org/src/backend/nodes/bitmapset.c.gcov.html,
> > we still have the following things:
> > - bms_equal for different word counts
> > - bms_union, bms_nonempty_difference, bms_is_subset and bms_intersect
> > with shorter word counts.
> > - bms_different with different word counts
> > - A couple more cases with bms_subset_compare
> > - bms_member_index and word counts
> > - bms_overlap_list with negative number in input list.
> > - bms_singleton_member ERROR with empty input.
> > - bms_get_singleton_member with NULL input
> > - bms_del_member with word counts
> > - bms_replace_members and repalloc case
> > - bms_add_range, bms_join and bms_del_members, more word count cases
> > - bms_prev_member and the prevbit business
>
> Thanks for the detailed analysis, I tried to address each of these in
> the patch.
>
> > There is not much we can do with the random function, still we could
> > do something about the NULL paths in the internal functions:
> > https://coverage.postgresql.org/src/test/modules/test_bitmapset/test_bitmapset.c.gcov.html
>
> Agreed.
>
> > The coverage of the latter matters less than the coverage of the
> > former, of course.
>
> Agreed.
> test_bitmapset.c
> lines......: 98.9%
> functions..: 100%
> branches...: 82.6%
>
> Here I removed a bit of code I thought was unnecessary and added a few
> test cases.  The coverage is better overall, not much to say about it.
>
>
> bitmapset.c
> lines......: 99.8%
> functions..: 100%
> branches...: 89.4%

No need for three tests for the word count case in bms_del_member(),
only one is enough, but I've let them as they you have proposed, as
you wanted to check more positioning with the members.  These are
cheap, that's fine.

Added a comment for bms_del_members() for the two cases that trigger
word count changes.

The test changed with test_bms_difference() is indeed long, but not
that bad with 140-ish characters, so left it as-is.

Good catch with bms_del_members() that was missing.  There are so many
APIs that it's easy to miss one.

The changes in test_bms_get_singleton_member() don't seem adapted,
because we also want to test if the default value given by the
function caller is changed, bms_get_singleton_member() returning a
boolean status.  The patch did not use arg1 at all.  Also, if arg0 is
NULL, let's return the input value.

> A few branches were really tricky, my (least) favorite was the
> bms_subset_compare().  To get to the first BMS_DIFFERENT return you have
> to have a test case where the shorter bitmap is long enough to force two
> or more iterations of the loop and in the first iteration the first word
> in a needs to be a subset of b and then for the second word the opposite
> where the second word of b is a subset of a and yet there can be no
> overlapping bits in the two sets and the same number of words in both sets.
>
> test_bms_subset_compare('(b 1 64 65)', '(b 1 2 64)') -> BMS_DIFFERENT
>
> Technically the '(b 64 200)', '(b 1 201)' test case is also
> BMS_DIFFERENT and so redundant, but I was so happy to find the test case
> for this branch I kept them both.

I can see that you have expanded on these quite a bit, impressive.
One thing that was confusing is that some of the tests already
existed, so I've slightly reworded things to reduce the diffs.  Per my
measurements, we are at more than 99% in the module and bitmapset.c
when only running make check in the module, which is nice
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 29 2025, at 2:20 am, Michael Paquier <michael@paquier.xyz> wrote:

> On Sat, Sep 27, 2025 at 10:03:14AM -0400, Greg Burd wrote:
>> 
>> On Sep 24 2025, at 7:28 pm, Michael Paquier <michael@paquier.xyz> wrote:
>> 
>> > On Wed, Sep 24, 2025 at 07:39:59AM -0400, Greg Burd wrote:
>> >> Thanks Michael, Tom, for the help getting this into shape and in
>> the tree.
>> > 
>> > By the way, Greg, do you think that we should aim for a state where we
>> > are closer to completion?
>> 
>> Hi Michael,
>> 
>> We've come this far, why not go all the way? :)
>> 
>> > We have the module up to the point where we
>> > are in pretty good shape, with most things and the infrastructure done
>> > but it can be improved a bit more, as well.
>> 
>> Agreed.
>> 
>> > Based on the information provided by the coverage report at
>> > https://coverage.postgresql.org/src/backend/nodes/bitmapset.c.gcov.html,
>> > we still have the following things:
>> > - bms_equal for different word counts
>> > - bms_union, bms_nonempty_difference, bms_is_subset and bms_intersect
>> > with shorter word counts.
>> > - bms_different with different word counts
>> > - A couple more cases with bms_subset_compare
>> > - bms_member_index and word counts
>> > - bms_overlap_list with negative number in input list.
>> > - bms_singleton_member ERROR with empty input.
>> > - bms_get_singleton_member with NULL input
>> > - bms_del_member with word counts
>> > - bms_replace_members and repalloc case
>> > - bms_add_range, bms_join and bms_del_members, more word count cases
>> > - bms_prev_member and the prevbit business
>> 
>> Thanks for the detailed analysis, I tried to address each of these in
>> the patch.
>> 
>> > There is not much we can do with the random function, still we could
>> > do something about the NULL paths in the internal functions:
>> > https://coverage.postgresql.org/src/test/modules/test_bitmapset/test_bitmapset.c.gcov.html
>> 
>> Agreed.
>> 
>> > The coverage of the latter matters less than the coverage of the
>> > former, of course.
>> 
>> Agreed.
>> test_bitmapset.c
>> lines......: 98.9%
>> functions..: 100%
>> branches...: 82.6%
>> 
>> Here I removed a bit of code I thought was unnecessary and added a few
>> test cases.  The coverage is better overall, not much to say about it.
>> 
>> 
>> bitmapset.c
>> lines......: 99.8%
>> functions..: 100%
>> branches...: 89.4%

Hello Michael,

> No need for three tests for the word count case in bms_del_member(),
> only one is enough, but I've let them as they you have proposed, as
> you wanted to check more positioning with the members.  These are
> cheap, that's fine.

Yes, redundant.  Thanks for leaving them.

> Added a comment for bms_del_members() for the two cases that trigger
> word count changes.

Thanks, I considered adding comments to cases that were more subtle but
didn't find the time to review/add them.  I appreciate that you added that.

> The test changed with test_bms_difference() is indeed long, but not
> that bad with 140-ish characters, so left it as-is.

Thanks.

> Good catch with bms_del_members() that was missing.  There are so many
> APIs that it's easy to miss one.

Indeed, but it's covered now.

> The changes in test_bms_get_singleton_member() don't seem adapted,
> because we also want to test if the default value given by the
> function caller is changed, bms_get_singleton_member() returning a
> boolean status.  The patch did not use arg1 at all.  Also, if arg0 is
> NULL, let's return the input value.

Doh, I see what you mean.  Thanks for the adaptation.

>> A few branches were really tricky, my (least) favorite was the
>> bms_subset_compare().  To get to the first BMS_DIFFERENT return you have
>> to have a test case where the shorter bitmap is long enough to force two
>> or more iterations of the loop and in the first iteration the first word
>> in a needs to be a subset of b and then for the second word the opposite
>> where the second word of b is a subset of a and yet there can be no
>> overlapping bits in the two sets and the same number of words in both sets.
>> 
>> test_bms_subset_compare('(b 1 64 65)', '(b 1 2 64)') -> BMS_DIFFERENT
>> 
>> Technically the '(b 64 200)', '(b 1 201)' test case is also
>> BMS_DIFFERENT and so redundant, but I was so happy to find the test case
>> for this branch I kept them both.
> 
> I can see that you have expanded on these quite a bit, impressive.
> One thing that was confusing is that some of the tests already
> existed, so I've slightly reworded things to reduce the diffs.

Excellent, thanks.

> Per my
> measurements, we are at more than 99% in the module and bitmapset.c
> when only running make check in the module, which is nice

Yes, I think that does it.  Thanks so much for the guidance and help
taking this to its logical end.  I greatly appreciate your guidance and time.

> --
> Michael


best.

-greg



Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Mon, Sep 29, 2025 at 06:27:01AM -0400, Greg Burd wrote:
> Yes, I think that does it.  Thanks so much for the guidance and help
> taking this to its logical end.  I greatly appreciate your guidance and time.

The latest coverage report confirms 100% for bitmapset.c

The only two code paths I'm seeing as not covered are:
- test_bms_add_range() for a bms_free() case.
- test_bms_make_singleton() on NULL input, because the function is
created as strict:
=# select proisstrict from pg_proc
     where proname = 'test_bms_make_singleton';
 proisstrict
-------------
 t
(1 row)

I'm not planning to bother about these, so we are good here.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
David Rowley
Дата:
On Tue, 30 Sept 2025 at 12:40, Michael Paquier <michael@paquier.xyz> wrote:
> The only two code paths I'm seeing as not covered are:
> - test_bms_add_range() for a bms_free() case.

One question about that:  In cases like [1],  what's the reason that
many of the bms_free() calls check if the set is NULL before calling
the function?

NULL is a valid Bitmapset, so I don't really see the need to check for
an empty set before calling bms_free().  If those were removed, then
you'd not have to care about the coverage of that line.

David

[1] https://coverage.postgresql.org/src/test/modules/test_bitmapset/test_bitmapset.c.gcov.html#699



Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Tue, Sep 30, 2025 at 01:19:05PM +1300, David Rowley wrote:
> NULL is a valid Bitmapset, so I don't really see the need to check for
> an empty set before calling bms_free().  If those were removed, then
> you'd not have to care about the coverage of that line.

Yeah, we could just remove them as you are suggesting, as well.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 29 2025, at 8:51 pm, Michael Paquier <michael@paquier.xyz> wrote:

> On Tue, Sep 30, 2025 at 01:19:05PM +1300, David Rowley wrote:
>> NULL is a valid Bitmapset, so I don't really see the need to check for
>> an empty set before calling bms_free().  If those were removed, then
>> you'd not have to care about the coverage of that line.

Good point, I agree.  Thanks for reporting/reviewing David.

> Yeah, we could just remove them as you are suggesting, as well.
> --
> Michael

Thank you both, patch attached.

best.

-greg

Вложения

Re: [PATCH] Add tests for Bitmapset

От
David Rowley
Дата:
On Tue, 30 Sept 2025 at 22:30, Greg Burd <greg@burd.me> wrote:
> Thank you both, patch attached.

Patch looks good to me.

Just while reviewing, I was confused at the following code in
test_bms_add_range().

    /* Check for invalid range */
    if (upper < lower)
    {
        bms_free(bms);
        PG_RETURN_NULL();
    }

That seems wrong. You should just return the input set in that case, not NULL.

This results in:

postgres=# select test_bms_add_range('(b 1)', 2, 5); -- ok.
 test_bms_add_range
--------------------
 (b 1 2 3 4 5)
(1 row)

postgres=# select test_bms_add_range('(b 1)', 5, 2); -- wrong results
 test_bms_add_range
--------------------

(1 row)

I'd expect the last one to return '(b 1)', effectively the input set unmodified.

If I remove the code quoted above, I also see something else unexpected:

postgres=# select test_bms_add_range('(b)', 5, 2);
 test_bms_add_range
--------------------
 <>
(1 row)

Now, you might blame me for that as I see you've done things like:

    if (bms_is_empty(bms))
       PG_RETURN_NULL();

in other places, but it's not very consistent as I can get the <> in
other locations:

postgres=# select test_bms_add_members('(b)', '(b)');
 test_bms_add_members
----------------------
 <>
(1 row)

It seems to me that you could save some code and all this
inconsistency by just making <> the standard way to represent an empty
Bitmapset. Or if you really want to keep the SQL NULLs, you could make
a helper macro that handles that consistently.

For me, I think stripping as much logic out of the test functions as
possible is a good way of doing things. I expect you really want to
witness the behaviour of bitmapset.c, not some anomaly of
test_bitmapset.c.

David



Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Sep 30 2025, at 6:29 am, David Rowley <dgrowleyml@gmail.com> wrote:

> On Tue, 30 Sept 2025 at 22:30, Greg Burd <greg@burd.me> wrote:
>> Thank you both, patch attached.
> 
> Patch looks good to me.
> 
> Just while reviewing, I was confused at the following code in
> test_bms_add_range().
> 
>    /* Check for invalid range */
>    if (upper < lower)
>    {
>        bms_free(bms);
>        PG_RETURN_NULL();
>    }
> 
> That seems wrong. You should just return the input set in that case,
> not NULL.

Agreed, I should have caught this inconsistency earlier on.

> This results in:
> 
> postgres=# select test_bms_add_range('(b 1)', 2, 5); -- ok.
> test_bms_add_range
> --------------------
> (b 1 2 3 4 5)
> (1 row)
> 
> postgres=# select test_bms_add_range('(b 1)', 5, 2); -- wrong results
> test_bms_add_range
> --------------------
> 
> (1 row)
> 
> I'd expect the last one to return '(b 1)', effectively the input set unmodified.
> 
> If I remove the code quoted above, I also see something else unexpected:
> 
> postgres=# select test_bms_add_range('(b)', 5, 2);
> test_bms_add_range
> --------------------
> <>
> (1 row)
> 
> Now, you might blame me for that as I see you've done things like:
> 
>    if (bms_is_empty(bms))
>       PG_RETURN_NULL();
> 
> in other places, but it's not very consistent as I can get the <> in
> other locations:
> 
> postgres=# select test_bms_add_members('(b)', '(b)');
> test_bms_add_members
> ----------------------
> <>
> (1 row)
> 
> It seems to me that you could save some code and all this
> inconsistency by just making <> the standard way to represent an empty
> Bitmapset. Or if you really want to keep the SQL NULLs, you could make
> a helper macro that handles that consistently.

nodeToString() will turn NULL into '<>' because it has no way to know
that we're referring to a NULL/empty Bitmapset, but that's not what we'd
ideally like to return in those cases because NULL Bitmapsets are in
fact a thing.  The standard/consistent representation for a NULL
Bitmapset encoded by nodeToString() is '(b)' so I'll update to be more
consistent across tests.

> For me, I think stripping as much logic out of the test functions as
> possible is a good way of doing things. I expect you really want to
> witness the behaviour of bitmapset.c, not some anomaly of
> test_bitmapset.c.

You are 100% correct, I've stripped away as much as possible so as to
push the responsibility for results down to the bitmapset.c code as it
should be.
 
> David

v2 attached.

best.

-greg

Вложения

Re: [PATCH] Add tests for Bitmapset

От
David Rowley
Дата:
On Wed, 1 Oct 2025 at 01:38, Greg Burd <greg@burd.me> wrote:
> v2 attached.

Thanks.

I looked at this and I think we can do a bit better to simplify the
code and standardise what we do when faced with bogus inputs.

Here's what I've done:

1. Added helper macros PG_ARG_GETBITMAPSET() and
PG_RETURN_BITMAPSET_AS_TEXT(). These themselves save a few hundred
lines of code and make the functions easier to read.
2. Standardised that the test function returns NULL when it receives
invalid input. e.g if we aim to pass an "int" to the underlying
Bitmapset function, if someone calls the test function with an SQL
NULL, just return NULL rather than trying to come up with some logic
on what we should do about that.
3. I couldn't quite understand the extra parameter to
test_bms_get_singleton_member() so I ended up removing it and making
the test function return -1 when the set isn't a singleton. I didn't
search for discussion about that so I might have missed the specific
reason it was done that way.
4. Wrote some comments to try to communicate the standards defined.
This aims to offer guidance to callers of these functions and anyone
hacking on the module itself in the future.

git diff --stat looks like:

 4 files changed, 233 insertions(+), 505 deletions(-)

Things not done:

1. I didn't update the expected test results.
2. I didn't rationalise the tests. I feel that there's no point in the
tests that purposefully pass invalid input to the module's test
function. It appears that's aiming to increase coverage on the test
module's C code, which I don't quite understand. In any case, due to
the helper macros, I don't think removing these will lower the
per-source-file line coverage of the module itself.

I've not checked my work very thoroughly. I can do that, or you can if
you and Michael are ok with the proposed ideas. I'm also not aiming to
commandeer anything here. Writing the patch was just the easiest way
to communicate the ideas I had.

I also wonder how worthwhile it is to try to free the Bitmapsets. I'd
have expected these allocations to be in a pretty short-lived memory
context. I count 44 calls to bms_free().

David

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Wed, Oct 01, 2025 at 12:00:39PM +1300, David Rowley wrote:
> 1. Added helper macros PG_ARG_GETBITMAPSET() and
> PG_RETURN_BITMAPSET_AS_TEXT(). These themselves save a few hundred
> lines of code and make the functions easier to read.

+#define PG_RETURN_BITMAPSET_AS_TEXT(bms) \
+   do { \
+       text       *result = BITMAPSET_TO_TEXT(bms); \
+       bms_free(bms); \
+       PG_RETURN_TEXT_P(result); \
+   } while (0);

This is also something I've considered as embedding in a macro, and
discarded it.  The reason was code readability, because it gives to
the reader a quick knowledge of what can be freed once a call to a
specific bitmapset.c routine is done, acting as a kind of reference
template.  Here, the free call knowledge gets hidden.  test_bms_copy()
becomes more confusing to me.  This argument comes down to one's taste
and philosophy on the matter of test modules.  These are for me tools
of coverage, but also can work as reference templates when calling an
internal routine because we also document some of its implied
behaviors, like the inputs recycled.

> 2. Standardised that the test function returns NULL when it receives
> invalid input. e.g if we aim to pass an "int" to the underlying
> Bitmapset function, if someone calls the test function with an SQL
> NULL, just return NULL rather than trying to come up with some logic
> on what we should do about that.

+static const char *emptyset = "(b)";

Not sure that I'm on board with hardcoding this knowledge in the test
module.  But well, if you feel strongly about it, I'm not going to put
a fight for it.

> 3. I couldn't quite understand the extra parameter to
> test_bms_get_singleton_member() so I ended up removing it and making
> the test function return -1 when the set isn't a singleton. I didn't
> search for discussion about that so I might have missed the specific
> reason it was done that way.

It is not very far.  In short, I am caring about the status returned
by bms_get_singleton_member(), then translated in the output of the
SQL function based on the default given in input:
https://www.postgresql.org/message-id/aNolInqSzt1d1338@paquier.xyz

Perhaps not the most elegant solution, but simpler than returning a
setof without changing the meaning of the tests.  So we could return
the status directly or use the default value from the input, not doing
any of these actions reduces what's reported back at SQL level.

> 4. Wrote some comments to try to communicate the standards defined.
> This aims to offer guidance to callers of these functions and anyone
> hacking on the module itself in the future.

+/* Helper macro to fetch Text parameters as Bitmapsets. SQL-NULL means empty set */
+#define PG_ARG_GETBITMAPSET(n) \
+   (PG_ARGISNULL(n) ? NULL : TEXT_TO_BITMAPSET(PG_GETARG_TEXT_PP(n)))

True that embedding the PG_ARGISNULL() call may be better in a macro,
as passing NULL values to the internal routines is OK.

> I also wonder how worthwhile it is to try to free the Bitmapsets. I'd
> have expected these allocations to be in a pretty short-lived memory
> context. I count 44 calls to bms_free().

I've wondered about that as well.  However, I have concluded to leave
the free calls arounds, for the memory context argument if someone
plays more with this module.  I don't mind the removal of the NULL
checks when calling bms_free() to cut lines in the module, though.  I
was also seeing an argument with cross-checking the frees after a
REALLOCATE_BITMAPSETS copy, as well, for sanity checks.  So I would
rather leave them as they are, not remove them.  Again, that may come
up to one's view on the matter and one's philosophy around these
modules.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
David Rowley
Дата:
On Wed, 1 Oct 2025 at 12:37, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Oct 01, 2025 at 12:00:39PM +1300, David Rowley wrote:
> > 1. Added helper macros PG_ARG_GETBITMAPSET() and
> > PG_RETURN_BITMAPSET_AS_TEXT(). These themselves save a few hundred
> > lines of code and make the functions easier to read.
>
> +#define PG_RETURN_BITMAPSET_AS_TEXT(bms) \
> +   do { \
> +       text       *result = BITMAPSET_TO_TEXT(bms); \
> +       bms_free(bms); \
> +       PG_RETURN_TEXT_P(result); \
> +   } while (0);
>
> This is also something I've considered as embedding in a macro, and
> discarded it.  The reason was code readability, because it gives to
> the reader a quick knowledge of what can be freed once a call to a
> specific bitmapset.c routine is done, acting as a kind of reference
> template.  Here, the free call knowledge gets hidden.  test_bms_copy()
> becomes more confusing to me.  This argument comes down to one's taste
> and philosophy on the matter of test modules.  These are for me tools
> of coverage, but also can work as reference templates when calling an
> internal routine because we also document some of its implied
> behaviors, like the inputs recycled.

This sounds like another argument towards getting rid of the bms_frees
from the module.

> > 2. Standardised that the test function returns NULL when it receives
> > invalid input. e.g if we aim to pass an "int" to the underlying
> > Bitmapset function, if someone calls the test function with an SQL
> > NULL, just return NULL rather than trying to come up with some logic
> > on what we should do about that.
>
> +static const char *emptyset = "(b)";
>
> Not sure that I'm on board with hardcoding this knowledge in the test
> module.  But well, if you feel strongly about it, I'm not going to put
> a fight for it.

I'm not a huge fan of that part either. I didn't feel strongly enough
to change it when adjusting Greg's v2. For me, I don't see the problem
with leaving it up to nodeToString(), which will return "<>" for an
empty set.

> > 3. I couldn't quite understand the extra parameter to
> > test_bms_get_singleton_member() so I ended up removing it and making
> > the test function return -1 when the set isn't a singleton. I didn't
> > search for discussion about that so I might have missed the specific
> > reason it was done that way.
>
> It is not very far.  In short, I am caring about the status returned
> by bms_get_singleton_member(), then translated in the output of the
> SQL function based on the default given in input:
> https://www.postgresql.org/message-id/aNolInqSzt1d1338@paquier.xyz
>
> Perhaps not the most elegant solution, but simpler than returning a
> setof without changing the meaning of the tests.  So we could return
> the status directly or use the default value from the input, not doing
> any of these actions reduces what's reported back at SQL level.

Thanks for explaining. Is anything being lost with the -1 return?
Since that's not a valid member for a set, -1 can only be returned if
bms_get_singleton_member() returned false.

> > 4. Wrote some comments to try to communicate the standards defined.
> > This aims to offer guidance to callers of these functions and anyone
> > hacking on the module itself in the future.
>
> +/* Helper macro to fetch Text parameters as Bitmapsets. SQL-NULL means empty set */
> +#define PG_ARG_GETBITMAPSET(n) \
> +   (PG_ARGISNULL(n) ? NULL : TEXT_TO_BITMAPSET(PG_GETARG_TEXT_PP(n)))
>
> True that embedding the PG_ARGISNULL() call may be better in a macro,
> as passing NULL values to the internal routines is OK.

OK good. There's a decent code reduction from this one.

> > I also wonder how worthwhile it is to try to free the Bitmapsets. I'd
> > have expected these allocations to be in a pretty short-lived memory
> > context. I count 44 calls to bms_free().
>
> I've wondered about that as well.  However, I have concluded to leave
> the free calls arounds, for the memory context argument if someone
> plays more with this module.  I don't mind the removal of the NULL
> checks when calling bms_free() to cut lines in the module, though.  I
> was also seeing an argument with cross-checking the frees after a
> REALLOCATE_BITMAPSETS copy, as well, for sanity checks.  So I would
> rather leave them as they are, not remove them.  Again, that may come
> up to one's view on the matter and one's philosophy around these
> modules.

Because you're not happy with embedding the bms_free() in with
PG_RETURN_BITMAPSET_AS_TEXT(), why don't we just get rid of all the
bms_free's?  It's just a test module. If anyone ever comes complaining
about memory usage then we can reconsider. I think it's nice to keep
these wrappers as thin as possible in terms of the amount of C code.
Having conditional logic in them, IMO, risk hiding the true behaviour
of the underlying code that's being tested.

I know I've just come in here like a bull in a china shop. It's not my
intention to imply that anything is wrong with the code. I just put
together my thoughts in the hopes that it can be made even better.

David



Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Wed, Oct 01, 2025 at 01:07:10PM +1300, David Rowley wrote:
> On Wed, 1 Oct 2025 at 12:37, Michael Paquier <michael@paquier.xyz> wrote:
>> This is also something I've considered as embedding in a macro, and
>> discarded it.  The reason was code readability, because it gives to
>> the reader a quick knowledge of what can be freed once a call to a
>> specific bitmapset.c routine is done, acting as a kind of reference
>> template.  Here, the free call knowledge gets hidden.  test_bms_copy()
>> becomes more confusing to me.  This argument comes down to one's taste
>> and philosophy on the matter of test modules.  These are for me tools
>> of coverage, but also can work as reference templates when calling an
>> internal routine because we also document some of its implied
>> behaviors, like the inputs recycled.
>
> This sounds like another argument towards getting rid of the bms_frees
> from the module.

With the RETURN macro in mind, which itself cuts quite a little bit of
code, I'd rather do as you are suggesting and remove the free calls,
then.  Embedding this knowledge  in the return macro seems not much
adapted to me, but I don't object to this macro idea per-se.

> I'm not a huge fan of that part either. I didn't feel strongly enough
> to change it when adjusting Greg's v2. For me, I don't see the problem
> with leaving it up to nodeToString(), which will return "<>" for an
> empty set.

I think it's fine to leave it to nodeToString().  It is able to handle
a NULL input on its own.

> Thanks for explaining. Is anything being lost with the -1 return?
> Since that's not a valid member for a set, -1 can only be returned if
> bms_get_singleton_member() returned false.

+   if (!bms_get_singleton_member(bms, &member))
+       member = -1;

Hmm.  On second look, if you do that, there's no knowledge lost, I
guess.

> Because you're not happy with embedding the bms_free() in with
> PG_RETURN_BITMAPSET_AS_TEXT(), why don't we just get rid of all the
> bms_free's?  It's just a test module. If anyone ever comes complaining
> about memory usage then we can reconsider. I think it's nice to keep
> these wrappers as thin as possible in terms of the amount of C code.
> Having conditional logic in them, IMO, risk hiding the true behaviour
> of the underlying code that's being tested.

Okay.  Let's just remove as much code as possible, then.  This is
roughly what you have sent, minus the hardcoded empty set and the free
calls.  I'll just go do that in a bit.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
David Rowley
Дата:
On Wed, 1 Oct 2025 at 16:41, Michael Paquier <michael@paquier.xyz> wrote:
> Okay.  Let's just remove as much code as possible, then.  This is
> roughly what you have sent, minus the hardcoded empty set and the free
> calls.  I'll just go do that in a bit.

Thanks.  Just for reference, you may not need it if you're mostly done
already, but I had to finish this off in my head before moving on.
Feel free to ignore, but there are a couple of typos fixed at the very
least "varrying" should be "varying".

I didn't fully rationalise the tests in test_bitmapset.sql.  There are
still a few which are duplicates due to '(b)' and NULL meaning the
same thing.

git diff --stat
 .../test_bitmapset/expected/test_bitmapset.out     | 400 ++------------
 .../modules/test_bitmapset/sql/test_bitmapset.sql  | 101 +---
 .../modules/test_bitmapset/test_bitmapset--1.0.sql |   2 +-
 src/test/modules/test_bitmapset/test_bitmapset.c   | 605 +++++----------------
 4 files changed, 222 insertions(+), 886 deletions(-)

I didn't look at the coverage report, but on thinking about it,
test_bitmapset.c's "PG_RETURN_NULL(); /* invalid input */" won't be
covered as I stripped out all the tests which purposefully pass
invalid input. You may not want it that way if you're keen to keep the
coverage of the test module high as well as bitmapset.c

David



Re: [PATCH] Add tests for Bitmapset

От
David Rowley
Дата:
On Wed, 1 Oct 2025 at 18:00, David Rowley <dgrowleyml@gmail.com> wrote:
> git diff --stat
>  .../test_bitmapset/expected/test_bitmapset.out     | 400 ++------------
>  .../modules/test_bitmapset/sql/test_bitmapset.sql  | 101 +---
>  .../modules/test_bitmapset/test_bitmapset--1.0.sql |   2 +-
>  src/test/modules/test_bitmapset/test_bitmapset.c   | 605 +++++----------------
>  4 files changed, 222 insertions(+), 886 deletions(-)

And once more, with the patch this time around...

David

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Wed, Oct 01, 2025 at 06:00:59PM +1300, David Rowley wrote:
> I didn't look at the coverage report, but on thinking about it,
> test_bitmapset.c's "PG_RETURN_NULL(); /* invalid input */" won't be
> covered as I stripped out all the tests which purposefully pass
> invalid input. You may not want it that way if you're keen to keep the
> coverage of the test module high as well as bitmapset.c

Yes, that's something I was just going through, with a result pretty
close to what you have sent, including the removal and replacement of
most of tests that were specific to the test module.  The prev/next
member coverage is still looking OK from here, though.

Thanks for the typos, I've missed these.
--
Michael

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Oct 1 2025, at 1:14 am, Michael Paquier <michael@paquier.xyz> wrote:

> On Wed, Oct 01, 2025 at 06:00:59PM +1300, David Rowley wrote:
>> I didn't look at the coverage report, but on thinking about it,
>> test_bitmapset.c's "PG_RETURN_NULL(); /* invalid input */" won't be
>> covered as I stripped out all the tests which purposefully pass
>> invalid input. You may not want it that way if you're keen to keep the
>> coverage of the test module high as well as bitmapset.c
> 
> Yes, that's something I was just going through, with a result pretty
> close to what you have sent, including the removal and replacement of
> most of tests that were specific to the test module.  The prev/next
> member coverage is still looking OK from here, though.
> 
> Thanks for the typos, I've missed these.
> --
> Michael

Michael, David,

Thank you. :)

The coverage[1] looks to be 100% for everything in bitmapset.c now and
98.6% of the test code[2] for Bitmapset -- more than enough and much
better than before.  I do like the cleanup/simplification that you two
jointly applied. I think it's easier to maintain and reason about and
looking at the final result I can see the logic in the simplified
approach.  I've learned a lot from this exercise not only about
Bitmapset and various other areas in the code but also the
approach/model for a task like this.  I'll keep that in mind going forward.

best.

-greg


[1] https://coverage.postgresql.org/src/backend/nodes/bitmapset.c.gcov.html
[2] https://coverage.postgresql.org/src/test/modules/test_bitmapset/test_bitmapset.c.gcov.html




Re: [PATCH] Add tests for Bitmapset

От
Daniel Gustafsson
Дата:
> On 1 Oct 2025, at 07:14, Michael Paquier <michael@paquier.xyz> wrote:

> Thanks for the typos, I've missed these.

Doing post-commit review I didn't see any sharp edges, but found one more of
these so will push this shortly:

--- Substraction to empty
+-- Subtraction to empty

--
Daniel Gustafsson




Re: [PATCH] Add tests for Bitmapset

От
Greg Burd
Дата:
On Oct 2 2025, at 6:00 am, Daniel Gustafsson <daniel@yesql.se> wrote:

>> On 1 Oct 2025, at 07:14, Michael Paquier <michael@paquier.xyz> wrote:
> 
>> Thanks for the typos, I've missed these.
> 
> Doing post-commit review I didn't see any sharp edges, but found one
> more of
> these so will push this shortly:
> 
> --- Substraction to empty
> +-- Subtraction to empty

+1, thanks.  I'll be more careful in the future.

> --
> Daniel Gustafsson

best.

-greg



Re: [PATCH] Add tests for Bitmapset

От
David Rowley
Дата:
On Thu, 2 Oct 2025 at 23:00, Daniel Gustafsson <daniel@yesql.se> wrote:
> Doing post-commit review I didn't see any sharp edges, but found one more of
> these so will push this shortly:

Any chance you could also delete the "/* memory cleanup seems more
tricky than it's worth here */" line? That comment seemed relevant
when we were actually doing bms_free() in the functions. Now that
we're not, it's a bit out of place.

David



Re: [PATCH] Add tests for Bitmapset

От
Daniel Gustafsson
Дата:
> On 2 Oct 2025, at 14:11, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Thu, 2 Oct 2025 at 23:00, Daniel Gustafsson <daniel@yesql.se> wrote:
>> Doing post-commit review I didn't see any sharp edges, but found one more of
>> these so will push this shortly:
>
> Any chance you could also delete the "/* memory cleanup seems more
> tricky than it's worth here */" line? That comment seemed relevant
> when we were actually doing bms_free() in the functions. Now that
> we're not, it's a bit out of place.

Sure I can take care of that while in there.

Another nitpick would be to remove the test for NULL in test_bms_make_singleton
since that is a STRICT function, making the test for NULL superfluous code:

diff --git a/src/test/modules/test_bitmapset/test_bitmapset.c b/src/test/modules/test_bitmapset/test_bitmapset.c
index 0d6c2e7aa1b..acaa93d2f11 100644
--- a/src/test/modules/test_bitmapset/test_bitmapset.c
+++ b/src/test/modules/test_bitmapset/test_bitmapset.c
@@ -201,9 +201,6 @@ test_bms_make_singleton(PG_FUNCTION_ARGS)
        Bitmapset  *bms;
        int32           member;

-       if (PG_ARGISNULL(0))
-               PG_RETURN_NULL();               /* invalid input */
-
        member = PG_GETARG_INT32(0);
        bms = bms_make_singleton(member);

I'll include all of these three tiny improvements in a commit later today.

--
Daniel Gustafsson




Re: [PATCH] Add tests for Bitmapset

От
David Rowley
Дата:
On Fri, 3 Oct 2025 at 01:33, Daniel Gustafsson <daniel@yesql.se> wrote:
> Another nitpick would be to remove the test for NULL in test_bms_make_singleton
> since that is a STRICT function, making the test for NULL superfluous code:

I didn't notice that was marked as strict. It would be good to get rid
of those lines in that case.

Thanks

David



Re: [PATCH] Add tests for Bitmapset

От
Daniel Gustafsson
Дата:
> On 2 Oct 2025, at 23:00, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Fri, 3 Oct 2025 at 01:33, Daniel Gustafsson <daniel@yesql.se> wrote:
>> Another nitpick would be to remove the test for NULL in test_bms_make_singleton
>> since that is a STRICT function, making the test for NULL superfluous code:
>
> I didn't notice that was marked as strict. It would be good to get rid
> of those lines in that case.

Done, comment cleanup and nullness check pushed.

--
Daniel Gustafsson




Re: [PATCH] Add tests for Bitmapset

От
David Rowley
Дата:
On Fri, 3 Oct 2025 at 01:33, Daniel Gustafsson <daniel@yesql.se> wrote:
> Another nitpick would be to remove the test for NULL in test_bms_make_singleton
> since that is a STRICT function, making the test for NULL superfluous code:

I see test_random_operations() is also strict. Is it worth getting rid
of the SQL NULL checks on the inputs there too? Aka, the attached.

David

Вложения

Re: [PATCH] Add tests for Bitmapset

От
Michael Paquier
Дата:
On Fri, Oct 03, 2025 at 12:36:57PM +1300, David Rowley wrote:
> I see test_random_operations() is also strict. Is it worth getting rid
> of the SQL NULL checks on the inputs there too? Aka, the attached.

Yes, this was not bothering me much.  Anyway, agreed about removing
these as well.  I cannot find an argument in favor of making the
function callable on NULL input with default values that are hardcoded
internally in the function.

Thanks, David and Daniel.
--
Michael

Вложения