Am 21.07.22 um 10:41 schrieb Dean Rasheed:
>
> A couple of quick comments on the current patch:
Thank you for your feedback!
> It's important to mark these new functions as VOLATILE, not IMMUTABLE,
> otherwise they won't work as expected in queries. See
> https://www.postgresql.org/docs/current/xfunc-volatility.html
CREATE FUNCTION marks functions as VOLATILE by default if not explicitly
marked otherwise. I assumed function definitions in pg_proc.dat have the
same behavior. I will fix that.
https://www.postgresql.org/docs/current/sql-createfunction.html
> It would be better to use pg_prng_uint64_range() rather than rand() to
> pick elements. Partly, that's because it uses a higher quality PRNG,
> with a larger internal state, and it ensures that the results are
> unbiased across the range. But more importantly, it interoperates with
> setseed(), allowing predictable sequences of "random" numbers to be
> generated -- something that's useful in writing repeatable regression
> tests.
I agree that we should use pg_prng_uint64_range(). However, in order to
achieve interoperability with setseed() we would have to use
drandom_seed (rather than pg_global_prng_state) as rng state, which is
declared statically in float.c and exclusively used by random(). Do we
want to expose drandom_seed to other functions?
> Assuming these new functions are made to interoperate with setseed(),
> which I think they should be, then they also need to be marked as
> PARALLEL RESTRICTED, rather than PARALLEL SAFE. See
> https://www.postgresql.org/docs/current/parallel-safety.html, which
> explains why setseed() and random() are parallel restricted.
As mentioned above, i assumed the default here is PARALLEL UNSAFE. I'll
fix that.
> In my experience, the requirement for sampling with replacement is
> about as common as the requirement for sampling without replacement,
> so it seems odd to provide one but not the other. Of course, we can
> always add a with-replacement function later, and give it a different
> name. But maybe array_sample() could be used for both, with a
> "with_replacement" boolean parameter?
We can also add a "with_replacement" boolean parameter which is false by
default to array_sample() later. I do not have a strong opinion about
that and will implement it, if desired. Any opinions about
default/no-default?
Martin