Обсуждение: pg_config wrongly marked as not parallel safe?

Поиск
Список
Период
Сортировка

pg_config wrongly marked as not parallel safe?

От
Andres Freund
Дата:
Hi,

Triggered by the thread at [1] I looked for functions marked as
immutable but not parallel safe.

postgres[19492][1]=# SELECT oid::regprocedure, provolatile, proparallel FROM pg_proc WHERE provolatile = 'i' AND
proparallel!= 's';
 
┌─────────────┬─────────────┬─────────────┐
│     oid     │ provolatile │ proparallel │
├─────────────┼─────────────┼─────────────┤
│ pg_config() │ i           │ r           │
└─────────────┴─────────────┴─────────────┘
(1 row)

# pg_config
{ oid => '3400', descr => 'pg_config binary as a function',
  proname => 'pg_config', prorows => '23', proretset => 't', proparallel => 'r',
  prorettype => 'record', proargtypes => '', proallargtypes => '{text,text}',
  proargmodes => '{o,o}', proargnames => '{name,setting}',
  prosrc => 'pg_config' },

so that function is marked as immutable but not parallel safe, without
an explanation for that odd combination.

Now obviously I don't think it practially matters for pg_config(), but
it seems unnecessarily confusing as a general matter.

I think we probably should fix this specific case, and then add a check
to opr_sanity.sql or such.

As far as I can tell pg_config() was marked as such since its addition
in [2]. Joe, I assume this wasn't intentional?

Greetings,

Andres Freund

[1] https://www.postgresql.org/message-id/CA+6d-n4dOakgLu2gsTfk9uD2CC9ueNCg+z_mnXA2-=Qaod1Wuw@mail.gmail.com
[2] http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a5c43b886942e96ec5c745041f2d6a50c3205147


Re: pg_config wrongly marked as not parallel safe?

От
Joe Conway
Дата:
On 11/26/18 6:45 PM, Andres Freund wrote:
> Hi,
> 
> Triggered by the thread at [1] I looked for functions marked as
> immutable but not parallel safe.
> 
> postgres[19492][1]=# SELECT oid::regprocedure, provolatile, proparallel FROM pg_proc WHERE provolatile = 'i' AND
proparallel!= 's';
 
> ┌─────────────┬─────────────┬─────────────┐
> │     oid     │ provolatile │ proparallel │
> ├─────────────┼─────────────┼─────────────┤
> │ pg_config() │ i           │ r           │
> └─────────────┴─────────────┴─────────────┘
> (1 row)
> 
> # pg_config
> { oid => '3400', descr => 'pg_config binary as a function',
>   proname => 'pg_config', prorows => '23', proretset => 't', proparallel => 'r',
>   prorettype => 'record', proargtypes => '', proallargtypes => '{text,text}',
>   proargmodes => '{o,o}', proargnames => '{name,setting}',
>   prosrc => 'pg_config' },
> 
> so that function is marked as immutable but not parallel safe, without
> an explanation for that odd combination.
> 
> Now obviously I don't think it practially matters for pg_config(), but
> it seems unnecessarily confusing as a general matter.
> 
> I think we probably should fix this specific case, and then add a check
> to opr_sanity.sql or such.
> 
> As far as I can tell pg_config() was marked as such since its addition
> in [2]. Joe, I assume this wasn't intentional?

Not intentional. Though, sitting here chatting with Stephen about it, I
am now wondering if pg_config() should actually be marked immutable:

select * from pg_config() where name = 'VERSION';
  name   |     setting
---------+-----------------
 VERSION | PostgreSQL 10.5
(1 row)

[...upgrade the postgres binaries...]

select * from pg_config() where name = 'VERSION';
  name   |     setting
---------+-----------------
 VERSION | PostgreSQL 10.6
(1 row)

So the correct answer is probably to mark pg_config() stable, but it
still seems to be parallel safe to me.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Re: pg_config wrongly marked as not parallel safe?

От
Andres Freund
Дата:
On 2018-11-26 19:04:46 -0500, Joe Conway wrote:
> On 11/26/18 6:45 PM, Andres Freund wrote:
> > Hi,
> > 
> > Triggered by the thread at [1] I looked for functions marked as
> > immutable but not parallel safe.
> > 
> > postgres[19492][1]=# SELECT oid::regprocedure, provolatile, proparallel FROM pg_proc WHERE provolatile = 'i' AND
proparallel!= 's';
 
> > ┌─────────────┬─────────────┬─────────────┐
> > │     oid     │ provolatile │ proparallel │
> > ├─────────────┼─────────────┼─────────────┤
> > │ pg_config() │ i           │ r           │
> > └─────────────┴─────────────┴─────────────┘
> > (1 row)
> > 
> > # pg_config
> > { oid => '3400', descr => 'pg_config binary as a function',
> >   proname => 'pg_config', prorows => '23', proretset => 't', proparallel => 'r',
> >   prorettype => 'record', proargtypes => '', proallargtypes => '{text,text}',
> >   proargmodes => '{o,o}', proargnames => '{name,setting}',
> >   prosrc => 'pg_config' },
> > 
> > so that function is marked as immutable but not parallel safe, without
> > an explanation for that odd combination.
> > 
> > Now obviously I don't think it practially matters for pg_config(), but
> > it seems unnecessarily confusing as a general matter.
> > 
> > I think we probably should fix this specific case, and then add a check
> > to opr_sanity.sql or such.
> > 
> > As far as I can tell pg_config() was marked as such since its addition
> > in [2]. Joe, I assume this wasn't intentional?
> 
> Not intentional. Though, sitting here chatting with Stephen about it, I
> am now wondering if pg_config() should actually be marked immutable:
> 
> select * from pg_config() where name = 'VERSION';
>   name   |     setting
> ---------+-----------------
>  VERSION | PostgreSQL 10.5
> (1 row)
> 
> [...upgrade the postgres binaries...]
> 
> select * from pg_config() where name = 'VERSION';
>   name   |     setting
> ---------+-----------------
>  VERSION | PostgreSQL 10.6
> (1 row)
> 
> So the correct answer is probably to mark pg_config() stable, but it
> still seems to be parallel safe to me.

I don't think we should consider immutability to mean anything across
major versions. What'd be helped by doing that? We'd have to rule out
any behaviour change to any immutable function for that to make
sense. Including making an immutable function not immutable anymore.

- Andres


Re: pg_config wrongly marked as not parallel safe?

От
Joe Conway
Дата:
On 11/26/18 7:08 PM, Andres Freund wrote:
> On 2018-11-26 19:04:46 -0500, Joe Conway wrote:
>> Not intentional. Though, sitting here chatting with Stephen about it, I
>> am now wondering if pg_config() should actually be marked immutable:
>>
>> select * from pg_config() where name = 'VERSION';
>>   name   |     setting
>> ---------+-----------------
>>  VERSION | PostgreSQL 10.5
>> (1 row)
>>
>> [...upgrade the postgres binaries...]
>>
>> select * from pg_config() where name = 'VERSION';
>>   name   |     setting
>> ---------+-----------------
>>  VERSION | PostgreSQL 10.6
>> (1 row)
>>
>> So the correct answer is probably to mark pg_config() stable, but it
>> still seems to be parallel safe to me.
> 
> I don't think we should consider immutability to mean anything across
> major versions. What'd be helped by doing that? We'd have to rule out
> any behaviour change to any immutable function for that to make
> sense. Including making an immutable function not immutable anymore.

Umm, this is a minor version not major.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Re: pg_config wrongly marked as not parallel safe?

От
Stephen Frost
Дата:
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2018-11-26 19:04:46 -0500, Joe Conway wrote:
> > So the correct answer is probably to mark pg_config() stable, but it
> > still seems to be parallel safe to me.
>
> I don't think we should consider immutability to mean anything across
> major versions. What'd be helped by doing that? We'd have to rule out
> any behaviour change to any immutable function for that to make
> sense. Including making an immutable function not immutable anymore.

Then we have to require that all indexes built with immutable functions
be rebuilt when someone does a pg_upgrade from one major version to the
next.

Not to mention that the issue at hand isn't a major version upgrade
anyway, it's a minor version change...

Thanks!

Stephen

Вложения

Re: pg_config wrongly marked as not parallel safe?

От
Andres Freund
Дата:
On 2018-11-26 19:14:24 -0500, Joe Conway wrote:
> On 11/26/18 7:08 PM, Andres Freund wrote:
> > On 2018-11-26 19:04:46 -0500, Joe Conway wrote:
> >> Not intentional. Though, sitting here chatting with Stephen about it, I
> >> am now wondering if pg_config() should actually be marked immutable:
> >>
> >> select * from pg_config() where name = 'VERSION';
> >>   name   |     setting
> >> ---------+-----------------
> >>  VERSION | PostgreSQL 10.5
> >> (1 row)
> >>
> >> [...upgrade the postgres binaries...]
> >>
> >> select * from pg_config() where name = 'VERSION';
> >>   name   |     setting
> >> ---------+-----------------
> >>  VERSION | PostgreSQL 10.6
> >> (1 row)
> >>
> >> So the correct answer is probably to mark pg_config() stable, but it
> >> still seems to be parallel safe to me.
> > 
> > I don't think we should consider immutability to mean anything across
> > major versions. What'd be helped by doing that? We'd have to rule out
> > any behaviour change to any immutable function for that to make
> > sense. Including making an immutable function not immutable anymore.
> 
> Umm, this is a minor version not major.

Oops.

Greetings,

Andres Freund


Re: pg_config wrongly marked as not parallel safe?

От
Andres Freund
Дата:
Hi,

On 2018-11-26 19:16:00 -0500, Stephen Frost wrote:
> Greetings,
> 
> * Andres Freund (andres@anarazel.de) wrote:
> > On 2018-11-26 19:04:46 -0500, Joe Conway wrote:
> > > So the correct answer is probably to mark pg_config() stable, but it
> > > still seems to be parallel safe to me.
> > 
> > I don't think we should consider immutability to mean anything across
> > major versions. What'd be helped by doing that? We'd have to rule out
> > any behaviour change to any immutable function for that to make
> > sense. Including making an immutable function not immutable anymore.
> 
> Then we have to require that all indexes built with immutable functions
> be rebuilt when someone does a pg_upgrade from one major version to the
> next.

Happened a couple times. The harm from overaggressively removing
immutability, and thus not even being able to add such indexes and
obviously worse query plans, seems much bigger than avoiding the need to
rebuild indexes in some cases.


> Not to mention that the issue at hand isn't a major version upgrade
> anyway, it's a minor version change...

Yea, that obviously makes this different.  Apparently the old versioning
scheme is stuck in my head on a deep enough level...

Greetings,

Andres Freund


Re: pg_config wrongly marked as not parallel safe?

От
Stephen Frost
Дата:
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2018-11-26 19:16:00 -0500, Stephen Frost wrote:
> > * Andres Freund (andres@anarazel.de) wrote:
> > > On 2018-11-26 19:04:46 -0500, Joe Conway wrote:
> > > > So the correct answer is probably to mark pg_config() stable, but it
> > > > still seems to be parallel safe to me.
> > >
> > > I don't think we should consider immutability to mean anything across
> > > major versions. What'd be helped by doing that? We'd have to rule out
> > > any behaviour change to any immutable function for that to make
> > > sense. Including making an immutable function not immutable anymore.
> >
> > Then we have to require that all indexes built with immutable functions
> > be rebuilt when someone does a pg_upgrade from one major version to the
> > next.
>
> Happened a couple times. The harm from overaggressively removing
> immutability, and thus not even being able to add such indexes and
> obviously worse query plans, seems much bigger than avoiding the need to
> rebuild indexes in some cases.

These two things seem entirely independent in my view, so I'm really not
sure that I'm following what you're getting at.

We have required specific kinds of indexes to be rebuilt across major
version upgrades, yes, but those have almost uniformly been cases where
the index was very rarely used or only used in some corner cases.

I certainly do not believe we should invalidate all indexes built with
immutable functions across major versions upgrades.  Perhaps in some
cases when we've decided that an immutable function today should work in
some different way in a new version then we should complain about or
invalidate those indexes, but that's a special case.  Taken to an
extreme, you could argue that partial indexes should also be rebuilt on
major version upgrades, in case we decide that 1 is suddently less than
2, but instead of baking that assumption in, let's just deal with those
cases when they come up and write logic into pg_upgrade to detect such
usages and make the user aware of them.

If you feel that immutability is being overaggressively removed from
something, that's fair, but should also be addressed on a case-by-case
basis.  In this particular case, I don't see any concern that we're
going to break user's indexes by changing pg_config() to be stable
instead of immutable, and I don't really think we even need to write
logic into pg_upgrade for it- unless it could be generalized to detect
such cases between versions, in which case I'd be all for that in case
we have other such changes.  Not sure it's really worth it though, we
don't have that happen all that often.

Thanks!

Stpehen

Вложения

Re: pg_config wrongly marked as not parallel safe?

От
Andres Freund
Дата:
On 2018-11-26 19:34:03 -0500, Stephen Frost wrote:
> These two things seem entirely independent in my view, so I'm really not
> sure that I'm following what you're getting at.

All I said is that I don't think it's a reasonable policy to mark all
functions that potentially could change across major versions as
immutable.  I've no problem with changing pg_config in particular,
especially as it - as has been pointed - clearly can change in minor
versions / recompiles.


Re: pg_config wrongly marked as not parallel safe?

От
Stephen Frost
Дата:
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2018-11-26 19:34:03 -0500, Stephen Frost wrote:
> > These two things seem entirely independent in my view, so I'm really not
> > sure that I'm following what you're getting at.
>
> All I said is that I don't think it's a reasonable policy to mark all
> functions that potentially could change across major versions as
> immutable.  I've no problem with changing pg_config in particular,
> especially as it - as has been pointed - clearly can change in minor
> versions / recompiles.

I'm...  not following.

If a function's results can change across minor or major versions, we
shouldn't be marking it as immutable because, by definition, it's not
immutable.

We, today, have a baked in assumption that any function marked as
immutable will remain immutable across all major versions that we allow
indexes to be retained through, which is all of them since about 8.3 at
this point.

We absolutely need a policy that if you decide to change the results of
some immutable function across a major version change, you need to
consider the results on indexes and possibly write into pg_upgrade
checks to try and detect any usage of that immutable function.  I hope
we're in agreement there.

In other words, maybe it isn't a sealed-in-concrete policy that you
can't go around changing what an immutable function returns, but you
certainly better have a really good justification for it, and write
all of the code to detect any cases where that could cause incorrect
results from the database, including pg_upgrade throwing a big error if
it finds any indexes (or maybe even functions...) using it, and even
then I'm pretty darn skeptical about accepting it.

Thanks!

Stephen

Вложения

Re: pg_config wrongly marked as not parallel safe?

От
Laurenz Albe
Дата:
On Mon, 2018-11-26 at 19:51 -0500, Stephen Frost wrote:
> If a function's results can change across minor or major versions, we
> shouldn't be marking it as immutable because, by definition, it's not
> immutable.
> 
> We, today, have a baked in assumption that any function marked as
> immutable will remain immutable across all major versions that we allow
> indexes to be retained through, which is all of them since about 8.3 at
> this point.
> 
> We absolutely need a policy that if you decide to change the results of
> some immutable function across a major version change, you need to
> consider the results on indexes and possibly write into pg_upgrade
> checks to try and detect any usage of that immutable function.  I hope
> we're in agreement there.

It's hard to make a guarantee that a function will never change in the
future.  What if we fix some rounding or overflow problem in a floating
point function?

If we went that far, hardly any function could be IMMUTABLE.

I think it is best to use IMMUTABLE whenever we don't expect it to
change and it is a function useful for indexes, and if it happens to
change nonetheless, write into the release notes that certain indexes
have to be rebuilt after upgrade.

Of course, there is no problem to mark pg_config as stable, because
there is little chance it will be used in an index anyway.

Yours,
Laurenz Albe



Re: pg_config wrongly marked as not parallel safe?

От
Kyotaro HORIGUCHI
Дата:
At Tue, 27 Nov 2018 06:33:42 +0100, Laurenz Albe <laurenz.albe@cybertec.at> wrote in
<8dcb0bb248e9aaef0f1ef0faa27ab583bfa5bb84.camel@cybertec.at>
> On Mon, 2018-11-26 at 19:51 -0500, Stephen Frost wrote:
> > If a function's results can change across minor or major versions, we
> > shouldn't be marking it as immutable because, by definition, it's not
> > immutable.
> > 
> > We, today, have a baked in assumption that any function marked as
> > immutable will remain immutable across all major versions that we allow
> > indexes to be retained through, which is all of them since about 8.3 at
> > this point.
> > 
> > We absolutely need a policy that if you decide to change the results of
> > some immutable function across a major version change, you need to
> > consider the results on indexes and possibly write into pg_upgrade
> > checks to try and detect any usage of that immutable function.  I hope
> > we're in agreement there.
> 
> It's hard to make a guarantee that a function will never change in the
> future.  What if we fix some rounding or overflow problem in a floating
> point function?

+1.

Actually, some geometric comparisons are performed counting
tolerance margin, the validity of which is in doubt.  Their
behavior has been changed in recent major version and still has a
room for improvement, and the functions are parallel-safe and
immutable. Immutablity is mentiond mainly in the light of
optimization in the documentation.

https://www.postgresql.org/docs/current/xfunc-volatility.html

> If we went that far, hardly any function could be IMMUTABLE.
> 
> I think it is best to use IMMUTABLE whenever we don't expect it to
> change and it is a function useful for indexes, and if it happens to
> change nonetheless, write into the release notes that certain indexes
> have to be rebuilt after upgrade.
> Of course, there is no problem to mark pg_config as stable, because
> there is little chance it will be used in an index anyway.

Agreed. We have the following description in the documentation.

| A common error is to label a function IMMUTABLE when its results
| depend on a configuration parameter. For example, a function that
| manipulates timestamps might well have results that depend on the
| TimeZone setting. For safety, such functions should be labeled
| STABLE instead.

Still this isn't mentioning upgrades. We could add to this that
something like that immutablity is not guaranteed beyond
PostgreSQL versions, the results of immutable functions might
change for certain reasons including bug fixes, blah, blah..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: pg_config wrongly marked as not parallel safe?

От
Stephen Frost
Дата:
Greetings,

* Kyotaro HORIGUCHI (horiguchi.kyotaro@lab.ntt.co.jp) wrote:
> At Tue, 27 Nov 2018 06:33:42 +0100, Laurenz Albe <laurenz.albe@cybertec.at> wrote in
<8dcb0bb248e9aaef0f1ef0faa27ab583bfa5bb84.camel@cybertec.at>
> > On Mon, 2018-11-26 at 19:51 -0500, Stephen Frost wrote:
> > > If a function's results can change across minor or major versions, we
> > > shouldn't be marking it as immutable because, by definition, it's not
> > > immutable.
> > >
> > > We, today, have a baked in assumption that any function marked as
> > > immutable will remain immutable across all major versions that we allow
> > > indexes to be retained through, which is all of them since about 8.3 at
> > > this point.
> > >
> > > We absolutely need a policy that if you decide to change the results of
> > > some immutable function across a major version change, you need to
> > > consider the results on indexes and possibly write into pg_upgrade
> > > checks to try and detect any usage of that immutable function.  I hope
> > > we're in agreement there.
> >
> > It's hard to make a guarantee that a function will never change in the
> > future.  What if we fix some rounding or overflow problem in a floating
> > point function?
>
> +1.
>
> Actually, some geometric comparisons are performed counting
> tolerance margin, the validity of which is in doubt.  Their
> behavior has been changed in recent major version and still has a
> room for improvement, and the functions are parallel-safe and
> immutable. Immutablity is mentiond mainly in the light of
> optimization in the documentation.

I really don't buy off on these arguments in the least.  I also didn't
say that a function wasn't allowed to change- but that the output of an
immutable function, for a given input, shouldn't change and in the very
rare case where we absolutely had to make a change, it had better be for
a very good reason and we need to consider the impact on user indexes.

I don't agree that we can simply declare that all functional and partial
indexes have to be rebuilt between every major version upgrade, which is
the alternative.  There really isn't another option- either we're
extremely careful and take immutability of functions seriously and make
sure to preserve behavior, and therefore indexes, across major versions,
or we don't and we require indexes to be rebuilt.

When it comes to the documentation, perhaps we should improve it to make
it clear that if you change the results from an immutable function then
you need to rebuild any indexes which use it.

> > If we went that far, hardly any function could be IMMUTABLE.

I don't agree.

> > I think it is best to use IMMUTABLE whenever we don't expect it to
> > change and it is a function useful for indexes, and if it happens to
> > change nonetheless, write into the release notes that certain indexes
> > have to be rebuilt after upgrade.

I don't agree with the notion of "if it happens to change" as that
sounds pretty cavalier, and only noting something in the release notes
is a good way to cause a lot of pain for users.  If it's possible to do
better, which it often is (see the list of things pg_upgrade already
checks...) then we should do so.  If there's reasons we can't, fine, it
happens, but that needs to be taken into account when we're considering
making such a change and its impact on users.

> > Of course, there is no problem to mark pg_config as stable, because
> > there is little chance it will be used in an index anyway.
>
> Agreed. We have the following description in the documentation.

I'm pretty sure we all agree that pg_config should be marked as stable
and that changing it from immutable to stable won't be an issue.

> | A common error is to label a function IMMUTABLE when its results
> | depend on a configuration parameter. For example, a function that
> | manipulates timestamps might well have results that depend on the
> | TimeZone setting. For safety, such functions should be labeled
> | STABLE instead.
>
> Still this isn't mentioning upgrades. We could add to this that
> something like that immutablity is not guaranteed beyond
> PostgreSQL versions, the results of immutable functions might
> change for certain reasons including bug fixes, blah, blah..

Which means we're telling every user out there that they have to rebuild
every functional or partial index they have across every major version.

I don't agree with that and I don't think we really have some pressing
need to be that free with making changes to immutable functions.

I do think it'd be good to make a mention when talking about immutable
functions that they're allowed to be used in indexes and, as such, users
need to be careful to make sure that they don't change the results from
an immutable function unless they're sure it's not used in an index or
they're prepared to rebuild all indexes where it is used.

Thanks!

Stephen

Вложения

Re: pg_config wrongly marked as not parallel safe?

От
Kyotaro HORIGUCHI
Дата:
At Wed, 28 Nov 2018 21:17:17 -0500, Stephen Frost <sfrost@snowman.net> wrote in
<20181129021717.GT3415@tamriel.snowman.net>
> Greetings,
> 
> * Kyotaro HORIGUCHI (horiguchi.kyotaro@lab.ntt.co.jp) wrote:
> > At Tue, 27 Nov 2018 06:33:42 +0100, Laurenz Albe <laurenz.albe@cybertec.at> wrote in
<8dcb0bb248e9aaef0f1ef0faa27ab583bfa5bb84.camel@cybertec.at>
> > > On Mon, 2018-11-26 at 19:51 -0500, Stephen Frost wrote:
> > > > If a function's results can change across minor or major versions, we
> > > > shouldn't be marking it as immutable because, by definition, it's not
> > > > immutable.
> > > > 
> > > > We, today, have a baked in assumption that any function marked as
> > > > immutable will remain immutable across all major versions that we allow
> > > > indexes to be retained through, which is all of them since about 8.3 at
> > > > this point.
> > > > 
> > > > We absolutely need a policy that if you decide to change the results of
> > > > some immutable function across a major version change, you need to
> > > > consider the results on indexes and possibly write into pg_upgrade
> > > > checks to try and detect any usage of that immutable function.  I hope
> > > > we're in agreement there.
> > > 
> > > It's hard to make a guarantee that a function will never change in the
> > > future.  What if we fix some rounding or overflow problem in a floating
> > > point function?
> > 
> > +1.
> > 
> > Actually, some geometric comparisons are performed counting
> > tolerance margin, the validity of which is in doubt.  Their
> > behavior has been changed in recent major version and still has a
> > room for improvement, and the functions are parallel-safe and
> > immutable. Immutablity is mentiond mainly in the light of
> > optimization in the documentation.
> 
> I really don't buy off on these arguments in the least.  I also didn't
> say that a function wasn't allowed to change- but that the output of an
> immutable function, for a given input, shouldn't change and in the very
> rare case where we absolutely had to make a change, it had better be for
> a very good reason and we need to consider the impact on user indexes.
> 
> I don't agree that we can simply declare that all functional and partial
> indexes have to be rebuilt between every major version upgrade, which is
> the alternative.  There really isn't another option- either we're
> extremely careful and take immutability of functions seriously and make
> sure to preserve behavior, and therefore indexes, across major versions,
> or we don't and we require indexes to be rebuilt.

Even though index is the major place where the results of
immutable funcions are stored without uesrs' clear consciousness,
things are not restricted to indexes. Just as a thouht experiment
with an extreme case, we can store the results of immutable
functions into tables even whether it makes sense or not. We can
dump'n-load onto PostgreSQL running on another platform. I'm not
sure the new database can work as the same with the original one.

I can't express my feeling well, but this is a problem of how far
we guarantee the immutability after all. We guarantee
planning-execution duration immutability, yes. We guarantee using
plan caches, yes. Beyond restart, yes. Beyond minor-versions,
maybe. Beyond major-versions, presumably?  Beyond platform,
improbably?

For example, avg(), sum(), stddev() are immutable but I'm not
sure they are absolutely and in-exact-meaning immutable in all of
the senses above. (However, I don't think the error makes
problematic differences..)

# Yes, as I said above this is a very extreme case.

We might need to split the immutable into two kinds of volatility
level. Immutable as it can be used for indexes, and immutable-2
that immutable at least on running server. I'm not sure there are
any function or how many of the funcitons are categorized to the
latter, or we can sort them out. And I thing that it would be
inconvenient if avg and sum were moved out of the former
category..

> When it comes to the documentation, perhaps we should improve it to make
> it clear that if you change the results from an immutable function then
> you need to rebuild any indexes which use it.
> 
> > > If we went that far, hardly any function could be IMMUTABLE.
> 
> I don't agree.
> 
> > > I think it is best to use IMMUTABLE whenever we don't expect it to
> > > change and it is a function useful for indexes, and if it happens to
> > > change nonetheless, write into the release notes that certain indexes
> > > have to be rebuilt after upgrade.
> 
> I don't agree with the notion of "if it happens to change" as that
> sounds pretty cavalier, and only noting something in the release notes
> is a good way to cause a lot of pain for users.  If it's possible to do
> better, which it often is (see the list of things pg_upgrade already
> checks...) then we should do so.  If there's reasons we can't, fine, it
> happens, but that needs to be taken into account when we're considering
> making such a change and its impact on users.
> 
> > > Of course, there is no problem to mark pg_config as stable, because
> > > there is little chance it will be used in an index anyway.
> > 
> > Agreed. We have the following description in the documentation.
> 
> I'm pretty sure we all agree that pg_config should be marked as stable
> and that changing it from immutable to stable won't be an issue.
> 
> > | A common error is to label a function IMMUTABLE when its results
> > | depend on a configuration parameter. For example, a function that
> > | manipulates timestamps might well have results that depend on the
> > | TimeZone setting. For safety, such functions should be labeled
> > | STABLE instead.
> > 
> > Still this isn't mentioning upgrades. We could add to this that
> > something like that immutablity is not guaranteed beyond
> > PostgreSQL versions, the results of immutable functions might
> > change for certain reasons including bug fixes, blah, blah..
> 
> Which means we're telling every user out there that they have to rebuild
> every functional or partial index they have across every major version.
> 
> I don't agree with that and I don't think we really have some pressing
> need to be that free with making changes to immutable functions.
> 
> I do think it'd be good to make a mention when talking about immutable
> functions that they're allowed to be used in indexes and, as such, users
> need to be careful to make sure that they don't change the results from
> an immutable function unless they're sure it's not used in an index or
> they're prepared to rebuild all indexes where it is used.

I suppose that not a minor part of users aren't conscious of
immutability so clearly..

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: pg_config wrongly marked as not parallel safe?

От
Laurenz Albe
Дата:
On Wed, 2018-11-28 at 21:17 -0500, Stephen Frost wrote:
> I don't agree that we can simply declare that all functional and partial
> indexes have to be rebuilt between every major version upgrade, which is
> the alternative.  There really isn't another option- either we're
> extremely careful and take immutability of functions seriously and make
> sure to preserve behavior, and therefore indexes, across major versions,
> or we don't and we require indexes to be rebuilt.

I absolutely agree with that.

I guess that when I said that we should declare functions IMMUTABLE
even if they might change owing to bug fixes, I didn't make it clear that
each such occurrence would necessitate a fat warning in the release notes
that indexes using them have to be rebuilt.

Yours,
Laurenz Albe



Re: pg_config wrongly marked as not parallel safe?

От
Tom Lane
Дата:
Stephen Frost <sfrost@snowman.net> writes:
> We, today, have a baked in assumption that any function marked as
> immutable will remain immutable across all major versions that we allow
> indexes to be retained through, which is all of them since about 8.3 at
> this point.

That's the ideal situation, yes.

> I don't agree that we can simply declare that all functional and partial
> indexes have to be rebuilt between every major version upgrade, which is
> the alternative.

I agree that that's quite impractical, but ...

> There really isn't another option- either we're
> extremely careful and take immutability of functions seriously and make
> sure to preserve behavior, and therefore indexes, across major versions,
> or we don't and we require indexes to be rebuilt.

... I find that position to be uselessly simplistic.  There's a huge range
of gray area here, and all the interesting cases fall in the gray area.
For a couple of quick examples:

* What about bugs?  Should I not have fixed cbdb8b4c0 (wrong answer for
ftoi4(INT_MAX)) on the grounds that doing so might break somebody's index?

* A lot of relevant behavior isn't as much under our control as one
might wish.  For example, recompiling with a new compiler, or different
optimization options, might affect the behavior of roundoff-sensitive
floating point calculations.  It's not practical to forbid people from
doing that, nor to promise that it won't change results.

At the same time, restricting the "immutable" label to things that
seem very unlikely to be affected by any such considerations isn't
a useful position for us to take.  Functional indexes are just too
useful, and the cases where they actually fail are not that frequent.

Certainly, if we make a change that we know is likely to require people to
reindex affected indexes, that should be documented in the release notes.
But I think it's pointless to imagine that we can achieve perfection in
identifying trouble cases, or even to spend significantly more resources
than we do now on trying.  I've not seen many field trouble reports that
seem to trace back to such issues.

But to bring this back to the thread title: pg_config is *intended*
to change results across versions, or even just rebuilds.  There's
no definition of "immutable" under which it should qualify.  At the
same time, there's no obvious use-case for indexing its result, so
that not marking it as immutable doesn't seem to have much downside.

A contrary case that's worth remarking on is to_tsvector and related
functions, which are marked immutable so that they can be indexed,
even though we know darn well that their results depend on changeable
text search configuration parameters.  I remember complaining about
this when the feature was first put in.  Oleg and Teodor responded
that it wouldn't cause big problems, and they were right (so far as
I've heard about).  That's partly because a change in to_tsvector's
behavior might cause text searches to fail to match, but it won't
result in index corruption or anything like that.  So it's important
to consider context and potential bad effects when deciding what we
should allow here.

Maybe the tl;dr version of that is that the immutable/stable/volatile
division is too simplistic and we need to refine it.  But I don't
know what a better design would look like.  Also, I suspect that real
users are already hard put to it to label their functions correctly
in the three-way regime.  Making it more complicated might make things
actively worse, if it increases the odds of functions being mislabeled.

            regards, tom lane


Re: pg_config wrongly marked as not parallel safe?

От
Stephen Frost
Дата:
Greetings,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Certainly, if we make a change that we know is likely to require people to
> reindex affected indexes, that should be documented in the release notes.
> But I think it's pointless to imagine that we can achieve perfection in
> identifying trouble cases, or even to spend significantly more resources
> than we do now on trying.  I've not seen many field trouble reports that
> seem to trace back to such issues.

Just to be clear- I think the way we've been operating is striking a
good balance and that's because people do think about these things and
consider what can happen in indexes when we make changes to immutable
functions.  This thread feels like there's people arguing we should
reduce our level of effort there to a point where we don't care about
on-disk representation in indexes across major versions when immutable
functions are involved and I strongly disagree with that.

> Maybe the tl;dr version of that is that the immutable/stable/volatile
> division is too simplistic and we need to refine it.  But I don't
> know what a better design would look like.  Also, I suspect that real
> users are already hard put to it to label their functions correctly
> in the three-way regime.  Making it more complicated might make things
> actively worse, if it increases the odds of functions being mislabeled.

I haven't got any great ideas about what a better design would look like
either.  Just brainstorming, but perhaps having a flag that basically
says "this can be used in indexes" might be useful as a way to filter
out any immutable functions that shouldn't be used in indexes?  I'm not
sure that we've actually got any of those and I wouldn't want to
arbitrairly limit what users can do without there being a good reason
either.

Maybe such a flag would make people (users and hackers alike) think a
bit more about making changes and if they need to rebuild indexes or
such?  Might also just get in the way though.

Thanks!

Stephen

Вложения

Re: pg_config wrongly marked as not parallel safe?

От
Robert Haas
Дата:
On Wed, Nov 28, 2018 at 9:17 PM Stephen Frost <sfrost@snowman.net> wrote:
> > Actually, some geometric comparisons are performed counting
> > tolerance margin, the validity of which is in doubt.  Their
> > behavior has been changed in recent major version and still has a
> > room for improvement, and the functions are parallel-safe and
> > immutable. Immutablity is mentiond mainly in the light of
> > optimization in the documentation.
>
> I really don't buy off on these arguments in the least.  I also didn't
> say that a function wasn't allowed to change- but that the output of an
> immutable function, for a given input, shouldn't change and in the very
> rare case where we absolutely had to make a change, it had better be for
> a very good reason and we need to consider the impact on user indexes.

I think you're getting a little carried away here. Kyotaro-san's
argument seems extremely strong to me, so much so that I can't really
see how you can argue against it. If we have a bug that says 1 + 1 =
3, we are going to fix it.  We're not going say, oh, well, it's an
immutable function, so we're just going to carry on having it return
the wrong answer. That would be ridiculous.

It feels to me like you are trying to turn this into some kind of
principled stand against evil people who don't care about
immutability, but rumors that the barbarians are at the gates seem
greatly exaggerated from where I sit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pg_config wrongly marked as not parallel safe?

От
Stephen Frost
Дата:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Wed, Nov 28, 2018 at 9:17 PM Stephen Frost <sfrost@snowman.net> wrote:
> > > Actually, some geometric comparisons are performed counting
> > > tolerance margin, the validity of which is in doubt.  Their
> > > behavior has been changed in recent major version and still has a
> > > room for improvement, and the functions are parallel-safe and
> > > immutable. Immutablity is mentiond mainly in the light of
> > > optimization in the documentation.
> >
> > I really don't buy off on these arguments in the least.  I also didn't
> > say that a function wasn't allowed to change- but that the output of an
> > immutable function, for a given input, shouldn't change and in the very
> > rare case where we absolutely had to make a change, it had better be for
> > a very good reason and we need to consider the impact on user indexes.
>
> I think you're getting a little carried away here. Kyotaro-san's
> argument seems extremely strong to me, so much so that I can't really
> see how you can argue against it. If we have a bug that says 1 + 1 =
> 3, we are going to fix it.  We're not going say, oh, well, it's an
> immutable function, so we're just going to carry on having it return
> the wrong answer. That would be ridiculous.
>
> It feels to me like you are trying to turn this into some kind of
> principled stand against evil people who don't care about
> immutability, but rumors that the barbarians are at the gates seem
> greatly exaggerated from where I sit.

I'd suggest you read through the rest of the thread and see my response
to Tom.

Thanks!

Stephen

Вложения

Re: pg_config wrongly marked as not parallel safe?

От
Robert Haas
Дата:
On Thu, Nov 29, 2018 at 3:50 PM Stephen Frost <sfrost@snowman.net> wrote:
> I'd suggest you read through the rest of the thread and see my response
> to Tom.

I already did that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pg_config wrongly marked as not parallel safe?

От
Stephen Frost
Дата:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, Nov 29, 2018 at 3:50 PM Stephen Frost <sfrost@snowman.net> wrote:
> > I'd suggest you read through the rest of the thread and see my response
> > to Tom.
>
> I already did that.

Great, then it's unclear, to me at least, what you're getting at in your
response here.

Just to be clear as to which I was referring to, what I said there was
that I felt the way we've been operating is striking a good balance and
that my concern on this thread was that people appeared to be trying to
move us away from that.  Today, at least, I feel like we are careful and
that we consider the impact on indexes when we're making changes to
immutable functions, and that when we do make such changes, they're for
very good reasons and that we make a note of them in release notes and
possibly go farther, such as considering if we could check for their
usage during pg_upgrade.

Do you feel that's overly aggressive and that we shouldn't care about
the impact of changing immutable functions that could be used in indexes
across major versions?  I don't get the impression that you do from your
prior response, but then it doesn't square with the later comments about
how I'm being carried away and trying to take some principled stand
against evil, so I admit to being confused.

Thanks!

Stephen

Вложения

Re: pg_config wrongly marked as not parallel safe?

От
Robert Haas
Дата:
On Thu, Nov 29, 2018 at 4:01 PM Stephen Frost <sfrost@snowman.net> wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
> > On Thu, Nov 29, 2018 at 3:50 PM Stephen Frost <sfrost@snowman.net> wrote:
> > > I'd suggest you read through the rest of the thread and see my response
> > > to Tom.
> >
> > I already did that.
>
> Great, then it's unclear, to me at least, what you're getting at in your
> response here.

Well, I don't think it was particularly unclear.  You seem to be
attacking Kyotaro-san's position when, in my opinion, he's correct and
you are wrong.  It looks to me like Tom said more or less the same
thing that Kyotaro-san did, just in different words, and you partially
agreed with him, so I don't really know what's going on here, either.

> Just to be clear as to which I was referring to, what I said there was
> that I felt the way we've been operating is striking a good balance and
> that my concern on this thread was that people appeared to be trying to
> move us away from that.  Today, at least, I feel like we are careful and
> that we consider the impact on indexes when we're making changes to
> immutable functions, and that when we do make such changes, they're for
> very good reasons and that we make a note of them in release notes and
> possibly go farther, such as considering if we could check for their
> usage during pg_upgrade.
>
> Do you feel that's overly aggressive and that we shouldn't care about
> the impact of changing immutable functions that could be used in indexes
> across major versions?  I don't get the impression that you do from your
> prior response, but then it doesn't square with the later comments about
> how I'm being carried away and trying to take some principled stand
> against evil, so I admit to being confused.

Generally, I think Andres is wrong to argue that immutability
shouldn't mean *anything* across major versions.  If we can readily
foresee that something is going to change in the future, then we
shouldn't mark it immutable. However:

1. Andres agrees that pg_config() should be marked stable, not
immutable.  If we agree about how the functions we ACTUALLY HAVE
should be marked, it may not be 100% necessary to go on arguing about
what we should do in hypothetical cases.

2. I do not believe that the fact that we have marked something
immutable bars us from changing the behavior of that thing in a new
major release if we decide we don't like the old behavior, as in the
hypothetical 1 + 1 = 3 example I gave before, or the geometric type
behavior Kyotaro-san mentioned, or the ftoi4(INT_MAX) case Tom
mentioned.

2a. Also, while I believe that such changes should be release-noted
like all other changes, I'm not sure we need to do any more than that
unless the change is something that's likely to bite a lot of people.
Since we are unlikely to get 1 + 1 wrong, most of the actual cases
where this sort of thing comes up are going to be corner cases, and
for many users a reindex will be unnecessary even if they pg_upgrade.
 If we make a change that we can foresee is going to cause widespread
breakage, then more vigorous action is appropriate.  *Maybe* it would
be a good idea to somehow mark in the release notes all changes that
might theoretically require a reindex ... but I'm not volunteering to
do the work.

In short, I don't see that there has been any big problem here in the
past, and I don't see anyone making a proposal that would cause a big
problem in the future. There is some mild disagreement about certain
hypothetical situations and corner cases.

Peace,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: pg_config wrongly marked as not parallel safe?

От
Stephen Frost
Дата:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> Generally, I think Andres is wrong to argue that immutability
> shouldn't mean *anything* across major versions.  If we can readily
> foresee that something is going to change in the future, then we
> shouldn't mark it immutable. However:

This sounds like we're in agreement then.

I'll admit that I was responding to multiple comments up-thread in my
response to Kyotaro-san and that it wasn't all intended against the
specific comments brought up there, so, my apologies to Kyotaro-san for
that.  I should have been clearer as to what I was referring to.  By and
large I had viewed it as a furtherance of Andres' position, which was a
mistake on my part.

> In short, I don't see that there has been any big problem here in the
> past, and I don't see anyone making a proposal that would cause a big
> problem in the future. There is some mild disagreement about certain
> hypothetical situations and corner cases.

I don't believe there's been any big problem here in the past and my
concern was what you mention also disagreeing with above- the idea that
immutability shouldn't mean *anything* across major versions.  While I
definitely agree that we should be looking at things on an individual
case basis, the implication that we don't need to worry about
immutability across major versions definitely had me concerned.  I'll
grant that Andres' comments were likely more in passing and not intended
as a proposal, and perhaps should have been glossed over until and
unless an actual proposal or change was put forward.

Thanks!

Stephen

Вложения

Re: pg_config wrongly marked as not parallel safe?

От
Andres Freund
Дата:
Hi,

On 2018-11-29 16:23:42 -0500, Robert Haas wrote:
> Generally, I think Andres is wrong to argue that immutability
> shouldn't mean *anything* across major versions.  If we can readily
> foresee that something is going to change in the future, then we
> shouldn't mark it immutable. However:

I was too glib/brief. All I meant is that we shouldn't take immutable to
*guarantee* anything across major versions. We, of course, shouldn't
break things willy-nilly, and consider the consequences of such
potential breaking changes. Including having to reindex. It's not like
that's only the case for changing immutable functions, the index storage
itself etc also matter.

Greetings,

Andres Freund


Re: pg_config wrongly marked as not parallel safe?

От
Kyotaro HORIGUCHI
Дата:
At Thu, 29 Nov 2018 15:03:00 -0800, Andres Freund <andres@anarazel.de> wrote in
<20181129230300.vkj3csjwk7jt2cfv@alap3.anarazel.de>
> Hi,
> 
> On 2018-11-29 16:23:42 -0500, Robert Haas wrote:
> > Generally, I think Andres is wrong to argue that immutability
> > shouldn't mean *anything* across major versions.  If we can readily
> > foresee that something is going to change in the future, then we
> > shouldn't mark it immutable. However:
> 
> I was too glib/brief. All I meant is that we shouldn't take immutable to
> *guarantee* anything across major versions. We, of course, shouldn't
> break things willy-nilly, and consider the consequences of such
> potential breaking changes. Including having to reindex. It's not like
> that's only the case for changing immutable functions, the index storage
> itself etc also matter.

FWIW, I agree to this.

# And returning to the topic, I vote for pg_config should be "stable".

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: pg_config wrongly marked as not parallel safe?

От
Joe Conway
Дата:
On 11/30/18 3:30 AM, Kyotaro HORIGUCHI wrote:
> # And returning to the topic, I vote for pg_config should be "stable".

And on that note, Does this change does warrant backpatching, or should
be applied to master only?

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Вложения

Re: pg_config wrongly marked as not parallel safe?

От
Stephen Frost
Дата:
Greetings,

* Joe Conway (mail@joeconway.com) wrote:
> On 11/30/18 3:30 AM, Kyotaro HORIGUCHI wrote:
> > # And returning to the topic, I vote for pg_config should be "stable".
>
> And on that note, Does this change does warrant backpatching, or should
> be applied to master only?

Given that it's a catalog change, I would think just master..

Thanks!

Stephen

Вложения

Re: pg_config wrongly marked as not parallel safe?

От
Tom Lane
Дата:
Joe Conway <mail@joeconway.com> writes:
> On 11/30/18 3:30 AM, Kyotaro HORIGUCHI wrote:
>> # And returning to the topic, I vote for pg_config should be "stable".

> And on that note, Does this change does warrant backpatching, or should
> be applied to master only?

I don't think back-patching the catalog change is really a good idea.
The amount of work involved (e.g. release-noting how to perform the
update on existing databases) is way out of proportion to the benefit
for this particular case.

            regards, tom lane


Re: pg_config wrongly marked as not parallel safe?

От
Joe Conway
Дата:
On 11/30/18 10:32 AM, Tom Lane wrote:
> Joe Conway <mail@joeconway.com> writes:
>> On 11/30/18 3:30 AM, Kyotaro HORIGUCHI wrote:
>>> # And returning to the topic, I vote for pg_config should be "stable".
>
>> And on that note, Does this change does warrant backpatching, or should
>> be applied to master only?
>
> I don't think back-patching the catalog change is really a good idea.
> The amount of work involved (e.g. release-noting how to perform the
> update on existing databases) is way out of proportion to the benefit
> for this particular case.


Closing out at least this part of the thread, committed and pushed,
master only.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development


Вложения