Обсуждение: BUG #19034: Recursive function with sql_body can replace an existing function but can not be created on it's own

Поиск
Список
Период
Сортировка
The following bug has been logged on the website:

Bug reference:      19034
Logged by:          Katja Henke
Email address:      katja.henke@foo.ag
PostgreSQL version: 15.3
Operating system:   Linux
Description:

An existing function can be converted into a recursive function that uses
sql_body. But it is not possible to create this same recursive function
using CREATE FUNCTION. In other words: if you managed to have a recursive
function with sql_body in your database, you can't simply restore a dump of
this database.

The following statement results in
ERROR:  42883: function pg_temp.do_something(integer) does not exist

CREATE FUNCTION pg_temp.do_something(int) RETURNS int
LANGUAGE sql
RETURN CASE WHEN $1 % 2 = 1 THEN pg_temp.do_something($1 + 1) ELSE $1 END;

But the combination of the following statements works. It does not matter if
the first function uses sql_body or a string constant.

CREATE FUNCTION pg_temp.do_something(int) RETURNS int
LANGUAGE sql
RETURN 42;

CREATE OR REPLACE FUNCTION pg_temp.do_something(int) RETURNS int
LANGUAGE sql
RETURN CASE WHEN $1 % 2 = 1 THEN pg_temp.do_something($1 + 1) ELSE $1 END;


I ran a test on my local postgresql server, resulting in the same error as described.
However, I wonder about the necessity of supporting recursive functions in the CREATE FUNCTION command alone?
Can someone from the Core team confirm this?

I will try to look into this command, before someone can give a clear clarification.

On Thu, Aug 28, 2025 at 8:18 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:

Bug reference:      19034
Logged by:          Katja Henke
Email address:      katja.henke@foo.ag
PostgreSQL version: 15.3
Operating system:   Linux
Description:       

An existing function can be converted into a recursive function that uses
sql_body. But it is not possible to create this same recursive function
using CREATE FUNCTION. In other words: if you managed to have a recursive
function with sql_body in your database, you can't simply restore a dump of
this database.

The following statement results in
ERROR:  42883: function pg_temp.do_something(integer) does not exist

CREATE FUNCTION pg_temp.do_something(int) RETURNS int
LANGUAGE sql
RETURN CASE WHEN $1 % 2 = 1 THEN pg_temp.do_something($1 + 1) ELSE $1 END;

But the combination of the following statements works. It does not matter if
the first function uses sql_body or a string constant.

CREATE FUNCTION pg_temp.do_something(int) RETURNS int
LANGUAGE sql
RETURN 42;

CREATE OR REPLACE FUNCTION pg_temp.do_something(int) RETURNS int
LANGUAGE sql
RETURN CASE WHEN $1 % 2 = 1 THEN pg_temp.do_something($1 + 1) ELSE $1 END;

Yushu Chen <gentcys@gmail.com> writes:
> I ran a test on my local postgresql server, resulting in the same error as
> described.
> However, I wonder about the necessity of supporting recursive functions in
> the CREATE FUNCTION command alone?

This is a general property of SQL-standard functions: all objects
referenced in the function body have to exist before creating the
function.  Poking holes in that property would largely defeat the
point of having this type of function.

You could imagine supporting the case by making CREATE FUNCTION act
as though it were
  CREATE FUNCTION foo ... AS $$something dummy$$;
  CREATE OR REPLACE FUNCTION foo ... BEGIN real-body END;
However, that would be a significant amount of work to implement
and it'd about double the cost of doing CREATE FUNCTION.  Note
that this cost would probably have to be incurred for every
SQL-standard function, as I see no good way to detect whether
the body contains a recursive reference in advance of parsing it.

Given that recursive SQL functions are a tiny-minority use case
and there's already a perfectly serviceable way to make them
(ie use an old-style body), I seriously doubt that we'll do
anything about this request.

            regards, tom lane



On Thu, 2025-09-04 at 11:24 -0400, Tom Lane wrote:
> Given that recursive SQL functions are a tiny-minority use case
> and there's already a perfectly serviceable way to make them
> (ie use an old-style body), I seriously doubt that we'll do
> anything about this request.

Sure, but creating a dump that will fail to load is not good.
I don't have a smarter idea that dumping standard SQL functions
in two statements like you suggested...

Yours,
Laurenz Albe



On Friday, September 5, 2025, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Thu, 2025-09-04 at 11:24 -0400, Tom Lane wrote:
> Given that recursive SQL functions are a tiny-minority use case
> and there's already a perfectly serviceable way to make them
> (ie use an old-style body), I seriously doubt that we'll do
> anything about this request.

Sure, but creating a dump that will fail to load is not good.
I don't have a smarter idea that dumping standard SQL functions
in two statements like you suggested...

When resolving the dependency graph for such a function can we prevent the oid of the parent and the oid of the child being the same value?  Prohibit direct self-references so it fails even if you use “or replace”.

David J. 
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Thu, 2025-09-04 at 11:24 -0400, Tom Lane wrote:
>> Given that recursive SQL functions are a tiny-minority use case
>> and there's already a perfectly serviceable way to make them
>> (ie use an old-style body), I seriously doubt that we'll do
>> anything about this request.

> Sure, but creating a dump that will fail to load is not good.

Oh, if you are defining the problem as "pg_dump doesn't cope after
I make this function in two steps" rather than "the server should
allow this to be done in one step", that seems more defensible.

One could expect that that'd let pg_dump also cope with cases
involving mutual recursion between two or more such functions,
which is something we'd certainly not try to make the server
allow without help.

            regards, tom lane




On Fri, 5 Sept 2025 at 16:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Oh, if you are defining the problem as "pg_dump doesn't cope after
I make this function in two steps" rather than "the server should
allow this to be done in one step", that seems more defensible.

One could expect that that'd let pg_dump also cope with cases
involving mutual recursion between two or more such functions,
which is something we'd certainly not try to make the server
allow without help.
 
It seems I was a bit too concise with my bug report. Sorry about this.

What I left out is this:
I have stumbled about this by accident. I had just been refactoring a 
bunch of old sql functions and thought they might go well with sql_body. 
The server didn't complain and all tests were passed until we came to 
"dump and restore".  That would have been a nasty surprise in an emergency.

I don't think it is a good idea to allow the creation of such a function 
one way but not the other. This leads to confusion and unexpected
outcomes. Either allow sql_body with recursion or don't. There is already 
a check if the objects exist that a function want's to  use. So I thought it 
must be easy to also check if a function calls itself. Changing how pg_dump 
copes with this would at least lessen the impact of this situation.

Regards Katja
On Fri, 2025-09-05 at 05:55 -0700, David G. Johnston wrote:
> > Sure, but creating a dump that will fail to load is not good.
> > I don't have a smarter idea that dumping standard SQL functions
> > in two statements like you suggested...
>
> When resolving the dependency graph for such a function can we prevent the
> oid of the parent and the oid of the child being the same value?
> Prohibit direct self-references so it fails even if you use “or replace”.

I am not sure I can follow.  With "parent" and "child", do you mean the
function as it was originally created and the function after replacing
it with a recursive function?  If yes, then that's not an option.
The main point of CREATE OR REPLACE FUNCTION is to preserve the oid.

Yours,
Laurenz Albe



On Sun, Sep 7, 2025, 14:09 Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Fri, 2025-09-05 at 05:55 -0700, David G. Johnston wrote:
> > Sure, but creating a dump that will fail to load is not good.
> > I don't have a smarter idea that dumping standard SQL functions
> > in two statements like you suggested...
>
> When resolving the dependency graph for such a function can we prevent the
> oid of the parent and the oid of the child being the same value?
> Prohibit direct self-references so it fails even if you use “or replace”.

I am not sure I can follow.  With "parent" and "child", do you mean the
function as it was originally created and the function after replacing
it with a recursive function?  If yes, then that's not an option.
The main point of CREATE OR REPLACE FUNCTION is to preserve the oid.

Right, because the oid is preserved when the create or replace finishes pg_depend must have an entry where objid = refobjid where the oid value is that of the originally created function that is now just being altered.  That situation seems detectable and prohibit-able.

refobjid is generically the parent, objid is generically the child. Dropping parents requires cascade which is how the "normal" type describes the refobjid. (Given that here they are the same I suppose parent/child is needlessly confusing - I should have looked at the columns names the first time.)

David J.

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> Right, because the oid is preserved when the create or replace finishes
> pg_depend must have an entry where objid = refobjid where the oid value is
> that of the originally created function that is now just being altered.
> That situation seems detectable and prohibit-able.

This would not fix the problem for the case of two or more mutually
recursive functions (that is, a() calls b() calls a()).  So I don't
find it to be an attractive answer.

Getting pg_dump to deal with the problem seems more complicated
than I'd first hoped, though:

1. We will need to create two TOC objects, one to represent the
"dummy" CREATE FUNCTION and the other to represent the "CREATE OR
REPLACE" step.  There's really no shortcut that would avoid that,
since the two objects need to be sorted separately by pg_dump's
topological dependency sort.  In otherwise-comparable situations
such as views, there are already two database objects to work with
(e.g., the view's relation and its ON SELECT rule).  Here there are
not, meaning we have only one identifying OID, so it's hard to see
how we'd tell the two TOC objects apart.  And we'd surely not wish
to burden pg_dump with a doppelganger TOC object for every single
function in the database, either, so how do we tell whether two
are needed?  Maybe we can postpone that until we discover the
circular dependency, but I think creating a new TOC object at
that stage would be very messy.

2. Once we do split a function into two TOC objects, we'd have to
figure out which of its dependencies to assign to each object.
The answer is not "they all belong to the second step", because
any user-defined datatypes used in the function's argument list or
result need to still be dependencies of the first step.  (We might
be able to avoid depending on other things like the PL language
and transform types, though: the dummy definition could be made
to always be SQL-language rather than reality.)  Probably pg_dump
can be taught to split the dependencies apart in a valid way, but
it seems a bit messy and outside pg_dump's sphere of knowledge.

(I'm wondering a bit whether this doesn't suggest that we chose
the wrong representation for pg_depend entries for new-style
SQL functions.  Should the backend itself distinguish dependencies
of the function body from those of the function declaration?
Perhaps inventing pg_proc.prosqlbody was the wrong thing and
we should store new-style function bodies in pg_rewrite,
allowing the pg_rewrite OID to be the second OID needed?
It might be too late to change that without breaking a lot
of stuff, though.)

            regards, tom lane



On Sun, Sep 7, 2025, 15:37 Tom Lane <tgl@sss.pgh.pa.us> wrote:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> Right, because the oid is preserved when the create or replace finishes
> pg_depend must have an entry where objid = refobjid where the oid value is
> that of the originally created function that is now just being altered.
> That situation seems detectable and prohibit-able.

This would not fix the problem for the case of two or more mutually
recursive functions (that is, a() calls b() calls a()).  So I don't
find it to be an attractive answer.

Tack on a modified implementation of the existing detection of circular dependencies among roles to detect circular dependencies among sql_body functions?

While I agree making it must work would be nice given the presence of simple alternatives to use when this functionality is required it's got to be easier to detect and prohibit creation.

David J.



On Sun, 2025-09-07 at 15:50 -0700, David G. Johnston wrote:
> While I agree making it must work would be nice given the presence of simple
> alternatives to use when this functionality is required it's got to be easier
> to detect and prohibit creation.

Perhaps that is the way to go, although it would not make me happy.
But that still leaves the problem of what to do with a database that
already contains recursive SQL functions during pg_dump/pg_restore.

Yours,
Laurenz Albe



On Monday, September 8, 2025, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Sun, 2025-09-07 at 15:50 -0700, David G. Johnston wrote:
> While I agree making it must work would be nice given the presence of simple
> alternatives to use when this functionality is required it's got to be easier
> to detect and prohibit creation.

Perhaps that is the way to go, although it would not make me happy.
But that still leaves the problem of what to do with a database that
already contains recursive SQL functions during pg_dump/pg_restore.

There are a number of ways for a user to setup a database that cannot be restored, where our answer is to fix the source database.  This is one of those cases.  Usually it’s because they do something we’ve documented as prohibited - like mis-identifying a stable/volatile function as immutable and use it in a check constraint.  We just haven’t [yet…] explicitly identified this as prohibited.

Test your restores before you need them for critical reasons.  Dumps are not good backups anyway.  Upgrade will fail and they will need to fix the source.

We have an oversight, but the error is obvious and immediate if the user takes the necessary precautions.  I’m not losing much sleep someone somehow gets bit by this - especially since we are trying to at least make it impossible to be bit by it in the future.

David J.