Обсуждение: Extending outfuncs support to utility statements

Поиск
Список
Период
Сортировка

Extending outfuncs support to utility statements

От
Tom Lane
Дата:
We've long avoided building I/O support for utility-statement node
types, mainly because it didn't seem worth the trouble to write and
maintain such code by hand.  Now that the automatic node-support-code
generation patch is in, that argument is gone, and it's just a matter
of whether the benefits are worth the backend code bloat.  I can
see two benefits worth considering:

* Seems like having such support would be pretty useful for
debugging.

* The only reason struct Query still needs a handwritten output
function is that special logic is needed to prevent trying to
print the utilityStatement field when it's a utility statement
we lack outfuncs support for.  Now it wouldn't be that hard
to get gen_node_support.pl to replicate that special logic,
and if we stick with the status-quo functionality then I think we
should do that so that we can get rid of the handwritten function.
But the other alternative is to provide outfuncs support for all
utility statements and drop the conditionality.

So I looked into how much code are we talking about.  On my
RHEL8 x86_64 machine, the code sizes for outfuncs/readfuncs
as of HEAD are

$ size outfuncs.o readfuncs.o
   text    data     bss     dec     hex filename
 117173       0       0  117173   1c9b5 outfuncs.o
  64540       0       0   64540    fc1c readfuncs.o

If we just open the floodgates and enable both outfuncs and
readfuncs support for all *Stmt nodes (plus some node types
that thereby become dumpable, like AlterTableCmd), then
this becomes

$ size outfuncs.o readfuncs.o
   text    data     bss     dec     hex filename
 139503       0       0  139503   220ef outfuncs.o
  95562       0       0   95562   1754a readfuncs.o

For my taste, the circa 20K growth in outfuncs.o is an okay
price for being able to inspect utility statements more easily.
However, I'm less thrilled with the 30K growth in readfuncs.o,
because I can't see that we'd get any direct benefit from that.
So I think a realistic proposal is to enable outfuncs support
but keep readfuncs disabled.  The attached WIP patch does that,
and gives me these code sizes:

$ size outfuncs.o readfuncs.o
   text    data     bss     dec     hex filename
 139503       0       0  139503   220ef outfuncs.o
  69356       0       0   69356   10eec readfuncs.o

(The extra readfuncs space comes from not troubling over the
subsidiary node types such as AlterTableCmd.  We could run
around and mark those no_read, but I didn't bother yet.)

The support-suppression code in gen_node_support.pl was a crude
hack before, and this patch doesn't make it any less so.
If we go this way, it would be better to move the knowledge that
we're suppressing read functionality into the utility statement
node declarations.  We could just manually label them all
pg_node_attr(no_read), but what I'm kind of tempted to do is
invent a dummy abstract node type like Expr, and make all the
utility statements inherit from it:

typedef struct UtilityStmt
{
    pg_node_attr(abstract, no_read)

    NodeTag        type;
} UtilityStmt;

Thoughts?

            regards, tom lane

diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 4a7902e6bf..fc699c3f19 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -97,21 +97,19 @@ push @scalar_types, qw(QualCost);

 # XXX various things we are not publishing right now to stay level
 # with the manual system
-push @no_copy,  qw(CallContext InlineCodeBlock);
-push @no_equal, qw(CallContext InlineCodeBlock);
-push @no_read_write,
-  qw(AccessPriv AlterTableCmd CallContext CreateOpClassItem FunctionParameter InferClause InlineCodeBlock
ObjectWithArgsOnConflictClause PartitionCmd RoleSpec VacuumRelation); 
-push @no_read, qw(A_ArrayExpr A_Indices A_Indirection AlterStatsStmt
-  CollateClause ColumnDef ColumnRef CreateForeignTableStmt CreateStatsStmt
-  CreateStmt FuncCall ImportForeignSchemaStmt IndexElem IndexStmt
+push @no_copy,       qw(CallContext InlineCodeBlock);
+push @no_equal,      qw(CallContext InlineCodeBlock);
+push @no_read_write, qw(CallContext InlineCodeBlock);
+push @no_read,       qw(A_ArrayExpr A_Indices A_Indirection
+  CollateClause ColumnDef ColumnRef FuncCall IndexElem
   JsonAggConstructor JsonArgument JsonArrayAgg JsonArrayConstructor
   JsonArrayQueryConstructor JsonCommon JsonFuncExpr JsonKeyValue
   JsonObjectAgg JsonObjectConstructor JsonOutput JsonParseExpr JsonScalarExpr
   JsonSerializeExpr JsonTable JsonTableColumn JsonTablePlan LockingClause
-  MultiAssignRef PLAssignStmt ParamRef PartitionElem PartitionSpec
+  MultiAssignRef ParamRef PartitionElem PartitionSpec
   PlaceHolderVar PublicationObjSpec PublicationTable RangeFunction
-  RangeSubselect RangeTableFunc RangeTableFuncCol RangeTableSample RawStmt
-  ResTarget ReturnStmt SelectStmt SortBy StatsElem TableLikeClause
+  RangeSubselect RangeTableFunc RangeTableFuncCol RangeTableSample
+  ResTarget SortBy StatsElem TableLikeClause
   TriggerTransition TypeCast TypeName WindowDef WithClause XmlSerialize);


@@ -693,16 +691,16 @@ foreach my $n (@node_types)
     next if elem $n, @abstract_types;
     next if elem $n, @no_read_write;

-    # XXX For now, skip all "Stmt"s except that ones that were there before.
+    my $no_read = (elem $n, @no_read);
+
+    # Disable read support for utility statements.
+    # XXX this is a very ugly way to do it.
     if ($n =~ /Stmt$/)
     {
-        my @keep =
-          qw(AlterStatsStmt CreateForeignTableStmt CreateStatsStmt CreateStmt DeclareCursorStmt
ImportForeignSchemaStmtIndexStmt NotifyStmt PlannedStmt PLAssignStmt RawStmt ReturnStmt SelectStmt SetOperationStmt); 
-        next unless elem $n, @keep;
+        my @keep = qw(NotifyStmt PlannedStmt SetOperationStmt);
+        $no_read = 1 unless elem $n, @keep;
     }

-    my $no_read = (elem $n, @no_read);
-
     # output format starts with upper case node type name
     my $N = uc $n;

@@ -725,13 +723,20 @@ _out${n}(StringInfo str, const $n *node)

 ";

-    print $rff "
+    if (!$no_read)
+    {
+        my $macro =
+          (@{ $node_type_info{$n}->{fields} } > 0)
+          ? 'READ_LOCALS'
+          : 'READ_LOCALS_NO_FIELDS';
+        print $rff "
 static $n *
 _read${n}(void)
 {
-\tREAD_LOCALS($n);
+\t$macro($n);

-" unless $no_read;
+";
+    }

     # print instructions for each field
     foreach my $f (@{ $node_type_info{$n}->{fields} })
@@ -781,6 +786,7 @@ _read${n}(void)
             print $rff "\tREAD_LOCATION_FIELD($f);\n" unless $no_read;
         }
         elsif ($t eq 'int'
+            || $t eq 'int16'
             || $t eq 'int32'
             || $t eq 'AttrNumber'
             || $t eq 'StrategyNumber')
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4d776e7b51..85ac0b5e21 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -415,79 +415,6 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
     methods->nodeOut(str, node);
 }

-static void
-_outQuery(StringInfo str, const Query *node)
-{
-    WRITE_NODE_TYPE("QUERY");
-
-    WRITE_ENUM_FIELD(commandType, CmdType);
-    WRITE_ENUM_FIELD(querySource, QuerySource);
-    /* we intentionally do not print the queryId field */
-    WRITE_BOOL_FIELD(canSetTag);
-
-    /*
-     * Hack to work around missing outfuncs routines for a lot of the
-     * utility-statement node types.  (The only one we actually *need* for
-     * rules support is NotifyStmt.)  Someday we ought to support 'em all, but
-     * for the meantime do this to avoid getting lots of warnings when running
-     * with debug_print_parse on.
-     */
-    if (node->utilityStmt)
-    {
-        switch (nodeTag(node->utilityStmt))
-        {
-            case T_CreateStmt:
-            case T_IndexStmt:
-            case T_NotifyStmt:
-            case T_DeclareCursorStmt:
-                WRITE_NODE_FIELD(utilityStmt);
-                break;
-            default:
-                appendStringInfoString(str, " :utilityStmt ?");
-                break;
-        }
-    }
-    else
-        appendStringInfoString(str, " :utilityStmt <>");
-
-    WRITE_INT_FIELD(resultRelation);
-    WRITE_BOOL_FIELD(hasAggs);
-    WRITE_BOOL_FIELD(hasWindowFuncs);
-    WRITE_BOOL_FIELD(hasTargetSRFs);
-    WRITE_BOOL_FIELD(hasSubLinks);
-    WRITE_BOOL_FIELD(hasDistinctOn);
-    WRITE_BOOL_FIELD(hasRecursive);
-    WRITE_BOOL_FIELD(hasModifyingCTE);
-    WRITE_BOOL_FIELD(hasForUpdate);
-    WRITE_BOOL_FIELD(hasRowSecurity);
-    WRITE_BOOL_FIELD(isReturn);
-    WRITE_NODE_FIELD(cteList);
-    WRITE_NODE_FIELD(rtable);
-    WRITE_NODE_FIELD(jointree);
-    WRITE_NODE_FIELD(targetList);
-    WRITE_ENUM_FIELD(override, OverridingKind);
-    WRITE_NODE_FIELD(onConflict);
-    WRITE_NODE_FIELD(returningList);
-    WRITE_NODE_FIELD(groupClause);
-    WRITE_BOOL_FIELD(groupDistinct);
-    WRITE_NODE_FIELD(groupingSets);
-    WRITE_NODE_FIELD(havingQual);
-    WRITE_NODE_FIELD(windowClause);
-    WRITE_NODE_FIELD(distinctClause);
-    WRITE_NODE_FIELD(sortClause);
-    WRITE_NODE_FIELD(limitOffset);
-    WRITE_NODE_FIELD(limitCount);
-    WRITE_ENUM_FIELD(limitOption, LimitOption);
-    WRITE_NODE_FIELD(rowMarks);
-    WRITE_NODE_FIELD(setOperations);
-    WRITE_NODE_FIELD(constraintDeps);
-    WRITE_NODE_FIELD(withCheckOptions);
-    WRITE_NODE_FIELD(mergeActionList);
-    WRITE_BOOL_FIELD(mergeUseOuterJoin);
-    WRITE_LOCATION_FIELD(stmt_location);
-    WRITE_INT_FIELD(stmt_len);
-}
-
 static void
 _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
 {
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1421686938..a2391280be 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -240,56 +240,6 @@ readBitmapset(void)
  * special_read_write attribute
  */

-static Query *
-_readQuery(void)
-{
-    READ_LOCALS(Query);
-
-    READ_ENUM_FIELD(commandType, CmdType);
-    READ_ENUM_FIELD(querySource, QuerySource);
-    local_node->queryId = UINT64CONST(0);    /* not saved in output format */
-    READ_BOOL_FIELD(canSetTag);
-    READ_NODE_FIELD(utilityStmt);
-    READ_INT_FIELD(resultRelation);
-    READ_BOOL_FIELD(hasAggs);
-    READ_BOOL_FIELD(hasWindowFuncs);
-    READ_BOOL_FIELD(hasTargetSRFs);
-    READ_BOOL_FIELD(hasSubLinks);
-    READ_BOOL_FIELD(hasDistinctOn);
-    READ_BOOL_FIELD(hasRecursive);
-    READ_BOOL_FIELD(hasModifyingCTE);
-    READ_BOOL_FIELD(hasForUpdate);
-    READ_BOOL_FIELD(hasRowSecurity);
-    READ_BOOL_FIELD(isReturn);
-    READ_NODE_FIELD(cteList);
-    READ_NODE_FIELD(rtable);
-    READ_NODE_FIELD(jointree);
-    READ_NODE_FIELD(targetList);
-    READ_ENUM_FIELD(override, OverridingKind);
-    READ_NODE_FIELD(onConflict);
-    READ_NODE_FIELD(returningList);
-    READ_NODE_FIELD(groupClause);
-    READ_BOOL_FIELD(groupDistinct);
-    READ_NODE_FIELD(groupingSets);
-    READ_NODE_FIELD(havingQual);
-    READ_NODE_FIELD(windowClause);
-    READ_NODE_FIELD(distinctClause);
-    READ_NODE_FIELD(sortClause);
-    READ_NODE_FIELD(limitOffset);
-    READ_NODE_FIELD(limitCount);
-    READ_ENUM_FIELD(limitOption, LimitOption);
-    READ_NODE_FIELD(rowMarks);
-    READ_NODE_FIELD(setOperations);
-    READ_NODE_FIELD(constraintDeps);
-    READ_NODE_FIELD(withCheckOptions);
-    READ_NODE_FIELD(mergeActionList);
-    READ_BOOL_FIELD(mergeUseOuterJoin);
-    READ_LOCATION_FIELD(stmt_location);
-    READ_INT_FIELD(stmt_len);
-
-    READ_DONE();
-}
-
 static Const *
 _readConst(void)
 {
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 0b6a7bb365..617ba619de 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -117,8 +117,6 @@ typedef uint32 AclMode;            /* a bitmask of privilege bits */
  */
 typedef struct Query
 {
-    pg_node_attr(custom_read_write)
-
     NodeTag        type;

     CmdType        commandType;    /* select|insert|update|delete|merge|utility */
@@ -126,10 +124,10 @@ typedef struct Query
     QuerySource querySource;    /* where did I come from? */

     /*
-     * query identifier (can be set by plugins); ignored for equal, might not
-     * be set
+     * query identifier (can be set by plugins); ignored for equal, as it
+     * might not be set; also not stored
      */
-    uint64        queryId pg_node_attr(equal_ignore, read_as(0));
+    uint64        queryId pg_node_attr(equal_ignore, read_write_ignore, read_as(0));

     bool        canSetTag;        /* do I set the command result tag? */


Re: Extending outfuncs support to utility statements

От
Andres Freund
Дата:
Hi,

On 2022-07-09 18:20:26 -0400, Tom Lane wrote:
> We've long avoided building I/O support for utility-statement node
> types, mainly because it didn't seem worth the trouble to write and
> maintain such code by hand.  Now that the automatic node-support-code
> generation patch is in, that argument is gone, and it's just a matter
> of whether the benefits are worth the backend code bloat.  I can
> see two benefits worth considering:
> 
> * Seems like having such support would be pretty useful for
> debugging.

Agreed.


> So I looked into how much code are we talking about.  On my
> RHEL8 x86_64 machine, the code sizes for outfuncs/readfuncs
> as of HEAD are
> 
> $ size outfuncs.o readfuncs.o
>    text    data     bss     dec     hex filename
>  117173       0       0  117173   1c9b5 outfuncs.o
>   64540       0       0   64540    fc1c readfuncs.o
> 
> If we just open the floodgates and enable both outfuncs and
> readfuncs support for all *Stmt nodes (plus some node types
> that thereby become dumpable, like AlterTableCmd), then
> this becomes
> 
> $ size outfuncs.o readfuncs.o
>    text    data     bss     dec     hex filename
>  139503       0       0  139503   220ef outfuncs.o
>   95562       0       0   95562   1754a readfuncs.o
> 
> For my taste, the circa 20K growth in outfuncs.o is an okay
> price for being able to inspect utility statements more easily.
> However, I'm less thrilled with the 30K growth in readfuncs.o,
> because I can't see that we'd get any direct benefit from that.
> So I think a realistic proposal is to enable outfuncs support
> but keep readfuncs disabled.

Another approach could be to mark those paths as "cold", so they are placed
further away, reducing / removing potential overhead due to higher iTLB misses
etc. 30K of disk space isn't worth worrying about.

Don't really have an opinion on this.

Greetings,

Andres Freund



Re: Extending outfuncs support to utility statements

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2022-07-09 18:20:26 -0400, Tom Lane wrote:
>> For my taste, the circa 20K growth in outfuncs.o is an okay
>> price for being able to inspect utility statements more easily.
>> However, I'm less thrilled with the 30K growth in readfuncs.o,
>> because I can't see that we'd get any direct benefit from that.
>> So I think a realistic proposal is to enable outfuncs support
>> but keep readfuncs disabled.

> Another approach could be to mark those paths as "cold", so they are placed
> further away, reducing / removing potential overhead due to higher iTLB misses
> etc. 30K of disk space isn't worth worrying about.

They're not so much "cold" as "dead", so I don't see the point
of having them at all.  If we ever start allowing utility commands
(besides NOTIFY) in stored rules, we'd need readfuncs support then
... but at least in the short run I don't see that happening.

            regards, tom lane



Re: Extending outfuncs support to utility statements

От
Andres Freund
Дата:
Hi,

On 2022-07-10 19:12:52 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-07-09 18:20:26 -0400, Tom Lane wrote:
> >> For my taste, the circa 20K growth in outfuncs.o is an okay
> >> price for being able to inspect utility statements more easily.
> >> However, I'm less thrilled with the 30K growth in readfuncs.o,
> >> because I can't see that we'd get any direct benefit from that.
> >> So I think a realistic proposal is to enable outfuncs support
> >> but keep readfuncs disabled.
> 
> > Another approach could be to mark those paths as "cold", so they are placed
> > further away, reducing / removing potential overhead due to higher iTLB misses
> > etc. 30K of disk space isn't worth worrying about.
> 
> They're not so much "cold" as "dead", so I don't see the point
> of having them at all.  If we ever start allowing utility commands
> (besides NOTIFY) in stored rules, we'd need readfuncs support then
> ... but at least in the short run I don't see that happening.

It would allow us to test utility outfuncs as part of the
WRITE_READ_PARSE_PLAN_TREES check. Not that that's worth very much.

I guess it could be a minor help in making a few more utility commands benefit
from paralellism?

Anyway, as mentioned earlier, I'm perfectly fine not supporting readfuns for
utility statements for now.

Greetings,

Andres Freund



Re: Extending outfuncs support to utility statements

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2022-07-10 19:12:52 -0400, Tom Lane wrote:
>> They're not so much "cold" as "dead", so I don't see the point
>> of having them at all.  If we ever start allowing utility commands
>> (besides NOTIFY) in stored rules, we'd need readfuncs support then
>> ... but at least in the short run I don't see that happening.

> It would allow us to test utility outfuncs as part of the
> WRITE_READ_PARSE_PLAN_TREES check. Not that that's worth very much.

Especially now that those are all auto-generated anyway.

> I guess it could be a minor help in making a few more utility commands benefit
> from paralellism?

Again, once we have an actual use-case, enabling that code will be
fine by me.  But we don't yet.

            regards, tom lane



Re: Extending outfuncs support to utility statements

От
Tatsuo Ishii
Дата:
Hi,

Now we are ready to have debug_print_raw_parse (or something like
that)?  Pgpool-II has been importing and using PostgreSQL's raw
parser for years. I think it would be great for PostgreSQL and
Pgpool-II developers to have such a feature.

Best reagards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Extending outfuncs support to utility statements

От
Peter Eisentraut
Дата:
On 10.07.22 00:20, Tom Lane wrote:
> We've long avoided building I/O support for utility-statement node
> types, mainly because it didn't seem worth the trouble to write and
> maintain such code by hand.  Now that the automatic node-support-code
> generation patch is in, that argument is gone, and it's just a matter
> of whether the benefits are worth the backend code bloat.  I can
> see two benefits worth considering:

This is also needed to be able to store utility statements in (unquoted) 
SQL function bodies.  I have some in-progress code for that that I need 
to dust off.  IIRC, there are still some nontrivial issues to work 
through on the reading side.  I don't have a problem with enabling the 
outfuncs side in the meantime.



Re: Extending outfuncs support to utility statements

От
Tom Lane
Дата:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 10.07.22 00:20, Tom Lane wrote:
>> We've long avoided building I/O support for utility-statement node
>> types, mainly because it didn't seem worth the trouble to write and
>> maintain such code by hand.
k
> This is also needed to be able to store utility statements in (unquoted) 
> SQL function bodies.  I have some in-progress code for that that I need 
> to dust off.  IIRC, there are still some nontrivial issues to work 
> through on the reading side.  I don't have a problem with enabling the 
> outfuncs side in the meantime.

Oh!  I'd not thought of that, but yes that is a plausible near-term
requirement for readfuncs support for utility statements.  So my
concern about suppressing those is largely a waste of effort.

There might be enough node types that are raw-parse-tree-only,
but not involved in utility statements, to make it worth
continuing to suppress readfuncs support for them.  But I kinda
doubt it.  I'll try to get some numbers later today.

            regards, tom lane



Re: Extending outfuncs support to utility statements

От
Tom Lane
Дата:
I wrote:
> There might be enough node types that are raw-parse-tree-only,
> but not involved in utility statements, to make it worth
> continuing to suppress readfuncs support for them.  But I kinda
> doubt it.  I'll try to get some numbers later today.

Granting that we want write/read support for utility statements,
it seems that what we can save by suppressing raw-parse-tree-only
nodes is only about 10kB.  That's clearly not worth troubling over
in the grand scheme of things, so I suggest that we just open the
floodgates as attached.

            regards, tom lane

diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index 7694e04d0a..96af17516a 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -166,23 +166,6 @@ push @scalar_types, qw(EquivalenceClass* EquivalenceMember*);
 # currently not required.
 push @scalar_types, qw(QualCost);

-# XXX various things we are not publishing right now to stay level
-# with the manual system
-push @no_read_write,
-  qw(AccessPriv AlterTableCmd CreateOpClassItem FunctionParameter InferClause ObjectWithArgs OnConflictClause
PartitionCmdRoleSpec VacuumRelation); 
-push @no_read, qw(A_ArrayExpr A_Indices A_Indirection AlterStatsStmt
-  CollateClause ColumnDef ColumnRef CreateForeignTableStmt CreateStatsStmt
-  CreateStmt FuncCall ImportForeignSchemaStmt IndexElem IndexStmt
-  JsonAggConstructor JsonArgument JsonArrayAgg JsonArrayConstructor
-  JsonArrayQueryConstructor JsonCommon JsonFuncExpr JsonKeyValue
-  JsonObjectAgg JsonObjectConstructor JsonOutput JsonParseExpr JsonScalarExpr
-  JsonSerializeExpr JsonTable JsonTableColumn JsonTablePlan LockingClause
-  MultiAssignRef PLAssignStmt ParamRef PartitionElem PartitionSpec
-  PlaceHolderVar PublicationObjSpec PublicationTable RangeFunction
-  RangeSubselect RangeTableFunc RangeTableFuncCol RangeTableSample RawStmt
-  ResTarget ReturnStmt SelectStmt SortBy StatsElem TableLikeClause
-  TriggerTransition TypeCast TypeName WindowDef WithClause XmlSerialize);
-

 ## check that we have the expected number of files on the command line
 die "wrong number of input files, expected @all_input_files\n"
@@ -795,14 +778,6 @@ foreach my $n (@node_types)
     next if elem $n, @nodetag_only;
     next if elem $n, @no_read_write;

-    # XXX For now, skip all "Stmt"s except that ones that were there before.
-    if ($n =~ /Stmt$/)
-    {
-        my @keep =
-          qw(AlterStatsStmt CreateForeignTableStmt CreateStatsStmt CreateStmt DeclareCursorStmt
ImportForeignSchemaStmtIndexStmt NotifyStmt PlannedStmt PLAssignStmt RawStmt ReturnStmt SelectStmt SetOperationStmt); 
-        next unless elem $n, @keep;
-    }
-
     my $no_read = (elem $n, @no_read);

     # output format starts with upper case node type name
@@ -827,13 +802,20 @@ _out${n}(StringInfo str, const $n *node)

 ";

-    print $rff "
+    if (!$no_read)
+    {
+        my $macro =
+          (@{ $node_type_info{$n}->{fields} } > 0)
+          ? 'READ_LOCALS'
+          : 'READ_LOCALS_NO_FIELDS';
+        print $rff "
 static $n *
 _read${n}(void)
 {
-\tREAD_LOCALS($n);
+\t$macro($n);

-" unless $no_read;
+";
+    }

     # print instructions for each field
     foreach my $f (@{ $node_type_info{$n}->{fields} })
@@ -883,6 +865,7 @@ _read${n}(void)
             print $rff "\tREAD_LOCATION_FIELD($f);\n" unless $no_read;
         }
         elsif ($t eq 'int'
+            || $t eq 'int16'
             || $t eq 'int32'
             || $t eq 'AttrNumber'
             || $t eq 'StrategyNumber')
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 4d776e7b51..85ac0b5e21 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -415,79 +415,6 @@ _outExtensibleNode(StringInfo str, const ExtensibleNode *node)
     methods->nodeOut(str, node);
 }

-static void
-_outQuery(StringInfo str, const Query *node)
-{
-    WRITE_NODE_TYPE("QUERY");
-
-    WRITE_ENUM_FIELD(commandType, CmdType);
-    WRITE_ENUM_FIELD(querySource, QuerySource);
-    /* we intentionally do not print the queryId field */
-    WRITE_BOOL_FIELD(canSetTag);
-
-    /*
-     * Hack to work around missing outfuncs routines for a lot of the
-     * utility-statement node types.  (The only one we actually *need* for
-     * rules support is NotifyStmt.)  Someday we ought to support 'em all, but
-     * for the meantime do this to avoid getting lots of warnings when running
-     * with debug_print_parse on.
-     */
-    if (node->utilityStmt)
-    {
-        switch (nodeTag(node->utilityStmt))
-        {
-            case T_CreateStmt:
-            case T_IndexStmt:
-            case T_NotifyStmt:
-            case T_DeclareCursorStmt:
-                WRITE_NODE_FIELD(utilityStmt);
-                break;
-            default:
-                appendStringInfoString(str, " :utilityStmt ?");
-                break;
-        }
-    }
-    else
-        appendStringInfoString(str, " :utilityStmt <>");
-
-    WRITE_INT_FIELD(resultRelation);
-    WRITE_BOOL_FIELD(hasAggs);
-    WRITE_BOOL_FIELD(hasWindowFuncs);
-    WRITE_BOOL_FIELD(hasTargetSRFs);
-    WRITE_BOOL_FIELD(hasSubLinks);
-    WRITE_BOOL_FIELD(hasDistinctOn);
-    WRITE_BOOL_FIELD(hasRecursive);
-    WRITE_BOOL_FIELD(hasModifyingCTE);
-    WRITE_BOOL_FIELD(hasForUpdate);
-    WRITE_BOOL_FIELD(hasRowSecurity);
-    WRITE_BOOL_FIELD(isReturn);
-    WRITE_NODE_FIELD(cteList);
-    WRITE_NODE_FIELD(rtable);
-    WRITE_NODE_FIELD(jointree);
-    WRITE_NODE_FIELD(targetList);
-    WRITE_ENUM_FIELD(override, OverridingKind);
-    WRITE_NODE_FIELD(onConflict);
-    WRITE_NODE_FIELD(returningList);
-    WRITE_NODE_FIELD(groupClause);
-    WRITE_BOOL_FIELD(groupDistinct);
-    WRITE_NODE_FIELD(groupingSets);
-    WRITE_NODE_FIELD(havingQual);
-    WRITE_NODE_FIELD(windowClause);
-    WRITE_NODE_FIELD(distinctClause);
-    WRITE_NODE_FIELD(sortClause);
-    WRITE_NODE_FIELD(limitOffset);
-    WRITE_NODE_FIELD(limitCount);
-    WRITE_ENUM_FIELD(limitOption, LimitOption);
-    WRITE_NODE_FIELD(rowMarks);
-    WRITE_NODE_FIELD(setOperations);
-    WRITE_NODE_FIELD(constraintDeps);
-    WRITE_NODE_FIELD(withCheckOptions);
-    WRITE_NODE_FIELD(mergeActionList);
-    WRITE_BOOL_FIELD(mergeUseOuterJoin);
-    WRITE_LOCATION_FIELD(stmt_location);
-    WRITE_INT_FIELD(stmt_len);
-}
-
 static void
 _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
 {
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 1421686938..a2391280be 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -240,56 +240,6 @@ readBitmapset(void)
  * special_read_write attribute
  */

-static Query *
-_readQuery(void)
-{
-    READ_LOCALS(Query);
-
-    READ_ENUM_FIELD(commandType, CmdType);
-    READ_ENUM_FIELD(querySource, QuerySource);
-    local_node->queryId = UINT64CONST(0);    /* not saved in output format */
-    READ_BOOL_FIELD(canSetTag);
-    READ_NODE_FIELD(utilityStmt);
-    READ_INT_FIELD(resultRelation);
-    READ_BOOL_FIELD(hasAggs);
-    READ_BOOL_FIELD(hasWindowFuncs);
-    READ_BOOL_FIELD(hasTargetSRFs);
-    READ_BOOL_FIELD(hasSubLinks);
-    READ_BOOL_FIELD(hasDistinctOn);
-    READ_BOOL_FIELD(hasRecursive);
-    READ_BOOL_FIELD(hasModifyingCTE);
-    READ_BOOL_FIELD(hasForUpdate);
-    READ_BOOL_FIELD(hasRowSecurity);
-    READ_BOOL_FIELD(isReturn);
-    READ_NODE_FIELD(cteList);
-    READ_NODE_FIELD(rtable);
-    READ_NODE_FIELD(jointree);
-    READ_NODE_FIELD(targetList);
-    READ_ENUM_FIELD(override, OverridingKind);
-    READ_NODE_FIELD(onConflict);
-    READ_NODE_FIELD(returningList);
-    READ_NODE_FIELD(groupClause);
-    READ_BOOL_FIELD(groupDistinct);
-    READ_NODE_FIELD(groupingSets);
-    READ_NODE_FIELD(havingQual);
-    READ_NODE_FIELD(windowClause);
-    READ_NODE_FIELD(distinctClause);
-    READ_NODE_FIELD(sortClause);
-    READ_NODE_FIELD(limitOffset);
-    READ_NODE_FIELD(limitCount);
-    READ_ENUM_FIELD(limitOption, LimitOption);
-    READ_NODE_FIELD(rowMarks);
-    READ_NODE_FIELD(setOperations);
-    READ_NODE_FIELD(constraintDeps);
-    READ_NODE_FIELD(withCheckOptions);
-    READ_NODE_FIELD(mergeActionList);
-    READ_BOOL_FIELD(mergeUseOuterJoin);
-    READ_LOCATION_FIELD(stmt_location);
-    READ_INT_FIELD(stmt_len);
-
-    READ_DONE();
-}
-
 static Const *
 _readConst(void)
 {
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index e2ad761768..29e48642b7 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -117,8 +117,6 @@ typedef uint32 AclMode;            /* a bitmask of privilege bits */
  */
 typedef struct Query
 {
-    pg_node_attr(custom_read_write)
-
     NodeTag        type;

     CmdType        commandType;    /* select|insert|update|delete|merge|utility */
@@ -126,10 +124,10 @@ typedef struct Query
     QuerySource querySource;    /* where did I come from? */

     /*
-     * query identifier (can be set by plugins); ignored for equal, might not
-     * be set
+     * query identifier (can be set by plugins); ignored for equal, as it
+     * might not be set; also not stored
      */
-    uint64        queryId pg_node_attr(equal_ignore, read_as(0));
+    uint64        queryId pg_node_attr(equal_ignore, read_write_ignore, read_as(0));

     bool        canSetTag;        /* do I set the command result tag? */


Re: Extending outfuncs support to utility statements

От
Tom Lane
Дата:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> This is also needed to be able to store utility statements in (unquoted) 
> SQL function bodies.  I have some in-progress code for that that I need 
> to dust off.  IIRC, there are still some nontrivial issues to work 
> through on the reading side.  I don't have a problem with enabling the 
> outfuncs side in the meantime.

BTW, I experimented with trying to enable WRITE_READ_PARSE_PLAN_TREES
for utility statements, and found that the immediate problem is that
Constraint and a couple of other node types lack read functions
(they're the ones marked "custom_read_write, no_read" in parsenodes.h).
They have out functions, so writing the inverses seems like it's just
something nobody ever got around to.  Perhaps there are deeper problems
lurking behind that one, though.

            regards, tom lane



Re: Extending outfuncs support to utility statements

От
Peter Eisentraut
Дата:
On 13.07.22 00:38, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
>> This is also needed to be able to store utility statements in (unquoted)
>> SQL function bodies.  I have some in-progress code for that that I need
>> to dust off.  IIRC, there are still some nontrivial issues to work
>> through on the reading side.  I don't have a problem with enabling the
>> outfuncs side in the meantime.
> 
> BTW, I experimented with trying to enable WRITE_READ_PARSE_PLAN_TREES
> for utility statements, and found that the immediate problem is that
> Constraint and a couple of other node types lack read functions
> (they're the ones marked "custom_read_write, no_read" in parsenodes.h).
> They have out functions, so writing the inverses seems like it's just
> something nobody ever got around to.  Perhaps there are deeper problems
> lurking behind that one, though.

Here are patches for that.

v1-0001-Fix-reading-of-most-negative-integer-value-nodes.patch
v1-0002-Fix-reading-of-BitString-nodes.patch

These are some of those lurking problems.

v1-0003-Add-read-support-for-some-missing-raw-parse-nodes.patch

This adds the read support for the missing nodes.

The above patches are candidates for committing.

At this point we have one structural problem left: char * node fields 
output with WRITE_STRING_FIELD() (ultimately outToken()) don't 
distinguish between empty strings and NULL values.  A write/read 
roundtrip ends up as NULL for an empty string.  This shows up in the 
regression tests for commands such as

CREATE TABLESPACE regress_tblspace LOCATION '';
CREATE SUBSCRIPTION regress_addr_sub CONNECTION '' ...

This will need some expansion of the output format to handle this.

v1-0004-XXX-Turn-on-WRITE_READ_PARSE_PLAN_TREES-for-testi.patch
v1-0005-Implement-WRITE_READ_PARSE_PLAN_TREES-for-raw-par.patch
v1-0006-Enable-WRITE_READ_PARSE_PLAN_TREES-of-rewritten-u.patch

This is for testing the above.  Note that in 0005 we need some special 
handling for float values to preserve the full precision across 
write/read.  I suppose this could be unified with the code the preserves 
the location fields when doing write/read checking.

v1-0007-Enable-utility-statements-in-unquoted-SQL-functio.patch

This demonstrates what the ultimate goal is.  A few more tests should be 
added eventually.
Вложения

Re: Extending outfuncs support to utility statements

От
Andres Freund
Дата:
Hi,

On 2022-08-24 17:25:31 +0200, Peter Eisentraut wrote:
> Here are patches for that.

These patches have been failing since they were posted, afaict:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/39/3848

I assume that's known? Most of the failures seem to be things like
diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/tablespace.out
/tmp/cirrus-ci-build/build/testrun/main/regress/results/tablespace.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/tablespace.out    2022-09-22 12:30:07.340655000 +0000
+++ /tmp/cirrus-ci-build/build/testrun/main/regress/results/tablespace.out    2022-09-22 12:35:15.075825000 +0000
@@ -3,6 +3,8 @@
 ERROR:  tablespace location must be an absolute path
 -- empty tablespace locations are not usually allowed
 CREATE TABLESPACE regress_tblspace LOCATION ''; -- fail
+WARNING:  outfuncs/readfuncs failed to produce an equal raw parse tree
+WARNING:  outfuncs/readfuncs failed to produce an equal rewritten parse tree
 ERROR:  tablespace location must be an absolute path
 -- as a special developer-only option to allow us to use tablespaces
 -- with streaming replication on the same server, an empty location

Greetings,

Andres Freund



Re: Extending outfuncs support to utility statements

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2022-08-24 17:25:31 +0200, Peter Eisentraut wrote:
>> Here are patches for that.

> These patches have been failing since they were posted, afaict:
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/39/3848

> I assume that's known?

I think this is the issue Peter mentioned about needing to distinguish
between empty strings and NULL strings.  We're going to need to rethink
the behavior of pg_strtok() a bit to fix that.

            regards, tom lane



Re: Extending outfuncs support to utility statements

От
Tom Lane
Дата:
I wrote:
> I think this is the issue Peter mentioned about needing to distinguish
> between empty strings and NULL strings.  We're going to need to rethink
> the behavior of pg_strtok() a bit to fix that.

After staring at the code a bit, I think we don't need to touch
pg_strtok() per se.  I propose that this can be resolved with changes
at the next higher level.  Let's make outToken print NULL as <> as
it always has, but print an empty string as "" (two double quotes).
If the raw input string is two double quotes, print it as \"" to
disambiguate.  This'd require a catversion bump when committed,
but I don't think there are any showstopper problems otherwise.

I'll work on fleshing that idea out.

            regards, tom lane



Re: Extending outfuncs support to utility statements

От
Andres Freund
Дата:
On 2022-09-22 12:48:47 -0400, Tom Lane wrote:
> I wrote:
> > I think this is the issue Peter mentioned about needing to distinguish
> > between empty strings and NULL strings.  We're going to need to rethink
> > the behavior of pg_strtok() a bit to fix that.
> 
> After staring at the code a bit, I think we don't need to touch
> pg_strtok() per se.  I propose that this can be resolved with changes
> at the next higher level.  Let's make outToken print NULL as <> as
> it always has, but print an empty string as "" (two double quotes).
> If the raw input string is two double quotes, print it as \"" to
> disambiguate.  This'd require a catversion bump when committed,
> but I don't think there are any showstopper problems otherwise.

Makes sense to me.



Re: Extending outfuncs support to utility statements

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2022-09-22 12:48:47 -0400, Tom Lane wrote:
>> After staring at the code a bit, I think we don't need to touch
>> pg_strtok() per se.  I propose that this can be resolved with changes
>> at the next higher level.  Let's make outToken print NULL as <> as
>> it always has, but print an empty string as "" (two double quotes).
>> If the raw input string is two double quotes, print it as \"" to
>> disambiguate.  This'd require a catversion bump when committed,
>> but I don't think there are any showstopper problems otherwise.

> Makes sense to me.

Here is a version of all-but-the-last patch in Peter's series.
I left off the last one because it fails check-world: we now
get through the core regression tests okay, but then the pg_dump
tests fail on the new SQL function.  To fix that, we would have
to extend ruleutils.c's get_utility_query_def() to be able to
fully reconstruct any legal utility query ... which seems like
a pretty dauntingly large amount of tedious manual effort to
start with, and then also a nontrivial additional requirement
on any future patch that adds new utility syntax.  Are we sure
it's worth going there?

But I think it's probably worth committing what we have here
just on testability grounds.

Some notes:

0001, 0002 not changed.

I tweaked 0003 a bit, mainly because I think it's probably not very
safe to apply strncmp to a string we don't know the length of.
It might be difficult to fall off the end of memory that way, but
I wouldn't bet it's impossible.  Also, adding the length checks gets
rid of the need for a grotty order dependency in _readA_Expr().

0004 fixes the empty-string problem as per above.

I did not like what you'd done about imprecise floats one bit.
I think we ought to do it as in 0005 instead: drop all the
hard-wired precision assumptions and just print per Ryu.

0006, 0007, 0008 are basically the same as your previous 0004,
0005, 0006, except for getting rid of the float hacking in 0005.

If you're good with this approach to the float issue, I think
this set is committable (minus 0006 of course, and don't forget
the catversion bump).

            regards, tom lane

From 89def573ced57fc320ad191613dac62c8992c27a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Fri, 12 Aug 2022 21:15:40 +0200
Subject: [PATCH v2 1/9] Fix reading of most-negative integer value nodes

The main parser checks whether a literal fits into an int when
deciding whether it should be put into an Integer or Float node.  The
parser processes integer literals without signs.  So a most-negative
integer literal will not fit into Integer and will end up as a Float
node.

The node tokenizer did this differently.  It included the sign when
checking whether the literal fit into int.  So a most-negative integer
would indeed fit that way and end up as an Integer node.

In order to preserve the node structure correctly, we need the node
tokenizer to also analyze integer literals without sign.

There are a number of test cases in the regression tests that have a
most-negative integer argument of some utility statement, so this
issue is easily reproduced under WRITE_READ_PARSE_PLAN_TREES.
---
 src/backend/nodes/read.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/nodes/read.c b/src/backend/nodes/read.c
index 4a54996b63..a9cb81b129 100644
--- a/src/backend/nodes/read.c
+++ b/src/backend/nodes/read.c
@@ -267,7 +267,7 @@ nodeTokenType(const char *token, int length)
         char       *endptr;

         errno = 0;
-        (void) strtoint(token, &endptr, 10);
+        (void) strtoint(numptr, &endptr, 10);
         if (endptr != token + length || errno == ERANGE)
             return T_Float;
         return T_Integer;
--
2.37.1

From 8a469d0a7195be4ecc65155b82c5836dfb13b920 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Fri, 12 Aug 2022 21:16:26 +0200
Subject: [PATCH v2 2/9] Fix reading of BitString nodes

The node tokenizer went out of its way to store BitString node values
without the leading 'b'.  But everything else in the system stores the
leading 'b'.  This would break if a BitString node is
read-printed-read.

Also, the node tokenizer didn't know that BitString node tokens could
also start with 'x'.
---
 src/backend/nodes/read.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/src/backend/nodes/read.c b/src/backend/nodes/read.c
index a9cb81b129..fe84f140ee 100644
--- a/src/backend/nodes/read.c
+++ b/src/backend/nodes/read.c
@@ -288,7 +288,7 @@ nodeTokenType(const char *token, int length)
         retval = T_Boolean;
     else if (*token == '"' && length > 1 && token[length - 1] == '"')
         retval = T_String;
-    else if (*token == 'b')
+    else if (*token == 'b' || *token == 'x')
         retval = T_BitString;
     else
         retval = OTHER_TOKEN;
@@ -471,11 +471,10 @@ nodeRead(const char *token, int tok_len)
             break;
         case T_BitString:
             {
-                char       *val = palloc(tok_len);
+                char       *val = palloc(tok_len + 1);

-                /* skip leading 'b' */
-                memcpy(val, token + 1, tok_len - 1);
-                val[tok_len - 1] = '\0';
+                memcpy(val, token, tok_len);
+                val[tok_len] = '\0';
                 result = (Node *) makeBitString(val);
                 break;
             }
--
2.37.1

From b0e5a42805f6e72f2d9c6a8ef693539027fe9c64 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 22 Sep 2022 15:08:20 -0400
Subject: [PATCH v2 3/9] Add read support for some missing raw parse nodes

The node types A_Const, Constraint, and A_Expr had custom output
functions, but no read functions were implemented so far.

The A_Expr output format had to be tweaked a bit to make it easier to
parse.

Be a bit more cautious about applying strncmp to unterminated strings.

Also error out if an unrecognized enum value is found in each case,
instead of just printing a placeholder value.  That was maybe ok for
debugging but won't work if we want to have robust round-tripping.

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 60610e3a4b..24ea0487e7 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -548,12 +548,12 @@ _outA_Expr(StringInfo str, const A_Expr *node)
             WRITE_NODE_FIELD(name);
             break;
         case AEXPR_OP_ANY:
-            WRITE_NODE_FIELD(name);
             appendStringInfoString(str, " ANY");
+            WRITE_NODE_FIELD(name);
             break;
         case AEXPR_OP_ALL:
-            WRITE_NODE_FIELD(name);
             appendStringInfoString(str, " ALL");
+            WRITE_NODE_FIELD(name);
             break;
         case AEXPR_DISTINCT:
             appendStringInfoString(str, " DISTINCT");
@@ -600,7 +600,7 @@ _outA_Expr(StringInfo str, const A_Expr *node)
             WRITE_NODE_FIELD(name);
             break;
         default:
-            appendStringInfoString(str, " ??");
+            elog(ERROR, "unrecognized A_Expr_Kind: %d", (int) node->kind);
             break;
     }
 
@@ -782,8 +782,7 @@ _outConstraint(StringInfo str, const Constraint *node)
             break;
 
         default:
-            appendStringInfo(str, "<unrecognized_constraint %d>",
-                             (int) node->contype);
+            elog(ERROR, "unrecognized ConstrType: %d", (int) node->contype);
             break;
     }
 }
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index bee62fc15c..89a9a50f7c 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -270,11 +270,11 @@ _readBoolExpr(void)
     /* do-it-yourself enum representation */
     token = pg_strtok(&length); /* skip :boolop */
     token = pg_strtok(&length); /* get field value */
-    if (strncmp(token, "and", 3) == 0)
+    if (length == 3 && strncmp(token, "and", 3) == 0)
         local_node->boolop = AND_EXPR;
-    else if (strncmp(token, "or", 2) == 0)
+    else if (length == 2 && strncmp(token, "or", 2) == 0)
         local_node->boolop = OR_EXPR;
-    else if (strncmp(token, "not", 3) == 0)
+    else if (length == 3 && strncmp(token, "not", 3) == 0)
         local_node->boolop = NOT_EXPR;
     else
         elog(ERROR, "unrecognized boolop \"%.*s\"", length, token);
@@ -285,6 +285,162 @@ _readBoolExpr(void)
     READ_DONE();
 }
 
+static A_Const *
+_readA_Const(void)
+{
+    READ_LOCALS(A_Const);
+
+    token = pg_strtok(&length);
+    if (length == 4 && strncmp(token, "NULL", 4) == 0)
+        local_node->isnull = true;
+    else
+    {
+        union ValUnion *tmp = nodeRead(NULL, 0);
+
+        memcpy(&local_node->val, tmp, sizeof(*tmp));
+    }
+
+    READ_LOCATION_FIELD(location);
+
+    READ_DONE();
+}
+
+/*
+ * _readConstraint
+ */
+static Constraint *
+_readConstraint(void)
+{
+    READ_LOCALS(Constraint);
+
+    READ_STRING_FIELD(conname);
+    READ_BOOL_FIELD(deferrable);
+    READ_BOOL_FIELD(initdeferred);
+    READ_LOCATION_FIELD(location);
+
+    token = pg_strtok(&length); /* skip :contype */
+    token = pg_strtok(&length); /* get field value */
+    if (length == 4 && strncmp(token, "NULL", 4) == 0)
+        local_node->contype = CONSTR_NULL;
+    else if (length == 8 && strncmp(token, "NOT_NULL", 8) == 0)
+        local_node->contype = CONSTR_NOTNULL;
+    else if (length == 7 && strncmp(token, "DEFAULT", 7) == 0)
+        local_node->contype = CONSTR_DEFAULT;
+    else if (length == 8 && strncmp(token, "IDENTITY", 8) == 0)
+        local_node->contype = CONSTR_IDENTITY;
+    else if (length == 9 && strncmp(token, "GENERATED", 9) == 0)
+        local_node->contype = CONSTR_GENERATED;
+    else if (length == 5 && strncmp(token, "CHECK", 5) == 0)
+        local_node->contype = CONSTR_CHECK;
+    else if (length == 11 && strncmp(token, "PRIMARY_KEY", 11) == 0)
+        local_node->contype = CONSTR_PRIMARY;
+    else if (length == 6 && strncmp(token, "UNIQUE", 6) == 0)
+        local_node->contype = CONSTR_UNIQUE;
+    else if (length == 9 && strncmp(token, "EXCLUSION", 9) == 0)
+        local_node->contype = CONSTR_EXCLUSION;
+    else if (length == 11 && strncmp(token, "FOREIGN_KEY", 11) == 0)
+        local_node->contype = CONSTR_FOREIGN;
+    else if (length == 15 && strncmp(token, "ATTR_DEFERRABLE", 15) == 0)
+        local_node->contype = CONSTR_ATTR_DEFERRABLE;
+    else if (length == 19 && strncmp(token, "ATTR_NOT_DEFERRABLE", 19) == 0)
+        local_node->contype = CONSTR_ATTR_NOT_DEFERRABLE;
+    else if (length == 13 && strncmp(token, "ATTR_DEFERRED", 13) == 0)
+        local_node->contype = CONSTR_ATTR_DEFERRED;
+    else if (length == 14 && strncmp(token, "ATTR_IMMEDIATE", 14) == 0)
+        local_node->contype = CONSTR_ATTR_IMMEDIATE;
+
+    switch (local_node->contype)
+    {
+        case CONSTR_NULL:
+        case CONSTR_NOTNULL:
+            /* no extra fields */
+            break;
+
+        case CONSTR_DEFAULT:
+            READ_NODE_FIELD(raw_expr);
+            READ_STRING_FIELD(cooked_expr);
+            break;
+
+        case CONSTR_IDENTITY:
+            READ_NODE_FIELD(options);
+            READ_CHAR_FIELD(generated_when);
+            break;
+
+        case CONSTR_GENERATED:
+            READ_NODE_FIELD(raw_expr);
+            READ_STRING_FIELD(cooked_expr);
+            READ_CHAR_FIELD(generated_when);
+            break;
+
+        case CONSTR_CHECK:
+            READ_BOOL_FIELD(is_no_inherit);
+            READ_NODE_FIELD(raw_expr);
+            READ_STRING_FIELD(cooked_expr);
+            READ_BOOL_FIELD(skip_validation);
+            READ_BOOL_FIELD(initially_valid);
+            break;
+
+        case CONSTR_PRIMARY:
+            READ_NODE_FIELD(keys);
+            READ_NODE_FIELD(including);
+            READ_NODE_FIELD(options);
+            READ_STRING_FIELD(indexname);
+            READ_STRING_FIELD(indexspace);
+            READ_BOOL_FIELD(reset_default_tblspc);
+            /* access_method and where_clause not currently used */
+            break;
+
+        case CONSTR_UNIQUE:
+            READ_BOOL_FIELD(nulls_not_distinct);
+            READ_NODE_FIELD(keys);
+            READ_NODE_FIELD(including);
+            READ_NODE_FIELD(options);
+            READ_STRING_FIELD(indexname);
+            READ_STRING_FIELD(indexspace);
+            READ_BOOL_FIELD(reset_default_tblspc);
+            /* access_method and where_clause not currently used */
+            break;
+
+        case CONSTR_EXCLUSION:
+            READ_NODE_FIELD(exclusions);
+            READ_NODE_FIELD(including);
+            READ_NODE_FIELD(options);
+            READ_STRING_FIELD(indexname);
+            READ_STRING_FIELD(indexspace);
+            READ_BOOL_FIELD(reset_default_tblspc);
+            READ_STRING_FIELD(access_method);
+            READ_NODE_FIELD(where_clause);
+            break;
+
+        case CONSTR_FOREIGN:
+            READ_NODE_FIELD(pktable);
+            READ_NODE_FIELD(fk_attrs);
+            READ_NODE_FIELD(pk_attrs);
+            READ_CHAR_FIELD(fk_matchtype);
+            READ_CHAR_FIELD(fk_upd_action);
+            READ_CHAR_FIELD(fk_del_action);
+            READ_NODE_FIELD(fk_del_set_cols);
+            READ_NODE_FIELD(old_conpfeqop);
+            READ_OID_FIELD(old_pktable_oid);
+            READ_BOOL_FIELD(skip_validation);
+            READ_BOOL_FIELD(initially_valid);
+            break;
+
+        case CONSTR_ATTR_DEFERRABLE:
+        case CONSTR_ATTR_NOT_DEFERRABLE:
+        case CONSTR_ATTR_DEFERRED:
+        case CONSTR_ATTR_IMMEDIATE:
+            /* no extra fields */
+            break;
+
+        default:
+            elog(ERROR, "unrecognized ConstrType: %d", (int) local_node->contype);
+            break;
+    }
+
+    READ_DONE();
+}
+
 static RangeTblEntry *
 _readRangeTblEntry(void)
 {
@@ -376,6 +532,93 @@ _readRangeTblEntry(void)
     READ_DONE();
 }
 
+static A_Expr *
+_readA_Expr(void)
+{
+    READ_LOCALS(A_Expr);
+
+    token = pg_strtok(&length);
+
+    if (length == 3 && strncmp(token, "ANY", 3) == 0)
+    {
+        local_node->kind = AEXPR_OP_ANY;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 3 && strncmp(token, "ALL", 3) == 0)
+    {
+        local_node->kind = AEXPR_OP_ALL;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 8 && strncmp(token, "DISTINCT", 8) == 0)
+    {
+        local_node->kind = AEXPR_DISTINCT;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 12 && strncmp(token, "NOT_DISTINCT", 12) == 0)
+    {
+        local_node->kind = AEXPR_NOT_DISTINCT;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 6 && strncmp(token, "NULLIF", 6) == 0)
+    {
+        local_node->kind = AEXPR_NULLIF;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 2 && strncmp(token, "IN", 2) == 0)
+    {
+        local_node->kind = AEXPR_IN;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 4 && strncmp(token, "LIKE", 4) == 0)
+    {
+        local_node->kind = AEXPR_LIKE;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 5 && strncmp(token, "ILIKE", 5) == 0)
+    {
+        local_node->kind = AEXPR_ILIKE;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 7 && strncmp(token, "SIMILAR", 7) == 0)
+    {
+        local_node->kind = AEXPR_SIMILAR;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 7 && strncmp(token, "BETWEEN", 7) == 0)
+    {
+        local_node->kind = AEXPR_BETWEEN;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 11 && strncmp(token, "NOT_BETWEEN", 11) == 0)
+    {
+        local_node->kind = AEXPR_NOT_BETWEEN;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 11 && strncmp(token, "BETWEEN_SYM", 11) == 0)
+    {
+        local_node->kind = AEXPR_BETWEEN_SYM;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 15 && strncmp(token, "NOT_BETWEEN_SYM", 15) == 0)
+    {
+        local_node->kind = AEXPR_NOT_BETWEEN_SYM;
+        READ_NODE_FIELD(name);
+    }
+    else if (length == 5 && strncmp(token, ":name", 5) == 0)
+    {
+        local_node->kind = AEXPR_OP;
+        local_node->name = nodeRead(NULL, 0);
+    }
+    else
+        elog(ERROR, "unrecognized A_Expr kind: \"%.*s\"", length, token);
+
+    READ_NODE_FIELD(lexpr);
+    READ_NODE_FIELD(rexpr);
+    READ_LOCATION_FIELD(location);
+
+    READ_DONE();
+}
+
 static ExtensibleNode *
 _readExtensibleNode(void)
 {
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6958306a7d..aead2afd6e 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -291,7 +291,7 @@ typedef enum A_Expr_Kind
 
 typedef struct A_Expr
 {
-    pg_node_attr(custom_read_write, no_read)
+    pg_node_attr(custom_read_write)
 
     NodeTag        type;
     A_Expr_Kind kind;            /* see above */
@@ -319,7 +319,7 @@ union ValUnion
 
 typedef struct A_Const
 {
-    pg_node_attr(custom_copy_equal, custom_read_write, no_read)
+    pg_node_attr(custom_copy_equal, custom_read_write)
 
     NodeTag        type;
     union ValUnion val;
@@ -2332,7 +2332,7 @@ typedef enum ConstrType            /* types of constraints */
 
 typedef struct Constraint
 {
-    pg_node_attr(custom_read_write, no_read)
+    pg_node_attr(custom_read_write)
 
     NodeTag        type;
     ConstrType    contype;        /* see above */
From c54032ec63334e260fc5075dfd29e97c2e739a45 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 22 Sep 2022 15:15:20 -0400
Subject: [PATCH v2 4/9] Fix write/read of empty string fields in Nodes.

Historically, outToken has represented both NULL and empty-string
strings as "<>", which readfuncs.c then read as NULL, thus failing
to preserve empty-string fields accurately.  Remarkably, this has
not caused any serious problems yet, but let's fix it.

We'll keep the "<>" notation for NULL, and use """" for empty string,
because that matches other notational choices already in use.
An actual input string of """" is converted to "\""" (this was true
already, apparently as a hangover from an ancient time when string
quoting was handled directly by pg_strtok).

CHAR fields also use "<>", but for '\0'.

diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 24ea0487e7..63dda75ae5 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -135,16 +135,23 @@ static void outChar(StringInfo str, char c);
  *      Convert an ordinary string (eg, an identifier) into a form that
  *      will be decoded back to a plain token by read.c's functions.
  *
- *      If a null or empty string is given, it is encoded as "<>".
+ *      If a null string pointer is given, it is encoded as '<>'.
+ *      An empty string is encoded as '""'.  To avoid ambiguity, input
+ *      strings beginning with '<' or '"' receive a leading backslash.
  */
 void
 outToken(StringInfo str, const char *s)
 {
-    if (s == NULL || *s == '\0')
+    if (s == NULL)
     {
         appendStringInfoString(str, "<>");
         return;
     }
+    if (*s == '\0')
+    {
+        appendStringInfoString(str, "\"\"");
+        return;
+    }
 
     /*
      * Look for characters or patterns that are treated specially by read.c
@@ -178,6 +185,13 @@ outChar(StringInfo str, char c)
 {
     char        in[2];
 
+    /* Traditionally, we've represented \0 as <>, so keep doing that */
+    if (c == '\0')
+    {
+        appendStringInfoString(str, "<>");
+        return;
+    }
+
     in[0] = c;
     in[1] = '\0';
 
@@ -636,7 +650,8 @@ _outString(StringInfo str, const String *node)
 {
     /*
      * We use outToken to provide escaping of the string's content, but we
-     * don't want it to do anything with an empty string.
+     * don't want it to convert an empty string to '""', because we're putting
+     * double quotes around the string already.
      */
     appendStringInfoChar(str, '"');
     if (node->sval[0] != '\0')
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 296104b175..251dff4242 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -178,8 +178,18 @@
 
 #define strtobool(x)  ((*(x) == 't') ? true : false)
 
-#define nullable_string(token,length)  \
-    ((length) == 0 ? NULL : debackslash(token, length))
+static char *
+nullable_string(const char *token, int length)
+{
+    /* outToken emits <> for NULL, and pg_strtok makes that an empty string */
+    if (length == 0)
+        return NULL;
+    /* outToken emits "" for empty string */
+    if (length == 2 && token[0] == '"' && token[1] == '"')
+        return pstrdup("");
+    /* otherwise, we must remove protective backslashes added by outToken */
+    return debackslash(token, length);
+}
 
 
 /*
From 0c39b0f5ad26ae55649a3da12f55938d9adcee2b Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 22 Sep 2022 15:44:18 -0400
Subject: [PATCH v2 5/9] Don't lose precision for float fields of Nodes.

Historically we've been more worried about making the output of
float fields look pretty than whether they'd be read back exactly.
That won't work if we're to compare the read-back nodes for
equality, so switch to using the Ryu code for float output.

diff --git a/src/backend/nodes/gen_node_support.pl b/src/backend/nodes/gen_node_support.pl
index b707a09f56..81b8c184a9 100644
--- a/src/backend/nodes/gen_node_support.pl
+++ b/src/backend/nodes/gen_node_support.pl
@@ -983,29 +983,29 @@ _read${n}(void)
         }
         elsif ($t eq 'double')
         {
-            print $off "\tWRITE_FLOAT_FIELD($f, \"%.6f\");\n";
+            print $off "\tWRITE_FLOAT_FIELD($f);\n";
             print $rff "\tREAD_FLOAT_FIELD($f);\n" unless $no_read;
         }
         elsif ($t eq 'Cardinality')
         {
-            print $off "\tWRITE_FLOAT_FIELD($f, \"%.0f\");\n";
+            print $off "\tWRITE_FLOAT_FIELD($f);\n";
             print $rff "\tREAD_FLOAT_FIELD($f);\n" unless $no_read;
         }
         elsif ($t eq 'Cost')
         {
-            print $off "\tWRITE_FLOAT_FIELD($f, \"%.2f\");\n";
+            print $off "\tWRITE_FLOAT_FIELD($f);\n";
             print $rff "\tREAD_FLOAT_FIELD($f);\n" unless $no_read;
         }
         elsif ($t eq 'QualCost')
         {
-            print $off "\tWRITE_FLOAT_FIELD($f.startup, \"%.2f\");\n";
-            print $off "\tWRITE_FLOAT_FIELD($f.per_tuple, \"%.2f\");\n";
+            print $off "\tWRITE_FLOAT_FIELD($f.startup);\n";
+            print $off "\tWRITE_FLOAT_FIELD($f.per_tuple);\n";
             print $rff "\tREAD_FLOAT_FIELD($f.startup);\n"   unless $no_read;
             print $rff "\tREAD_FLOAT_FIELD($f.per_tuple);\n" unless $no_read;
         }
         elsif ($t eq 'Selectivity')
         {
-            print $off "\tWRITE_FLOAT_FIELD($f, \"%.4f\");\n";
+            print $off "\tWRITE_FLOAT_FIELD($f);\n";
             print $rff "\tREAD_FLOAT_FIELD($f);\n" unless $no_read;
         }
         elsif ($t eq 'char*')
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 63dda75ae5..64c65f060b 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -17,6 +17,7 @@
 #include <ctype.h>

 #include "access/attnum.h"
+#include "common/shortest_dec.h"
 #include "lib/stringinfo.h"
 #include "miscadmin.h"
 #include "nodes/bitmapset.h"
@@ -25,6 +26,7 @@
 #include "utils/datum.h"

 static void outChar(StringInfo str, char c);
+static void outDouble(StringInfo str, double d);


 /*
@@ -69,9 +71,10 @@ static void outChar(StringInfo str, char c);
     appendStringInfo(str, " :" CppAsString(fldname) " %d", \
                      (int) node->fldname)

-/* Write a float field --- caller must give format to define precision */
-#define WRITE_FLOAT_FIELD(fldname,format) \
-    appendStringInfo(str, " :" CppAsString(fldname) " " format, node->fldname)
+/* Write a float field (actually, they're double) */
+#define WRITE_FLOAT_FIELD(fldname) \
+    (appendStringInfo(str, " :" CppAsString(fldname) " "), \
+     outDouble(str, node->fldname))

 /* Write a boolean field */
 #define WRITE_BOOL_FIELD(fldname) \
@@ -198,6 +201,18 @@ outChar(StringInfo str, char c)
     outToken(str, in);
 }

+/*
+ * Convert a double value, attempting to ensure the value is preserved exactly.
+ */
+static void
+outDouble(StringInfo str, double d)
+{
+    char        buf[DOUBLE_SHORTEST_DECIMAL_LEN];
+
+    double_to_shortest_decimal_buf(d, buf);
+    appendStringInfoString(str, buf);
+}
+
 /*
  * common implementation for scalar-array-writing functions
  *
@@ -525,7 +540,7 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
             break;
         case RTE_NAMEDTUPLESTORE:
             WRITE_STRING_FIELD(enrname);
-            WRITE_FLOAT_FIELD(enrtuples, "%.0f");
+            WRITE_FLOAT_FIELD(enrtuples);
             WRITE_OID_FIELD(relid);
             WRITE_NODE_FIELD(coltypes);
             WRITE_NODE_FIELD(coltypmods);
From 6479e315f700bc65aa8cbe66de07c8fdffd0d8eb Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 10 Aug 2022 23:37:31 +0200
Subject: [PATCH v2 6/9] XXX: Turn on WRITE_READ_PARSE_PLAN_TREES for testing

Don't commit this one!
---
 src/include/pg_config_manual.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/include/pg_config_manual.h b/src/include/pg_config_manual.h
index f2a106f983..f85a7a7312 100644
--- a/src/include/pg_config_manual.h
+++ b/src/include/pg_config_manual.h
@@ -334,7 +334,7 @@
  * outfuncs.c/readfuncs.c, to facilitate catching errors and omissions in
  * those modules.
  */
-/* #define WRITE_READ_PARSE_PLAN_TREES */
+#define WRITE_READ_PARSE_PLAN_TREES

 /*
  * Define this to force all raw parse trees for DML statements to be scanned
--
2.37.1

From c9aa70f126477ae5c17a74a5c7b32133dc5f7fb3 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 22 Sep 2022 15:59:28 -0400
Subject: [PATCH v2 7/9] Implement WRITE_READ_PARSE_PLAN_TREES for raw parse trees


diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 35eff28bd3..4952d01183 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -603,10 +603,22 @@ pg_parse_query(const char *query_string)
 #endif

     /*
-     * Currently, outfuncs/readfuncs support is missing for many raw parse
-     * tree nodes, so we don't try to implement WRITE_READ_PARSE_PLAN_TREES
-     * here.
+     * Optional debugging check: pass raw parsetrees through
+     * outfuncs/readfuncs
      */
+#ifdef WRITE_READ_PARSE_PLAN_TREES
+    {
+        char       *str = nodeToString(raw_parsetree_list);
+        List       *new_list = stringToNodeWithLocations(str);
+
+        pfree(str);
+        /* This checks both outfuncs/readfuncs and the equal() routines... */
+        if (!equal(new_list, raw_parsetree_list))
+            elog(WARNING, "outfuncs/readfuncs failed to produce an equal raw parse tree");
+        else
+            raw_parsetree_list = new_list;
+    }
+#endif

     TRACE_POSTGRESQL_QUERY_PARSE_DONE(query_string);

From 92a720e9fbeff96e17185858bbb811f654ecc80f Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 22 Sep 2022 16:05:32 -0400
Subject: [PATCH v2 8/9] Enable WRITE_READ_PARSE_PLAN_TREES of rewritten utility statements

This was previously disabled because we lacked outfuncs/readfuncs
support for most utility statement types.

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 4952d01183..5352d5f4c6 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -801,7 +801,7 @@ pg_rewrite_query(Query *query)
         new_list = copyObject(querytree_list);
         /* This checks both copyObject() and the equal() routines... */
         if (!equal(new_list, querytree_list))
-            elog(WARNING, "copyObject() failed to produce equal parse tree");
+            elog(WARNING, "copyObject() failed to produce an equal rewritten parse tree");
         else
             querytree_list = new_list;
     }
@@ -813,35 +813,25 @@ pg_rewrite_query(Query *query)
         List       *new_list = NIL;
         ListCell   *lc;

-        /*
-         * We currently lack outfuncs/readfuncs support for most utility
-         * statement types, so only attempt to write/read non-utility queries.
-         */
         foreach(lc, querytree_list)
         {
             Query       *query = lfirst_node(Query, lc);
+            char       *str = nodeToString(query);
+            Query       *new_query = stringToNodeWithLocations(str);

-            if (query->commandType != CMD_UTILITY)
-            {
-                char       *str = nodeToString(query);
-                Query       *new_query = stringToNodeWithLocations(str);
-
-                /*
-                 * queryId is not saved in stored rules, but we must preserve
-                 * it here to avoid breaking pg_stat_statements.
-                 */
-                new_query->queryId = query->queryId;
+            /*
+             * queryId is not saved in stored rules, but we must preserve it
+             * here to avoid breaking pg_stat_statements.
+             */
+            new_query->queryId = query->queryId;

-                new_list = lappend(new_list, new_query);
-                pfree(str);
-            }
-            else
-                new_list = lappend(new_list, query);
+            new_list = lappend(new_list, new_query);
+            pfree(str);
         }

         /* This checks both outfuncs/readfuncs and the equal() routines... */
         if (!equal(new_list, querytree_list))
-            elog(WARNING, "outfuncs/readfuncs failed to produce equal parse tree");
+            elog(WARNING, "outfuncs/readfuncs failed to produce an equal rewritten parse tree");
         else
             querytree_list = new_list;
     }

Re: Extending outfuncs support to utility statements

От
Tom Lane
Дата:
I wrote:
> I left off the last one because it fails check-world: we now
> get through the core regression tests okay, but then the pg_dump
> tests fail on the new SQL function.  To fix that, we would have
> to extend ruleutils.c's get_utility_query_def() to be able to
> fully reconstruct any legal utility query ... which seems like
> a pretty dauntingly large amount of tedious manual effort to
> start with, and then also a nontrivial additional requirement
> on any future patch that adds new utility syntax.  Are we sure
> it's worth going there?

Thinking about that some more, I wondered if we'd even wish to
build such code, compared to just saving the original source text
for utility statements and printing that.  Obviously, this loses
all the benefits of new-style SQL functions compared to old-style
... except that those benefits would be illusory anyway, since by
definition we have not done parse analysis on a utility statement.
So we *cannot* offer any useful guarantees about being search_path
change proof, following renames of referenced objects, preventing
drops of referenced objects, etc etc.

This makes me wonder if this is a feature we even want.  If we
put it in, we'd have to add a bunch of disclaimers about how
utility statements behave entirely differently from DML statements.

Perhaps an interesting alternative is to allow a command along
the lines of

    EXECUTE string-expression

(of course that name is already taken) where we'd parse-analyze
the string-expression at function creation, but then the computed
string is executed as a SQL command in the runtime environment.
This would make it fairly clear which things you have guarantees
of and which you don't.  It'd also offer a feature that the PLs
have but SQL functions traditionally haven't, ie execution of
dynamically-computed SQL.

Anyway, this is a bit far afield from the stated topic of this
thread.  I think we should commit something approximately like
what I posted and then start a new thread specifically about
what we'd like to do about utility commands in new-style SQL
functions.

            regards, tom lane



Re: Extending outfuncs support to utility statements

От
Peter Eisentraut
Дата:
On 22.09.22 23:21, Tom Lane wrote:
> Anyway, this is a bit far afield from the stated topic of this
> thread.  I think we should commit something approximately like
> what I posted and then start a new thread specifically about
> what we'd like to do about utility commands in new-style SQL
> functions.

Right, I have committed everything and will close the CF entry.  I don't 
have a specific idea about how to move forward right now.





Re: Extending outfuncs support to utility statements

От
Alexander Lakhin
Дата:
Hello,

26.09.2022 17:46, Peter Eisentraut wrote:
On 22.09.22 23:21, Tom Lane wrote:
Anyway, this is a bit far afield from the stated topic of this
thread.  I think we should commit something approximately like
what I posted and then start a new thread specifically about
what we'd like to do about utility commands in new-style SQL
functions.

Right, I have committed everything and will close the CF entry.  I don't have a specific idea about how to move forward right now.

Please look at the function _readA_Const() (introduced in a6bc33019), which fails on current master under valgrind:
CPPFLAGS="-DUSE_VALGRIND -DWRITE_READ_PARSE_PLAN_TREES -Og " ./configure -q --enable-debug && make -s -j8 && make check

============== creating temporary instance            ==============
============== initializing database system           ==============

pg_regress: initdb failed
Examine .../src/test/regress/log/initdb.log for the reason.

initdb.log contains:
performing post-bootstrap initialization ... ==00:00:00:02.155 3419654== Invalid read of size 16
==00:00:00:02.155 3419654==    at 0x448691: memcpy (string_fortified.h:29)
==00:00:00:02.155 3419654==    by 0x448691: _readA_Const (readfuncs.c:315)
==00:00:00:02.155 3419654==    by 0x44CCD2: parseNodeString (readfuncs.switch.c:129)
==00:00:00:02.155 3419654==    by 0x4348D6: nodeRead (read.c:338)
==00:00:00:02.155 3419654==    by 0x434879: nodeRead (read.c:452)
==00:00:00:02.155 3419654==    by 0x440E6C: _readTypeName (readfuncs.funcs.c:830)
==00:00:00:02.155 3419654==    by 0x44CC3A: parseNodeString (readfuncs.switch.c:121)
==00:00:00:02.155 3419654==    by 0x4348D6: nodeRead (read.c:338)
==00:00:00:02.155 3419654==    by 0x43D51D: _readFunctionParameter (readfuncs.funcs.c:2513)
==00:00:00:02.155 3419654==    by 0x44DE0C: parseNodeString (readfuncs.switch.c:367)
==00:00:00:02.155 3419654==    by 0x4348D6: nodeRead (read.c:338)
==00:00:00:02.155 3419654==    by 0x434879: nodeRead (read.c:452)
==00:00:00:02.155 3419654==    by 0x438A9C: _readCreateFunctionStmt (readfuncs.funcs.c:2499)
==00:00:00:02.155 3419654==  Address 0xf12f718 is 0 bytes inside a block of size 8 client-defined
==00:00:00:02.155 3419654==    at 0x6A70C3: MemoryContextAllocZeroAligned (mcxt.c:1109)
==00:00:00:02.155 3419654==    by 0x450C31: makeInteger (value.c:25)
==00:00:00:02.155 3419654==    by 0x434D59: nodeRead (read.c:482)
==00:00:00:02.155 3419654==    by 0x448690: _readA_Const (readfuncs.c:313)
==00:00:00:02.155 3419654==    by 0x44CCD2: parseNodeString (readfuncs.switch.c:129)
==00:00:00:02.155 3419654==    by 0x4348D6: nodeRead (read.c:338)
==00:00:00:02.155 3419654==    by 0x434879: nodeRead (read.c:452)
==00:00:00:02.155 3419654==    by 0x440E6C: _readTypeName (readfuncs.funcs.c:830)
==00:00:00:02.155 3419654==    by 0x44CC3A: parseNodeString (readfuncs.switch.c:121)
==00:00:00:02.155 3419654==    by 0x4348D6: nodeRead (read.c:338)
==00:00:00:02.155 3419654==    by 0x43D51D: _readFunctionParameter (readfuncs.funcs.c:2513)
==00:00:00:02.155 3419654==    by 0x44DE0C: parseNodeString (readfuncs.switch.c:367)
==00:00:00:02.155 3419654==

Here _readA_Const() performs:
                union ValUnion *tmp = nodeRead(NULL, 0);

                memcpy(&local_node->val, tmp, sizeof(*tmp));

where sizeof(union ValUnion) = 16, but nodeRead()->makeInteger() produced Integer (sizeof(Integer) = 8).

Best regards,
Alexander

Re: Extending outfuncs support to utility statements

От
Tom Lane
Дата:
Alexander Lakhin <exclusion@gmail.com> writes:
> Please look at the function _readA_Const() (introduced in a6bc33019), which fails on current master under valgrind:
> ...
> Here _readA_Const() performs:
>                  union ValUnion *tmp = nodeRead(NULL, 0);

>                  memcpy(&local_node->val, tmp, sizeof(*tmp));

> where sizeof(union ValUnion) = 16, but nodeRead()->makeInteger() produced Integer (sizeof(Integer) = 8).

Right, so we can't get away without a switch-on-value-type like the
other functions for A_Const have.  Will fix.

            regards, tom lane