Обсуждение: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

Поиск

Список

Период

Сортировка

misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

jian he

Дата:

21 августа, 14:59:14

hi.

while reviewing other work, some error messages in src/backend/tcop/utility.c
seem not accurate.

static void
ProcessUtilitySlow(ParseState *pstate,
                   PlannedStmt *pstmt,
                   const char *queryString,
                   ProcessUtilityContext context,
                   ParamListInfo params,
                   QueryEnvironment *queryEnv,
                   DestReceiver *dest,
                   QueryCompletion *qc)

            case T_CreateStatsStmt:
                {
                    Oid            relid;
                    CreateStatsStmt *stmt = (CreateStatsStmt *) parsetree;
                    RangeVar   *rel = (RangeVar *) linitial(stmt->relations);

                    if (!IsA(rel, RangeVar))
                        ereport(ERROR,
                                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                                 errmsg("only a single relation is
allowed in CREATE STATISTICS")));
                    relid = RangeVarGetRelid(rel,
ShareUpdateExclusiveLock, false);
                    /* Run parse analysis ... */
                    stmt = transformStatsStmt(relid, stmt, queryString);
                    address = CreateStatistics(stmt);
                }


for example:

create or replace function tftest(int) returns table(a int, b int) as $$
begin
  return query select $1, $1+i from generate_series(1,5) g(i);
end;
$$ language plpgsql immutable strict;

CREATE STATISTICS alt_stat2 ON a, b FROM tftest(1);
ERROR:  only a single relation is allowed in CREATE STATISTICS

this error message seem misleading?
also this error check seems duplicated in CreateStatistics?

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Kirill Reshke

Дата:

21 августа, 18:20:52

On Thu, 21 Aug 2025 at 17:00, jian he <jian.universality@gmail.com> wrote:
>
> hi.

Hi!

>   RangeVar   *rel = (RangeVar *) linitial(stmt->relations);
>   if (!IsA(rel, RangeVar))

These two lines are weird. Looks like  linitial(stmt->relations)
should be assigned to variable with type Node* first?

>
> for example:
>
> create or replace function tftest(int) returns table(a int, b int) as $$
> begin
>   return query select $1, $1+i from generate_series(1,5) g(i);
> end;
> $$ language plpgsql immutable strict;
>
> CREATE STATISTICS alt_stat2 ON a, b FROM tftest(1);
> ERROR:  only a single relation is allowed in CREATE STATISTICS
>
> this error message seem misleading?

I wouldn’t say this is misleading, but " a single relation" is indeed
not precise enough. IMO we need a more precise term to distinguish
regular relation and table func.

> also this error check seems duplicated in CreateStatistics?
>

Indeed.

--
Best regards,
Kirill Reshke

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Tom Lane

Дата:

21 августа, 18:27:59

Kirill Reshke <reshkekirill@gmail.com> writes:
> On Thu, 21 Aug 2025 at 17:00, jian he <jian.universality@gmail.com> wrote:
>> RangeVar   *rel = (RangeVar *) linitial(stmt->relations);
>> if (!IsA(rel, RangeVar))

> These two lines are weird. Looks like  linitial(stmt->relations)
> should be assigned to variable with type Node* first?

We take that sort of shortcut in many places.  If there's not any need
for the code to deal with more than one node type, an extra variable
that's used just for the IsA test seems like a lot of notational
overhead.

            regards, tom lane

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Álvaro Herrera

Дата:

22 августа, 12:46:45

On 2025-Aug-21, Kirill Reshke wrote:

> I wouldn’t say this is misleading, but " a single relation" is indeed
> not precise enough. IMO we need a more precise term to distinguish
> regular relation and table func.

I'm not sure.  See the definition of relation in the glossary:
https://www.postgresql.org/docs/18/glossary.html#GLOSSARY-RELATION

  The generic term for all objects in a database that have a name and a
  list of attributes defined in a specific order. Tables, sequences,
  views, foreign tables, materialized views, composite types, and
  indexes are all relations.

  More generically, a relation is a set of tuples; for example, the
  result of a query is also a relation.

  In PostgreSQL, Class is an archaic synonym for relation.

(I wonder why this says "generically" rather than "generally".  Is that
word choice a mistake?)  Maybe in the "For example" clause we can also
mention table functions.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Kirill Reshke

Дата:

22 августа, 13:00:17

On Fri, 22 Aug 2025 at 14:46, Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> On 2025-Aug-21, Kirill Reshke wrote:
>
> > I wouldn’t say this is misleading, but " a single relation" is indeed
> > not precise enough. IMO we need a more precise term to distinguish
> > regular relation and table func.
>
> I'm not sure.  See the definition of relation in the glossary:
> https://www.postgresql.org/docs/18/glossary.html#GLOSSARY-RELATION
>
>   The generic term for all objects in a database that have a name and a
>   list of attributes defined in a specific order. Tables, sequences,
>   views, foreign tables, materialized views, composite types, and
>   indexes are all relations.
>
>   More generically, a relation is a set of tuples; for example, the
>   result of a query is also a relation.
>
>   In PostgreSQL, Class is an archaic synonym for relation.
>
> (I wonder why this says "generically" rather than "generally".  Is that
> word choice a mistake?)  Maybe in the "For example" clause we can also
> mention table functions.
>
> --
> Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/

I am sorry: I am not following. CREATE STATISTIC will work only for
single HEAP (or other AM) relations. So, for "simple regular tables"
as one can say (not tablefunc).

You say: relation is a term for both HEAP relation, tablefunc relation
and much more.
I say:  "a single relation" in the error message is not precise enough.

Where do we disagree?

Anyway, I would say correct error message here should be:

```
db=# CREATE STATISTICS alt_stat2 ON a, b FROM tftest(1);
ERROR:  cannot define statistics for relation "alt_stat2"
DETAIL:  This operation is not supported for query result.
```

--
Best regards,
Kirill Reshke

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

jian he

Дата:

22 августа, 13:17:19

hi.

will
+               if (!IsA(rln, RangeVar))
+                       ereport(ERROR,
+                                       errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                                       errmsg("CREATE STATISTICS only
supports regular tables, materialized views, foreign tables, and
partitioned tables"));

solve the problem?

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Richard Guo

Дата:

22 августа, 17:19:49

On Fri, Aug 22, 2025 at 6:46 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
> I'm not sure.  See the definition of relation in the glossary:
> https://www.postgresql.org/docs/18/glossary.html#GLOSSARY-RELATION
>
>   The generic term for all objects in a database that have a name and a
>   list of attributes defined in a specific order. Tables, sequences,
>   views, foreign tables, materialized views, composite types, and
>   indexes are all relations.
>
>   More generically, a relation is a set of tuples; for example, the
>   result of a query is also a relation.
>
>   In PostgreSQL, Class is an archaic synonym for relation.
>
> (I wonder why this says "generically" rather than "generally".  Is that
> word choice a mistake?)

"More generally" feels more natural to me than "more generically" in
this sentence, but I'm not a native English speaker, so I could be
wrong.

Thanks
Richard

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Tom Lane

Дата:

22 августа, 17:37:24

=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
> I'm not sure.  See the definition of relation in the glossary:
> https://www.postgresql.org/docs/18/glossary.html#GLOSSARY-RELATION

>   The generic term for all objects in a database that have a name and a
>   list of attributes defined in a specific order. Tables, sequences,
>   views, foreign tables, materialized views, composite types, and
>   indexes are all relations.

>   More generically, a relation is a set of tuples; for example, the
>   result of a query is also a relation.

>   In PostgreSQL, Class is an archaic synonym for relation.

> (I wonder why this says "generically" rather than "generally".  Is that
> word choice a mistake?)  Maybe in the "For example" clause we can also
> mention table functions.

Yeah, I think s/generically/generally/ would be an improvement.

I'm not certain, but I think that our use of "relation" to mean
"an object with a pg_class entry" is a Postgres-ism.  I do know
that the meaning of "a set of tuples" is widely used, as that's
where the term "relational database" comes from.  Maybe whoever
wrote this was trying to get at that point?  But this text is
hardly clear about that.

            regards, tom lane

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Álvaro Herrera

Дата:

22 августа, 18:34:06

On 2025-Aug-22, Kirill Reshke wrote:

> I am sorry: I am not following. CREATE STATISTIC will work only for
> single HEAP (or other AM) relations. So, for "simple regular tables"
> as one can say (not tablefunc).

Ah yeah, you're right, I think we should say "a single table", or maybe
"a single table or materialized view" if the latter case is supported (I
don't remember offhand if it is).

The error message is mostly concerned with rejecting the case of
multiple relations being given in the FROM clause, because we said at
the time that we needed to make the grammar flexible enough to support
that case, but make it clear that it wasn't implemented yet.  (It's a
pity that we haven't implemented that thus far ... I suppose it must be
a difficult problem.)

> DETAIL:  This operation is not supported for query result.

I would say "This operation is only supported for tables [and
matviews]."  We don't need to list all the things that aren't supported,
I think.  But your example here is wrong:

> db=# CREATE STATISTICS alt_stat2 ON a, b FROM tftest(1);
> ERROR:  cannot define statistics for relation "alt_stat2"
> DETAIL:  This operation is not supported for query result.

It's not alt_stat2 that's the problem (that's the extstats object being
created), but tftest(1).  I don't know if we need to specify that the
problem is in the FROM clause here -- seems obvious enough to me -- but
maybe someone opines differently?

ERROR:  cannot define statistics for relation "tftest"
DETAIL:  The FROM clause may only contain a single table.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Si quieres ser creativo, aprende el arte de perder el tiempo"

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Álvaro Herrera

Дата:

22 августа, 18:35:20

On 2025-Aug-22, Tom Lane wrote:

> I'm not certain, but I think that our use of "relation" to mean
> "an object with a pg_class entry" is a Postgres-ism.  I do know
> that the meaning of "a set of tuples" is widely used, as that's
> where the term "relational database" comes from.  Maybe whoever
> wrote this was trying to get at that point?  But this text is
> hardly clear about that.

Yeah, AFAIR I wrote it (or at least heavily edited the original to get
to that), and I was trying to convey exactly those two ideas.  If you
want to propose improvements, they're very welcome.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Puedes vivir sólo una vez, pero si lo haces bien, una vez es suficiente"

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Tom Lane

Дата:

22 августа, 18:50:52

=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
> Yeah, AFAIR I wrote it (or at least heavily edited the original to get
> to that), and I was trying to convey exactly those two ideas.  If you
> want to propose improvements, they're very welcome.

Hmmm ... maybe something like

    Mathematically, a "relation" is a set of tuples; this is the sense
    meant in the term "relational database".

    In Postgres, "relation" is commonly used to mean a database object
    that has a name and a list of attributes defined in a specific
    order. Tables, sequences, views, foreign tables, materialized
    views, composite types, and indexes are all relations. A relation
    in this sense is a container or descriptor for a set of tuples.

    "Class" is an alternative but archaic term.  The system catalog
    pg_class holds an entry for each Postgres relation.

            regards, tom lane

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Álvaro Herrera

Дата:

28 августа, 19:23:46

On 2025-Aug-22, Tom Lane wrote:

> Hmmm ... maybe something like
> 
>     Mathematically, a "relation" is a set of tuples; this is the sense
>     meant in the term "relational database".
> 
>     In Postgres, "relation" is commonly used to mean a database object
>     that has a name and a list of attributes defined in a specific
>     order. Tables, sequences, views, foreign tables, materialized
>     views, composite types, and indexes are all relations. A relation
>     in this sense is a container or descriptor for a set of tuples.
> 
>     "Class" is an alternative but archaic term.  The system catalog
>     pg_class holds an entry for each Postgres relation.

Thanks, pushed like that.  I changed "a database object" to "an SQL
object", because that's a term we have a definition for.

(I also wrote "PostgreSQL" where you had "Postgres".  I think it might
be okay now to change the product name in various places here, but it
seems better to do it consistently across the whole page.)

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"¿Qué importan los años?  Lo que realmente importa es comprobar que
a fin de cuentas la mejor edad de la vida es estar vivo"  (Mafalda)

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Tom Lane

Дата:

28 августа, 20:17:24

=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
> Thanks, pushed like that.  I changed "a database object" to "an SQL
> object", because that's a term we have a definition for.
> (I also wrote "PostgreSQL" where you had "Postgres".  I think it might
> be okay now to change the product name in various places here, but it
> seems better to do it consistently across the whole page.)

LGTM, thanks.  (My wording was only meant as a draft ...)

            regards, tom lane

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Álvaro Herrera

Дата:

29 августа, 00:33:42

On 2025-Aug-21, jian he wrote:

>             case T_CreateStatsStmt:
>                 {
>                     Oid            relid;
>                     CreateStatsStmt *stmt = (CreateStatsStmt *) parsetree;
>                     RangeVar   *rel = (RangeVar *) linitial(stmt->relations);
> 
>                     if (!IsA(rel, RangeVar))
>                         ereport(ERROR,
>                                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                                  errmsg("only a single relation is
> allowed in CREATE STATISTICS")));
>                     relid = RangeVarGetRelid(rel,
> ShareUpdateExclusiveLock, false);
>                     /* Run parse analysis ... */
>                     stmt = transformStatsStmt(relid, stmt, queryString);
>                     address = CreateStatistics(stmt);
>                 }

Yeah, I think there's room to argue that this ereport() is just wrongly
copy-and-pasted from other nearby errors, and that it should have
completely different wording.  BTW another way to cause this is

CREATE STATISTICS alt_stat2 ON a, b FROM xmltable('foo' passing 'bar' columns a text);

I propose to use this report:

ERROR:  cannot create statistics on specified relation
DETAIL:  CREATE STATISTICS only supports tables, materialized views, foreign tables, and partitioned tables.

(Not sold on saying just "tables" vs. "regular tables" or "plain tables";
thoughts?)

While looking at this whole thing, I noticed that this code is somewhat
bogus in a different way: we're resolving the relation name twice, both
here in ProcessUtilitySlow() (to pass to transformStatsStmt), and later
again inside CreateStatistics().  It's really strange that we decided
not to pass the relation Oids as a list argument to CreateStatistics.  I
suggest to change that as the first attached patch.

Now, such a change would be appropriate only for branch master; it seems too
invasive for stable ones.  For them I propose we only change the error message,
as in the other attachment.

We should add a couple of test cases for this particular error message
also.  It's bad that these changes don't break any tests.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Once again, thank you and all of the developers for your hard work on
PostgreSQL.  This is by far the most pleasant management experience of
any database I've worked on."                             (Dan Harris)
http://archives.postgresql.org/pgsql-performance/2006-04/msg00247.php

Вложения

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Tom Lane

Дата:

29 августа, 00:46:39

=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
> I propose to use this report:

> ERROR:  cannot create statistics on specified relation
> DETAIL:  CREATE STATISTICS only supports tables, materialized views, foreign tables, and partitioned tables.

WFM, although I think you could shorten it to "tables, materialized
views, and foreign tables".  We generally expect that partitioned
tables are included when saying "tables", no?  I'm not dead set on
that either way, though.

> While looking at this whole thing, I noticed that this code is somewhat
> bogus in a different way: we're resolving the relation name twice, both
> here in ProcessUtilitySlow() (to pass to transformStatsStmt), and later
> again inside CreateStatistics().  It's really strange that we decided
> not to pass the relation Oids as a list argument to CreateStatistics.  I
> suggest to change that as the first attached patch.

I think this is a good idea, but I'm not sure that the locking
considerations are straight in this version.  Once we translate a
relation name to OID, we'd better hold lock on the rel continuously to
the end of usage of that OID.  It looks like you do, but then couldn't
the

+        rel = relation_open(relid, ShareUpdateExclusiveLock);

in CreateStatistics use NoLock to signify that we expect to have
the lock already?

Alternatively, maybe the right fix is to move the parse-analysis
work into CreateStatistics altogether.  I think there is not a
very good argument for ProcessUtilitySlow doing all that stuff.
It's supposed to mainly just be a dispatching switch(), IMO.

            regards, tom lane

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

jian he

Дата:

29 августа, 07:07:24

On Fri, Aug 29, 2025 at 5:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> =?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
> > I propose to use this report:
>
> > ERROR:  cannot create statistics on specified relation
> > DETAIL:  CREATE STATISTICS only supports tables, materialized views, foreign tables, and partitioned tables.
>
> WFM, although I think you could shorten it to "tables, materialized
> views, and foreign tables".  We generally expect that partitioned
> tables are included when saying "tables", no?  I'm not dead set on
> that either way, though.
>

https://www.postgresql.org/docs/current/sql-copy.html
use "COPY TO can be used only with plain tables, not views, and does
not copy rows from child tables or child partitions"

>
> Alternatively, maybe the right fix is to move the parse-analysis
> work into CreateStatistics altogether.  I think there is not a
> very good argument for ProcessUtilitySlow doing all that stuff.
> It's supposed to mainly just be a dispatching switch(), IMO.
>
seems doable.
transformStatsStmt, CreateStatistics both used only twice, refactoring arguments
should be fine.
please check the attached POC, regress tests also added.

main idea is
first check CreateStatsStmt->relations,
then call transformStatsStmt, transformStatsStmt only to transform
CreateStatsStmt->exprs.
also the above complaint about the relation lock issue will be resolved.

Вложения

v2-0001-refactor-CreateStatsStmt.patch

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Álvaro Herrera

Дата:

29 августа, 15:48:21

On 2025-Aug-29, jian he wrote:

> On Fri, Aug 29, 2025 at 5:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> > WFM, although I think you could shorten it to "tables, materialized
> > views, and foreign tables".  We generally expect that partitioned
> > tables are included when saying "tables", no?  I'm not dead set on
> > that either way, though.
> 
> https://www.postgresql.org/docs/current/sql-copy.html
> use "COPY TO can be used only with plain tables, not views, and does
> not copy rows from child tables or child partitions"

I'm inclined to think that we should only mention partitioned tables
specifically when they for some reason deviate from what we do for
regular tables, i.e., what Tom is saying.  I don't think we've had an
explicit, consistent rule for that thus far, so there may be places
where we fail to follow it.

Anyway, I have pushed the error message change.

> > Alternatively, maybe the right fix is to move the parse-analysis
> > work into CreateStatistics altogether.  I think there is not a
> > very good argument for ProcessUtilitySlow doing all that stuff.
> > It's supposed to mainly just be a dispatching switch(), IMO.
>
> seems doable.
> transformStatsStmt, CreateStatistics both used only twice, refactoring
> arguments should be fine.
> please check the attached POC, regress tests also added.

Yeah, I like how this turned out.  I found out this was introduced in
commit a4d75c86bf15.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

jian he

Дата:

04 сентября, 07:54:37

On Fri, Aug 29, 2025 at 8:48 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> > > Alternatively, maybe the right fix is to move the parse-analysis
> > > work into CreateStatistics altogether.  I think there is not a
> > > very good argument for ProcessUtilitySlow doing all that stuff.
> > > It's supposed to mainly just be a dispatching switch(), IMO.
> >
> > seems doable.
> > transformStatsStmt, CreateStatistics both used only twice, refactoring
> > arguments should be fine.
> > please check the attached POC, regress tests also added.
>
> Yeah, I like how this turned out.  I found out this was introduced in
> commit a4d75c86bf15.

Previously, CreateStatistics and ProcessUtilitySlow opened the same relation
twice with a ShareUpdateExclusiveLock.  This refactor ensures the relation is
opened with ShareUpdateExclusiveLock only once and also reduces redundant error
checks.
The logic is now more intuitive: we first error checking
CreateStatsStmt->relations and then call transformStatsStmt to parse analysis
CreateStatsStmt->exprs.

please check the attached refactor CreateStatsStmt.

Вложения

v3-0001-refactor-CreateStatsStmt.patch

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Peter Eisentraut

Дата:

05 сентября, 15:30:05

On 29.08.25 14:48, Álvaro Herrera wrote:
> On 2025-Aug-29, jian he wrote:
> 
>> On Fri, Aug 29, 2025 at 5:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
>>> WFM, although I think you could shorten it to "tables, materialized
>>> views, and foreign tables".  We generally expect that partitioned
>>> tables are included when saying "tables", no?  I'm not dead set on
>>> that either way, though.
>>
>> https://www.postgresql.org/docs/current/sql-copy.html
>> use "COPY TO can be used only with plain tables, not views, and does
>> not copy rows from child tables or child partitions"
> 
> I'm inclined to think that we should only mention partitioned tables
> specifically when they for some reason deviate from what we do for
> regular tables, i.e., what Tom is saying.  I don't think we've had an
> explicit, consistent rule for that thus far, so there may be places
> where we fail to follow it.
> 
> Anyway, I have pushed the error message change.

I think this message is still wrong.  The check doesn't even look at the 
relation kind, which is what the message is implying.  (If the message 
were about relkinds, then it should use 
errdetail_relkind_not_supported().)  It checks that the from list entry 
is a table name instead of some other thing like VALUES or JOIN.  So it 
should be something like

CREATE STATISTICS only supports plain table names in the FROM clause

The same could also be accomplished by changing the grammar from

     FROM from_list

to

     FROM qualified_name_list

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Peter Eisentraut

Дата:

12 сентября, 16:41:18

On 05.09.25 14:30, Peter Eisentraut wrote:
> On 29.08.25 14:48, Álvaro Herrera wrote:
>> On 2025-Aug-29, jian he wrote:
>>
>>> On Fri, Aug 29, 2025 at 5:46 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>>>> WFM, although I think you could shorten it to "tables, materialized
>>>> views, and foreign tables".  We generally expect that partitioned
>>>> tables are included when saying "tables", no?  I'm not dead set on
>>>> that either way, though.
>>>
>>> https://www.postgresql.org/docs/current/sql-copy.html
>>> use "COPY TO can be used only with plain tables, not views, and does
>>> not copy rows from child tables or child partitions"
>>
>> I'm inclined to think that we should only mention partitioned tables
>> specifically when they for some reason deviate from what we do for
>> regular tables, i.e., what Tom is saying.  I don't think we've had an
>> explicit, consistent rule for that thus far, so there may be places
>> where we fail to follow it.
>>
>> Anyway, I have pushed the error message change.
> 
> I think this message is still wrong.  The check doesn't even look at the 
> relation kind, which is what the message is implying.  (If the message 
> were about relkinds, then it should use 
> errdetail_relkind_not_supported().)  It checks that the from list entry 
> is a table name instead of some other thing like VALUES or JOIN.  So it 
> should be something like
> 
> CREATE STATISTICS only supports plain table names in the FROM clause

I propose the attached patch to fix this.  I think this restores the 
original meaning better.

Вложения

v4-0001-CREATE-STATISTICS-improve-misleading-error-messag.patch

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Tom Lane

Дата:

12 сентября, 16:49:44

Peter Eisentraut <peter@eisentraut.org> writes:
> I propose the attached patch to fix this.  I think this restores the 
> original meaning better.

I'm okay with this wording change, but I would stay with
ERRCODE_FEATURE_NOT_SUPPORTED rather than calling this
a "syntax error".  It's not a syntax error IMV, but a
potential feature that we have deliberately left syntax
space for, even though we don't yet have ideas about
a workable implementation.

            regards, tom lane

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Peter Eisentraut

Дата:

15 сентября, 13:16:39

On 12.09.25 15:49, Tom Lane wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
>> I propose the attached patch to fix this.  I think this restores the
>> original meaning better.
> 
> I'm okay with this wording change, but I would stay with
> ERRCODE_FEATURE_NOT_SUPPORTED rather than calling this
> a "syntax error".  It's not a syntax error IMV, but a
> potential feature that we have deliberately left syntax
> space for, even though we don't yet have ideas about
> a workable implementation.

Ok, done that way.

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

jian he

Дата:

02 октября, 06:30:23

hi.

moving all code in ProcessUtilitySlow (case T_CreateStatsStmt:)
to CreateStatistic seems more intuitive to me.

so I rebased v3.

Вложения

v5-0001-refactor-CreateStatsStmt.patch

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Álvaro Herrera

Дата:

17 октября, 15:13:08

On 2025-Oct-02, jian he wrote:

> hi.
> 
> moving all code in ProcessUtilitySlow (case T_CreateStatsStmt:)
> to CreateStatistic seems more intuitive to me.
> 
> so I rebased v3.

Yeah, this looks good.  But I think we can go a little further.  Because
CreateStatistics() now calls transformStatsStmt(), we don't need to do
the same in ATPostAlterTypeParse(), which allows us to simplify the code
there.  In turn this means we can pass the whole Relation rather than
merely the OID, so transformStatsStmt doesn't need to open it.  I attach
v4 with those changes.  Comments?

For a moment it looked weird to me to pass a Relation to the parse
analysis routine, but I see that other functions declared in
parse_utilcmd.h are already doing that.

One more thing I noticed is that we're scribbling on the parser node,
which we can easily avoid by creating a copy of the node in
transformStatsStmt() before doing any changes.  I'm not too clear
on whether this is really necessary or not.  We do it in other places,
but we haven't done it here for a long while, and nothing seems to break
because of that.

diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index 89c8315117b..7797c707f73 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -3150,6 +3150,9 @@ transformStatsStmt(Relation rel, CreateStatsStmt *stmt, const char *queryString)
     if (stmt->transformed)
         return stmt;

+    /* create a copy to scribble on */
+    stmt = copyObject(stmt);
+
     /* Set up pstate */
     pstate = make_parsestate(NULL);
     pstate->p_sourcetext = queryString;

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Uno puede defenderse de los ataques; contra los elogios se esta indefenso"

Вложения

v4-0001-Restructure-CreateStatsStmt-parse-analysis-proces.patch

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

jian he

Дата:

20 октября, 06:41:39

On Fri, Oct 17, 2025 at 8:13 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> Yeah, this looks good.  But I think we can go a little further.  Because
> CreateStatistics() now calls transformStatsStmt(), we don't need to do
> the same in ATPostAlterTypeParse(), which allows us to simplify the code
> there.  In turn this means we can pass the whole Relation rather than
> merely the OID, so transformStatsStmt doesn't need to open it.  I attach
> v4 with those changes.  Comments?
>

/*
 * transformStatsStmt - parse analysis for CREATE STATISTICS
 *
 * To avoid race conditions, it's important that this function relies only on
 * the passed-in rel (and not on stmt->relation) as the target relation.
 */
CreateStatsStmt *
transformStatsStmt(Relation rel, CreateStatsStmt *stmt, const char *queryString)
{
......
}

the (Relation rel) effectively comes from "stmt->relations", which
conflict with the above
comments.


> For a moment it looked weird to me to pass a Relation to the parse
> analysis routine, but I see that other functions declared in
> parse_utilcmd.h are already doing that.
>
>
> One more thing I noticed is that we're scribbling on the parser node,
> which we can easily avoid by creating a copy of the node in
> transformStatsStmt() before doing any changes.  I'm not too clear
> on whether this is really necessary or not.  We do it in other places,
> but we haven't done it here for a long while, and nothing seems to break
> because of that.
>
> diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
> index 89c8315117b..7797c707f73 100644
> --- a/src/backend/parser/parse_utilcmd.c
> +++ b/src/backend/parser/parse_utilcmd.c
> @@ -3150,6 +3150,9 @@ transformStatsStmt(Relation rel, CreateStatsStmt *stmt, const char *queryString)
>         if (stmt->transformed)
>                 return stmt;
>
> +       /* create a copy to scribble on */
> +       stmt = copyObject(stmt);
> +
>         /* Set up pstate */
>         pstate = make_parsestate(NULL);
>         pstate->p_sourcetext = queryString;
>
in src/backend/parser/parse_utilcmd.c, we have only two occurences of
copyObject.

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Álvaro Herrera

Дата:

20 октября, 10:31:38

On 2025-Oct-20, jian he wrote:

> On Fri, Oct 17, 2025 at 8:13 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:

> /*
>  * transformStatsStmt - parse analysis for CREATE STATISTICS
>  *
>  * To avoid race conditions, it's important that this function relies only on
>  * the passed-in rel (and not on stmt->relation) as the target relation.
>  */
> CreateStatsStmt *
> transformStatsStmt(Relation rel, CreateStatsStmt *stmt, const char *queryString)

> the (Relation rel) effectively comes from "stmt->relations", which
> conflict with the above comments.

Hmm?  It does not.  The whole point is that the relation name (RangeVar
stmt->relations) has already been resolved to an OID and locked, which
is what we pass as 'Relation rel'.  Trying to resolve the name to an OID
again opens us up for race conditions.  This is alleviated if the
relation has already been locked, so maybe we can relax this comment;
but it's not outright contradictory AFAICS.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Oh, great altar of passive entertainment, bestow upon me thy discordant images
at such speed as to render linear thought impossible" (Calvin a la TV)

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

jian he

Дата:

20 октября, 15:03:06

On Mon, Oct 20, 2025 at 3:31 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> On 2025-Oct-20, jian he wrote:
>
> > On Fri, Oct 17, 2025 at 8:13 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> > /*
> >  * transformStatsStmt - parse analysis for CREATE STATISTICS
> >  *
> >  * To avoid race conditions, it's important that this function relies only on
> >  * the passed-in rel (and not on stmt->relation) as the target relation.
> >  */
> > CreateStatsStmt *
> > transformStatsStmt(Relation rel, CreateStatsStmt *stmt, const char *queryString)
>
> > the (Relation rel) effectively comes from "stmt->relations", which
> > conflict with the above comments.
>
> Hmm?  It does not.  The whole point is that the relation name (RangeVar
> stmt->relations) has already been resolved to an OID and locked, which
> is what we pass as 'Relation rel'.  Trying to resolve the name to an OID
> again opens us up for race conditions.  This is alleviated if the
> relation has already been locked, so maybe we can relax this comment;
> but it's not outright contradictory AFAICS.
>

I think I understand your point.
When ALTER TABLE SET DATA TYPE invokes CreateStatistics, if the Relation (rel)
returned by CreateStatistics->relation_openrv is not the same as
ATPostAlterTypeParse.oldRelId, the regression test would already fail.


@@ -15632,11 +15623,6 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId,
Oid refRelId, char *cmd,
  querytree_list = lappend(querytree_list, stmt);
  querytree_list = list_concat(querytree_list, afterStmts);
  }
- else if (IsA(stmt, CreateStatsStmt))
- querytree_list = lappend(querytree_list,
- transformStatsStmt(oldRelId,
- (CreateStatsStmt *) stmt,
- cmd));

The above "cmd" is the queryString, which is useful for error reporting.

create table t(a  int);
create statistics xxx on (a + 1 is not null) from t;
alter table t alter column a set data type text;

with patch, v4-0001-Restructure-CreateStatsStmt-parse-analysis-proces.patch
ERROR:  operator does not exist: text + integer
DETAIL:  No operator of that name accepts the given argument types.
HINT:  You might need to add explicit type casts.


In the master branch, the error message also includes the error position.
ERROR:  operator does not exist: text + integer
LINE 1: alter table t alter column a set data type text;
                                            ^
DETAIL:  No operator of that name accepts the given argument types.
HINT:  You might need to add explicit type casts.

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Álvaro Herrera

Дата:

20 октября, 18:29:21

On 2025-Oct-20, jian he wrote:

> @@ -15632,11 +15623,6 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId,
> Oid refRelId, char *cmd,
>   querytree_list = lappend(querytree_list, stmt);
>   querytree_list = list_concat(querytree_list, afterStmts);
>   }
> - else if (IsA(stmt, CreateStatsStmt))
> - querytree_list = lappend(querytree_list,
> - transformStatsStmt(oldRelId,
> - (CreateStatsStmt *) stmt,
> - cmd));
> 
> The above "cmd" is the queryString, which is useful for error reporting.
> 
> create table t(a  int);
> create statistics xxx on (a + 1 is not null) from t;
> alter table t alter column a set data type text;
> 
> with patch, v4-0001-Restructure-CreateStatsStmt-parse-analysis-proces.patch
> ERROR:  operator does not exist: text + integer
> DETAIL:  No operator of that name accepts the given argument types.
> HINT:  You might need to add explicit type casts.
> 
> In the master branch, the error message also includes the error position.
> ERROR:  operator does not exist: text + integer
> LINE 1: alter table t alter column a set data type text;
>                                             ^
> DETAIL:  No operator of that name accepts the given argument types.
> HINT:  You might need to add explicit type casts.

Interesting, thanks for the example.

I think this example illustrates very well why the 'cmd' is rather
useless here -- note how the error message refers to an operator that
appears nowhere in the query string.  The user says "SET DATA TYPE text"
and then we complain about the "text+integer" operator?  That makes no
sense and I don't think it's in any way useful.

I think a user-friendly thing we could do is add an errcontext callback
that shows that we're trying to reapply a statistics object.  We should
discuss that in a separate thread for a separate patch though, and also
I don't think that thread should prevent this rework from being applied
now.  We could add this test scenario to the regression tests though, if
only to show how it would change later.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

jian he

Дата:

13 ноября, 06:44:45

hi.

new patch attached.

Summary of the problem we’re trying to solve:
1. ProcessUtility for T_CreateStatsStmt, there’s a lot of code that doesn’t
belong there.
2. The current pattern of repeatedly performing RangeVar lookups isn’t ideal —
we should perform the lookup once and then use the corresponding relation OID
for subsequent operations.

The main part this patch is copied from
https://postgr.es/m/CA+TgmobyMXzoEzscCRDCggHRCTp1TW=Dm9pEhmwOYKos43WDAg@mail.gmail.com

Below is a copy of the commit message.
------------------
Previously, a lot of code would be run in the ProcessUtility pipeline,
where it is not welcome (that's supposed to be just a giant dispatcher
switch).  Move the parse analysis code to CreateStatistics() instead,
which then means we don't need to open the relation twice; it also
allows us to give a better error message when something other than a
relation name is given in the FROM clause.

Generally we should avoid look up the same less-than-fully-qualified name
multiple times, we might get different answers due to concurrent activity, and
that might create a security vulnerability, along the lines of CVE-2014-0062.

Refactor so that a CreateStatsStmt carries both the untransformed FROM clause
and the range table that we get when it's transformed. That way, once the name
lookups have been done, we can propagate the results forward to future
processing steps and avoid ever repeating the lookup.
------------------

--
jian
https://www.enterprisedb.com/

Вложения

v5-0001-Restructure-CreateStatsStmt-parse-analysis-processing.patch

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

Álvaro Herrera

Дата:

13 ноября, 11:24:52

On 2025-Nov-13, jian he wrote:

> @@ -15658,10 +15658,19 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
>              querytree_list = list_concat(querytree_list, afterStmts);
>          }
>          else if (IsA(stmt, CreateStatsStmt))
> +        {
> +            RangeTblEntry     *rte;
> +            CreateStatsStmt *ss = castNode(CreateStatsStmt, stmt);
> +
> +            rte = makeNode(RangeTblEntry);
> +            rte->rtekind = RTE_RELATION;
> +            rte->relid = oldRelId;
> +            rte->rellockmode = ShareUpdateExclusiveLock;
> +            ss->rtable = list_make1(rte);
> +
>              querytree_list = lappend(querytree_list,
> -                                     transformStatsStmt(oldRelId,
> -                                                        (CreateStatsStmt *) stmt,
> -                                                        cmd));
> +                                     transformStatsStmt(ss, cmd));
> +        }
>          else
>              querytree_list = lappend(querytree_list, stmt);
>      }

Hmm, how would this part here work in the hypothetical world where a
stats object references multiple relations?

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"No renuncies a nada. No te aferres a nada."

Re: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

От

jian he

Дата:

14 ноября, 07:06:49

On Thu, Nov 13, 2025 at 5:41 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> On 2025-Nov-13, jian he wrote:
>
> > @@ -15658,10 +15658,19 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
> >                       querytree_list = list_concat(querytree_list, afterStmts);
> >               }
> >               else if (IsA(stmt, CreateStatsStmt))
> > +             {
> > +                     RangeTblEntry   *rte;
> > +                     CreateStatsStmt *ss = castNode(CreateStatsStmt, stmt);
> > +
> > +                     rte = makeNode(RangeTblEntry);
> > +                     rte->rtekind = RTE_RELATION;
> > +                     rte->relid = oldRelId;
> > +                     rte->rellockmode = ShareUpdateExclusiveLock;
> > +                     ss->rtable = list_make1(rte);
> > +
> >                       querytree_list = lappend(querytree_list,
> > -                                                                      transformStatsStmt(oldRelId,
> > -
(CreateStatsStmt*) stmt, 
> > -
cmd));
> > +                                                                      transformStatsStmt(ss, cmd));
> > +             }
> >               else
> >                       querytree_list = lappend(querytree_list, stmt);
> >       }
>
> Hmm, how would this part here work in the hypothetical world where a
> stats object references multiple relations?
>
hi.

per
https://www.postgresql.org/docs/current/catalog-pg-statistic-ext.html
extended statistics either based on column name or expression.

If based on column name, imagine a simple case:
CREATE STATISTICS ON a FROM t1,t2;
pg_statistic_ext obviously needs another column to store other relation oids.
in this case, we can use AlteredTableInfo->changedStatisticsOids to lookup
catalog table pg_statistic_ext to find out the associated relation oids

If extended statistics is based on expression, imagine a simple case:
CREATE STATISTICS  on (t1.a + t2.a) from t1, t2;
ruleutils.c can not cope with expressions like: "(t1.a + t2.a)",
so we have to wrap the expression into a Query Node, store it in
pg_statistic_ext.stxexprs.

If pg_statistic_ext does not contain all associated relation
OIDs, all these pg_get_statisticsobjdef_expressions won't work

pg_statistic_ext should contain all associated relation OIDs, in my opinion. If
pg_statistic_ext does store all related relation OIDs, then the potential issue
you mentioned above should be solvable.

--
jian
https://www.enterprisedb.com/

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: misleading error message in ProcessUtilitySlow T_CreateStatsStmt

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения