Обсуждение: attndims, typndims still not enforced, but make the value within a sane threshold

Поиск
Список
Период
Сортировка

attndims, typndims still not enforced, but make the value within a sane threshold

От
jian he
Дата:
hi.

while looking at tablecmd.c, BuildDescForRelation
        attdim = list_length(entry->typeName->arrayBounds);
        if (attdim > PG_INT16_MAX)
            ereport(ERROR,
                    errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                    errmsg("too many array dimensions"))

makes me related to array_in refactor previously we did.
at first, i thought it should be "if (attdim > MAXDIM)"


pg_attribute attndims description in [1]
attndims int2
Number of dimensions, if the column is an array type; otherwise 0.
(Presently, the number of dimensions of an array is not enforced, so
any nonzero value effectively means “it's an array”.)

pg_type typndims description in [2]
typndims int4
typndims is the number of array dimensions for a domain over an array
(that is, typbasetype is an array type). Zero for types other than
domains over array types.

since array_in is the only source of the real array data.
MAXDIM (6) ensure the max dimension is 6.

Can we error out at the stage "create table", "create domain"
time if the attndims or typndims is larger than MAXDIM (6) ?

for example, error out the following queries  immediately
create table t112(a int[][] [][] [][] [][][]);
create domain d_text_arr text [1][][][][][][][];

in the doc, we can still say "the number of dimensions of an array is
not enforced",
but attndims, typndims value would be within a sane threshold.

We can change typndims from int4 to int2,
so array type's dimension is consistent with domain type's dimension.
but it seems with the change, pg_type occupies the same amount of
storage as int4.


[1] https://www.postgresql.org/docs/current/catalog-pg-attribute.html
[2] https://www.postgresql.org/docs/current/catalog-pg-type.html



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Tom Lane
Дата:
jian he <jian.universality@gmail.com> writes:
> Can we error out at the stage "create table", "create domain"
> time if the attndims or typndims is larger than MAXDIM (6) ?

The last time this was discussed, I think the conclusion was
we should remove attndims and typndims entirely on the grounds
that they're useless.  I certainly don't see a point in adding
more logic that could give the misleading impression that they
mean something.

            regards, tom lane



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
jian he
Дата:
On Wed, Sep 18, 2024 at 10:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> jian he <jian.universality@gmail.com> writes:
> > Can we error out at the stage "create table", "create domain"
> > time if the attndims or typndims is larger than MAXDIM (6) ?
>
> The last time this was discussed, I think the conclusion was
> we should remove attndims and typndims entirely on the grounds
> that they're useless.  I certainly don't see a point in adding
> more logic that could give the misleading impression that they
> mean something.
>

https://commitfest.postgresql.org/43/
search "dim" or "pg_attribute", no relevant result,
i am assuming, nobody doing work to remove attndims and typndims entirely?
If so, I will try to make one.



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Junwang Zhao
Дата:
On Fri, Sep 20, 2024 at 10:11 AM jian he <jian.universality@gmail.com> wrote:
>
> On Wed, Sep 18, 2024 at 10:35 PM jian he <jian.universality@gmail.com> wrote:
> >
> > > The last time this was discussed, I think the conclusion was
> > > we should remove attndims and typndims entirely on the grounds
> > > that they're useless.  I certainly don't see a point in adding
> > > more logic that could give the misleading impression that they
> > > mean something.
> > >
>
>
> attached patch removes attndims and typndims entirely.
> some tests skipped in my local my machine, not skipped are all OK.

Should you also bump the catalog version?

--
Regards
Junwang Zhao



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> On Fri, Sep 20, 2024 at 11:51:49AM +0800, Junwang Zhao wrote:
>> Should you also bump the catalog version?

> No need to worry about that when sending a patch because committers
> take care of that when merging a patch into the tree.  Doing that in
> each patch submitted just creates more conflicts and work for patch
> authors because they'd need to recolve conflicts each time a
> catversion bump happens.  And that can happen on a daily basis
> sometimes depending on what is committed.

Right.  Sometimes the committer forgets to do that :-(, which is
not great but it's not normally a big problem either.  We've concluded
it's better to err in that direction than impose additional work
on patch submitters.

If you feel concerned about the point, best practice is to include a
mention that catversion bump is needed in your draft commit message.

            regards, tom lane



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Junwang Zhao
Дата:
Hi Tom and Michael,

On Fri, Sep 20, 2024 at 12:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Michael Paquier <michael@paquier.xyz> writes:
> > On Fri, Sep 20, 2024 at 11:51:49AM +0800, Junwang Zhao wrote:
> >> Should you also bump the catalog version?
>
> > No need to worry about that when sending a patch because committers
> > take care of that when merging a patch into the tree.  Doing that in
> > each patch submitted just creates more conflicts and work for patch
> > authors because they'd need to recolve conflicts each time a
> > catversion bump happens.  And that can happen on a daily basis
> > sometimes depending on what is committed.
>
> Right.  Sometimes the committer forgets to do that :-(, which is
> not great but it's not normally a big problem either.  We've concluded
> it's better to err in that direction than impose additional work
> on patch submitters.
>
> If you feel concerned about the point, best practice is to include a
> mention that catversion bump is needed in your draft commit message.
>
>                         regards, tom lane

Got it, thanks for both of your explanations.

--
Regards
Junwang Zhao



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Bruce Momjian
Дата:
On Fri, Sep 20, 2024 at 10:11:00AM +0800, jian he wrote:
> On Wed, Sep 18, 2024 at 10:35 PM jian he <jian.universality@gmail.com> wrote:
> >
> > > The last time this was discussed, I think the conclusion was
> > > we should remove attndims and typndims entirely on the grounds
> > > that they're useless.  I certainly don't see a point in adding
> > > more logic that could give the misleading impression that they
> > > mean something.
> > >
> 
> 
> attached patch removes attndims and typndims entirely.
> some tests skipped in my local my machine, not skipped are all OK.

I have been hoping for a patch links this because I feel the existence
of these system columns is deceptive since we don't honor them properly.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  When a patient asks the doctor, "Am I going to die?", he means 
  "Am I going to die soon?"



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Andrew Dunstan
Дата:
On 2024-09-20 Fr 12:38 AM, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> On Fri, Sep 20, 2024 at 11:51:49AM +0800, Junwang Zhao wrote:
>>> Should you also bump the catalog version?
>> No need to worry about that when sending a patch because committers
>> take care of that when merging a patch into the tree.  Doing that in
>> each patch submitted just creates more conflicts and work for patch
>> authors because they'd need to recolve conflicts each time a
>> catversion bump happens.  And that can happen on a daily basis
>> sometimes depending on what is committed.
> Right.  Sometimes the committer forgets to do that :-(, which is
> not great but it's not normally a big problem either.  We've concluded
> it's better to err in that direction than impose additional work
> on patch submitters.


FWIW, I have a git pre-commit hook that helps avoid that. Essentially it 
checks to see if there are changes in src/include/catalog but not in 
catversion.h. That's not a 100% check, but it probably catches the vast 
majority of changes that would require a catversion bump.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Kirill Reshke
Дата:
On Fri, 20 Sept 2024 at 07:11, jian he <jian.universality@gmail.com> wrote:
> attached patch removes attndims and typndims entirely.

LGTM?
I see no open items in this thread. Are there any issues about v1?



-- 
Best regards,
Kirill Reshke



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Alvaro Herrera
Дата:
On 2024-Dec-05, Kirill Reshke wrote:

> On Fri, 20 Sept 2024 at 07:11, jian he <jian.universality@gmail.com> wrote:
> > attached patch removes attndims and typndims entirely.
> 
> LGTM?
> I see no open items in this thread. Are there any issues about v1?

Should we leave TupleDescInitEntry()'s API alone, to avoid breaking the
compilation of every extension in the world?  Maybe we could rename the
function (say to TupleDescInitializeEntry()) and use a define?

#define TupleDescInitEntry(a,b,c,d,e,f) TupleDescInitializeEntry(a,b,c,d,e)

Then the amount of churn is reduced, no extensions are broken, and we
can remove the define in a few years.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"Debido a que la velocidad de la luz es mucho mayor que la del sonido,
 algunas personas nos parecen brillantes un minuto antes
 de escuchar las pelotudeces que dicen." (Roberto Fontanarrosa)



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2024-Dec-05, Kirill Reshke wrote:
>> I see no open items in this thread. Are there any issues about v1?

> Should we leave TupleDescInitEntry()'s API alone, to avoid breaking the
> compilation of every extension in the world?

I fear the howls of pain from extension authors will be entirely
drowned out by the howls of pain from application authors whose
queries are broken by the disappearance of two catalog columns.
A quick search at codesearch.debian.net shows that while there's
not that many references to attndims, some of them are in such
never-heard-of-it applications as PHP.  typndims has some outside
queries fetching it as well.

I don't think we can do this as presented.  Maybe we could do
something like constrain the stored value to be only 0 or 1,
but how much does that improve matters?

            regards, tom lane



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Bruce Momjian
Дата:
On Thu, Dec  5, 2024 at 11:33:22AM -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > On 2024-Dec-05, Kirill Reshke wrote:
> >> I see no open items in this thread. Are there any issues about v1?
> 
> > Should we leave TupleDescInitEntry()'s API alone, to avoid breaking the
> > compilation of every extension in the world?
> 
> I fear the howls of pain from extension authors will be entirely
> drowned out by the howls of pain from application authors whose
> queries are broken by the disappearance of two catalog columns.
> A quick search at codesearch.debian.net shows that while there's
> not that many references to attndims, some of them are in such
> never-heard-of-it applications as PHP.  typndims has some outside
> queries fetching it as well.
> 
> I don't think we can do this as presented.  Maybe we could do
> something like constrain the stored value to be only 0 or 1,
> but how much does that improve matters?

Well, then the question becomes will we retain useless system columns
forever?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.





Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> On Thu, Dec  5, 2024 at 11:33:22AM -0500, Tom Lane wrote:
>> I don't think we can do this as presented.  Maybe we could do
>> something like constrain the stored value to be only 0 or 1,
>> but how much does that improve matters?

> Well, then the question becomes will we retain useless system columns
> forever?

If the choice is "not break PHP" versus "get rid of a vestigial column",
I know which one is the saner choice.  There might be things that are
worth breaking PHP for, but this isn't one.

            regards, tom lane



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Bruce Momjian
Дата:
On Thu, Dec  5, 2024 at 12:00:30PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > On Thu, Dec  5, 2024 at 11:33:22AM -0500, Tom Lane wrote:
> >> I don't think we can do this as presented.  Maybe we could do
> >> something like constrain the stored value to be only 0 or 1,
> >> but how much does that improve matters?
> 
> > Well, then the question becomes will we retain useless system columns
> > forever?
> 
> If the choice is "not break PHP" versus "get rid of a vestigial column",
> I know which one is the saner choice.  There might be things that are
> worth breaking PHP for, but this isn't one.

I guess I just don't want to get into a situation where we have so many
useless backward-compatible system columns it is distracting.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.





Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Bruce Momjian
Дата:
On Thu, Dec  5, 2024 at 12:03:30PM -0500, Bruce Momjian wrote:
> On Thu, Dec  5, 2024 at 12:00:30PM -0500, Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > On Thu, Dec  5, 2024 at 11:33:22AM -0500, Tom Lane wrote:
> > >> I don't think we can do this as presented.  Maybe we could do
> > >> something like constrain the stored value to be only 0 or 1,
> > >> but how much does that improve matters?
> > 
> > > Well, then the question becomes will we retain useless system columns
> > > forever?
> > 
> > If the choice is "not break PHP" versus "get rid of a vestigial column",
> > I know which one is the saner choice.  There might be things that are
> > worth breaking PHP for, but this isn't one.
> 
> I guess I just don't want to get into a situation where we have so many
> useless backward-compatible system columns it is distracting.

How about if we set attndims & typndims to zero in PG 18, and mention
that these fields are deprecated in the release notes.  Then, in a few
years, we can remove the columns since interfaces hopefully would have
removed the useless references to the columns.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.





Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> How about if we set attndims & typndims to zero in PG 18, and mention
> that these fields are deprecated in the release notes.  Then, in a few
> years, we can remove the columns since interfaces hopefully would have
> removed the useless references to the columns.

You have a mighty optimistic view of what will happen.  I predict
that if we do step (1), exactly nothing will happen in applications,
and step (2) will remain just as painful for them.  (Assuming that
we remember to do step (2), which is no sure thing given our track
record for following through "in a few years".)

            regards, tom lane



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> FWIW, my first thought after reading this paragraph is that you sound
> too dramatic here, especially after looking at codesearch to note that
> the PHP core code stores attndims but does not actually use it.

It appeared to me that it fetches it in order to return it in an
application inquiry function [1].  So to really gauge the damage,
you'd have to find out how many PHP applications pay attention
to the "array dims" field of pg_meta_data().  Maybe it's a trivial
number, or maybe not --- I didn't have the energy to search.

            regards, tom lane

[1] https://sources.debian.org/src/php8.2/8.2.26-4/ext/pgsql/pgsql.c/?hl=4245#L4245



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Bruce Momjian
Дата:
On Mon, Dec 23, 2024 at 01:56:24PM +0900, Michael Paquier wrote:
> On Thu, Dec 12, 2024 at 03:40:32PM +0800, jian he wrote:
> > remove pg_type typndims column patch attached.
> 
> FWIW, I have been paying more attention to applications that may use
> this attribute and bumped into quite a few cases that use quals based
> on (t.typndims > 0 AND t.typbasetype > 0) to check that they're
> dealing with domains over array types.  So even removing this switch
> would be unwise, I am afraid..

Well, I would be more excited about keeping the fields if they actually
were reliable in recording information.  This email from November 2023
explains the limitations of attndims:

   https://www.postgresql.org/message-id/flat/20150707072942.1186.98151@wrigleys.postgresql.org

*  dimensions not dumped by pg_dump
*  dimensions not propagated by CREATE TABLE ... (LIKE)
*  dimensions not propagated by CREATE TABLE AS
*  dimensions not displayed by psql

So, if users are referencing attndims and the values are not accurate,
how useful are they?  I was suggesting removal so people would stop
relying on inaccurate information, or we correct attndims to be
accurate.  Allowing people to continue relying on inaccurate information
seems like the worst of all options.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.





Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Tom Lane
Дата:
Bruce Momjian <bruce@momjian.us> writes:
> So, if users are referencing attndims and the values are not accurate,
> how useful are they?  I was suggesting removal so people would stop
> relying on inaccurate information, or we correct attndims to be
> accurate.  Allowing people to continue relying on inaccurate information
> seems like the worst of all options.

I think removal is not happening.  The documentation for attndims
already says

    Number of dimensions, if the column is an array type; otherwise
    0. (Presently, the number of dimensions of an array is not
    enforced, so any nonzero value effectively means “it's an array”.)

and as far as I know that is completely correct.  (Hence, your gripe
is not that it's not "accurate", rather that it's not "precise".)

There isn't a comparable disclaimer under typndims, but there probably
should be.

            regards, tom lane



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Bruce Momjian
Дата:
On Thu, Dec 26, 2024 at 05:08:36PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > So, if users are referencing attndims and the values are not accurate,
> > how useful are they?  I was suggesting removal so people would stop
> > relying on inaccurate information, or we correct attndims to be
> > accurate.  Allowing people to continue relying on inaccurate information
> > seems like the worst of all options.
> 
> I think removal is not happening.  The documentation for attndims
> already says
> 
>     Number of dimensions, if the column is an array type; otherwise
>     0. (Presently, the number of dimensions of an array is not
>     enforced, so any nonzero value effectively means “it's an array”.)
> 
> and as far as I know that is completely correct.  (Hence, your gripe
> is not that it's not "accurate", rather that it's not "precise".)
> 
> There isn't a comparable disclaimer under typndims, but there probably
> should be.

First, my apologies about the URL;  it should have been:

    https://www.postgresql.org/message-id/ZVwI_ozT8z9MCnIZ@momjian.us

Using the queries in that URL, I see:

    CREATE TABLE test (data integer, data_array integer[5][5]);

    CREATE TABLE test2 (LIKE test);

    CREATE TABLE test3 AS SELECT * FROM test;

    SELECT relname, attndims
    FROM pg_class JOIN pg_attribute ON (pg_attribute.attrelid = pg_class.oid)
    WHERE attname = 'data_array';

     relname | attndims
    ---------+----------
     test    |        2
-->     test2   |        0
-->     test3   |        0

Interestingly, if I dump and restore with:

    $ createdb test2; pg_dump test | sql test2

and run the query again I get:

    SELECT relname, attndims
    FROM pg_class JOIN pg_attribute ON (pg_attribute.attrelid = pg_class.oid)
    WHERE attname = 'data_array';

     relname | attndims
    ---------+----------
     test    |        1
     test2   |        1
     test3   |        1

There is clearly something not ideal about this behavior.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.





Bruce Momjian <bruce@momjian.us> writes:
> Using the queries in that URL, I see:

>     CREATE TABLE test (data integer, data_array integer[5][5]);
>     CREATE TABLE test2 (LIKE test);
>     CREATE TABLE test3 AS SELECT * FROM test;
>     SELECT relname, attndims
>     FROM pg_class JOIN pg_attribute ON (pg_attribute.attrelid = pg_class.oid)
>     WHERE attname = 'data_array';
>      relname | attndims
>     ---------+----------
>      test    |        2
> -->     test2   |        0
> -->     test3   |        0

Yeah, that's not great.  We don't have the ability to extract a
number-of-dimensions from a result column of a SELECT, but we could
at least take care to make attndims be 1 not 0 for an array type.
And CREATE TABLE LIKE can easily do better.  See attached draft.
(We could simplify it a little bit if we decide to store only 1 or 0
in all cases.)

> Interestingly, if I dump and restore with:
>     $ createdb test2; pg_dump test | sql test2
> and run the query again I get:
>      relname | attndims
>     ---------+----------
>      test    |        1
>      test2   |        1
>      test3   |        1

I looked at getting a better result here and decided that it didn't
look very promising.  pg_dump uses format_type() to build the type
name to put in CREATE TABLE, and that doesn't have access to attndims.

            regards, tom lane

diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 23cecd99c9..13fb6acfcf 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -183,7 +183,8 @@ create_ctas_nodata(List *tlist, IntoClause *into)
             col = makeColumnDef(colname,
                                 exprType((Node *) tle->expr),
                                 exprTypmod((Node *) tle->expr),
-                                exprCollation((Node *) tle->expr));
+                                exprCollation((Node *) tle->expr),
+                                -1 /* detect array-ness */ );

             /*
              * It's possible that the column is of a collatable type but the
@@ -492,10 +493,17 @@ intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
         else
             colname = NameStr(attribute->attname);

+        /*
+         * Note: we pass ndims as -1 not attribute->attndims because (a) the
+         * tupledesc we are given may not have accurate attndims, and (b) it
+         * seems best for this path to give results matching
+         * create_ctas_nodata.
+         */
         col = makeColumnDef(colname,
                             attribute->atttypid,
                             attribute->atttypmod,
-                            attribute->attcollation);
+                            attribute->attcollation,
+                            -1 /* detect array-ness */ );

         /*
          * It's possible that the column is of a collatable type but the
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index b13ee2b745..d5ca4ed5cb 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -178,15 +178,18 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
         switch (i)
         {
             case SEQ_COL_LASTVAL:
-                coldef = makeColumnDef("last_value", INT8OID, -1, InvalidOid);
+                coldef = makeColumnDef("last_value", INT8OID, -1,
+                                       InvalidOid, 0);
                 value[i - 1] = Int64GetDatumFast(seqdataform.last_value);
                 break;
             case SEQ_COL_LOG:
-                coldef = makeColumnDef("log_cnt", INT8OID, -1, InvalidOid);
+                coldef = makeColumnDef("log_cnt", INT8OID, -1,
+                                       InvalidOid, 0);
                 value[i - 1] = Int64GetDatum((int64) 0);
                 break;
             case SEQ_COL_CALLED:
-                coldef = makeColumnDef("is_called", BOOLOID, -1, InvalidOid);
+                coldef = makeColumnDef("is_called", BOOLOID, -1,
+                                       InvalidOid, 0);
                 value[i - 1] = BoolGetDatum(false);
                 break;
         }
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d2420a9558..111d96b896 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -2741,7 +2741,9 @@ MergeAttributes(List *columns, const List *supers, char relpersistence,
              * Create new column definition
              */
             newdef = makeColumnDef(attributeName, attribute->atttypid,
-                                   attribute->atttypmod, attribute->attcollation);
+                                   attribute->atttypmod,
+                                   attribute->attcollation,
+                                   attribute->attndims);
             newdef->storage = attribute->attstorage;
             newdef->generated = attribute->attgenerated;
             if (CompressionMethodIsValid(attribute->attcompression))
diff --git a/src/backend/commands/view.c b/src/backend/commands/view.c
index 6f0301555e..cd0919a935 100644
--- a/src/backend/commands/view.c
+++ b/src/backend/commands/view.c
@@ -64,7 +64,8 @@ DefineVirtualRelation(RangeVar *relation, List *tlist, bool replace,
             ColumnDef  *def = makeColumnDef(tle->resname,
                                             exprType((Node *) tle->expr),
                                             exprTypmod((Node *) tle->expr),
-                                            exprCollation((Node *) tle->expr));
+                                            exprCollation((Node *) tle->expr),
+                                            -1 /* detect array-ness */ );

             /*
              * It's possible that the column is of a collatable type but the
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index 007612563c..aad0e78e3d 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -512,15 +512,29 @@ makeTypeNameFromOid(Oid typeOid, int32 typmod)
  *    build a ColumnDef node to represent a simple column definition.
  *
  * Type and collation are specified by OID.
+ * Typmod must be given in numeric form.
+ *
+ * ndims can be positive to select that number of array dimensions,
+ * or 0 if it's not an array, or -1 for this function to detect whether
+ * it's an array type.  (In that case we will declare the column as having a
+ * single array dimension.  This might not accurately reproduce the source of,
+ * say, CREATE TABLE AS; but we don't have the information to do better.)
+ *
  * Other properties are all basic to start with.
  */
 ColumnDef *
-makeColumnDef(const char *colname, Oid typeOid, int32 typmod, Oid collOid)
+makeColumnDef(const char *colname, Oid typeOid, int32 typmod,
+              Oid collOid, int ndims)
 {
     ColumnDef  *n = makeNode(ColumnDef);

     n->colname = pstrdup(colname);
     n->typeName = makeTypeNameFromOid(typeOid, typmod);
+    if (ndims < 0)
+        ndims = OidIsValid(get_element_type(typeOid)) ? 1 : 0;
+    while (ndims-- > 0)
+        n->typeName->arrayBounds = lappend(n->typeName->arrayBounds,
+                                           makeInteger(-1));
     n->inhcount = 0;
     n->is_local = true;
     n->is_not_null = false;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index ca028d2a66..d03bc06f3e 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1198,7 +1198,8 @@ transformTableLikeClause(CreateStmtContext *cxt, TableLikeClause *table_like_cla
          * Create a new column definition
          */
         def = makeColumnDef(NameStr(attribute->attname), attribute->atttypid,
-                            attribute->atttypmod, attribute->attcollation);
+                            attribute->atttypmod, attribute->attcollation,
+                            attribute->attndims);

         /*
          * Add to column list
@@ -1637,7 +1638,8 @@ transformOfType(CreateStmtContext *cxt, TypeName *ofTypename)
             continue;

         n = makeColumnDef(NameStr(attr->attname), attr->atttypid,
-                          attr->atttypmod, attr->attcollation);
+                          attr->atttypmod, attr->attcollation,
+                          attr->attndims);
         n->is_from_type = true;

         cxt->columns = lappend(cxt->columns, n);
diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h
index 5473ce9a28..3747805f69 100644
--- a/src/include/nodes/makefuncs.h
+++ b/src/include/nodes/makefuncs.h
@@ -75,7 +75,8 @@ extern TypeName *makeTypeNameFromNameList(List *names);
 extern TypeName *makeTypeNameFromOid(Oid typeOid, int32 typmod);

 extern ColumnDef *makeColumnDef(const char *colname,
-                                Oid typeOid, int32 typmod, Oid collOid);
+                                Oid typeOid, int32 typmod, Oid collOid,
+                                int ndims);

 extern FuncExpr *makeFuncExpr(Oid funcid, Oid rettype, List *args,
                               Oid funccollid, Oid inputcollid, CoercionForm fformat);

Michael Paquier <michael@paquier.xyz> writes:
> Small question on this one: would it be worth adding a check in
> sanity_check.sql for bumpy values?

Yeah, I didn't bother with a regression test in this draft, but
we should likely have one.  Or was your point something different?

            regards, tom lane



Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Bruce Momjian
Дата:
On Sun, Jan 19, 2025 at 06:47:14PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Using the queries in that URL, I see:
> 
> >     CREATE TABLE test (data integer, data_array integer[5][5]);
> >     CREATE TABLE test2 (LIKE test);
> >     CREATE TABLE test3 AS SELECT * FROM test;
> >     SELECT relname, attndims
> >     FROM pg_class JOIN pg_attribute ON (pg_attribute.attrelid = pg_class.oid)
> >     WHERE attname = 'data_array';
> >      relname | attndims
> >     ---------+----------
> >      test    |        2
> > -->     test2   |        0
> > -->     test3   |        0
> 
> Yeah, that's not great.  We don't have the ability to extract a
> number-of-dimensions from a result column of a SELECT, but we could

I did write a patch in Novemer 2023 to pass the dimension to the layers
that needed it, but it was considered too much code compared to its
value:

    https://www.postgresql.org/message-id/ZVwI_ozT8z9MCnIZ@momjian.us

> at least take care to make attndims be 1 not 0 for an array type.
> And CREATE TABLE LIKE can easily do better.  See attached draft.
> (We could simplify it a little bit if we decide to store only 1 or 0
> in all cases.)
> 
> > Interestingly, if I dump and restore with:
> >     $ createdb test2; pg_dump test | sql test2
> > and run the query again I get:
> >      relname | attndims
> >     ---------+----------
> >      test    |        1
> >      test2   |        1
> >      test3   |        1
> 
> I looked at getting a better result here and decided that it didn't
> look very promising.  pg_dump uses format_type() to build the type
> name to put in CREATE TABLE, and that doesn't have access to attndims.

I ran your patch with my tests and it was now consistent in a zero/non-zero
test:

    CREATE TABLE test (data integer, data_array integer[5][5]);
    
    CREATE TABLE test2 (LIKE test);
    
    CREATE TABLE test3 AS SELECT * FROM test;
    
    SELECT relname, attndims
    FROM pg_class JOIN pg_attribute ON (pg_attribute.attrelid = pg_class.oid)
    WHERE attname = 'data_array';
     relname | attndims
    ---------+----------
     test    |        2
     test2   |        2
     test3   |        1

    $ createdb test2; pg_dump test | sql test2

    test2=>
    SELECT relname, attndims
    FROM pg_class JOIN pg_attribute ON (pg_attribute.attrelid = pg_class.oid)
    WHERE attname = 'data_array';
     relname | attndims
    ---------+----------
     test    |        1
     test2   |        1
     test3   |        1

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.





Bruce Momjian <bruce@momjian.us> writes:
> I did write a patch in Novemer 2023 to pass the dimension to the layers
> that needed it, but it was considered too much code compared to its
> value:
>     https://www.postgresql.org/message-id/ZVwI_ozT8z9MCnIZ@momjian.us

Ah, I'd forgotten that discussion.  It looks like the main differences
between your patch and mine are

(1) you did nothing about getting a nonzero attndims value for views
and CREATE TABLE AS cases.

(2) you found a way to get pg_dump to preserve attndims, which you
then also applied to psql's \d.

I'm not really thrilled with (2): it's messy and incomplete, and
I'm not entirely sure it won't break some uses of atttypname within
pg_dump.  I do think we should try to get to a place where attndims
is guaranteed nonzero for array-typed columns.

I checked the regression database after applying my patch, and see
that this test works:

regression=# select relname, relkind, attname, attndims, typname from pg_attribute a join pg_type t on (a.atttypid =
t.oid)join pg_class c on (a.attrelid = c.oid) where typsubscript = 'array_subscript_handler'::regproc and attndims = 0; 
 relname | relkind | attname | attndims | typname
---------+---------+---------+----------+---------
(0 rows)

but the reverse case not so much:

regression=# select relname, relkind, attname, attndims, typname from pg_attribute a join pg_type t on (a.atttypid =
t.oid)join pg_class c on (a.attrelid = c.oid) where typsubscript != 'array_subscript_handler'::regproc and attndims !=
0;
     relname      | relkind | attname | attndims | typname
------------------+---------+---------+----------+---------
 gin_test_idx     | i       | i       |        1 | int4
 botharrayidx     | i       | i       |        1 | int4
 botharrayidx     | i       | t       |        1 | text
 gin_relopts_test | i       | i       |        1 | int4
(4 rows)

I wonder if we should try to fix the GIN AM to avoid that.
The column being indexed is of an array type in these cases, but the
index entries aren't.  It seems inconsistent that it sets up the index
column's attndims and atttypid this way.

            regards, tom lane



I wrote:
> I wonder if we should try to fix the GIN AM to avoid that.
> The column being indexed is of an array type in these cases, but the
> index entries aren't.  It seems inconsistent that it sets up the index
> column's attndims and atttypid this way.

Ah, I see the problem: it's not GIN's fault, it's that index.c's
ConstructTupleDescriptor is very sloppy about setting attndims
in some places and not others.  Fortunately it's quite cheap to
fix that, since the places where this is missed already have their
hands on the pg_type entry for the index column's type.  See v2
attached (now with a regression test).

            regards, tom lane

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 7377912b41..c2efcb68d2 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -382,6 +382,7 @@ ConstructTupleDescriptor(Relation heapRelation,
             to->atttypid = keyType;
             to->attlen = typeTup->typlen;
             to->atttypmod = exprTypmod(indexkey);
+            to->attndims = IsTrueArrayType(typeTup) ? 1 : 0;
             to->attbyval = typeTup->typbyval;
             to->attalign = typeTup->typalign;
             to->attstorage = typeTup->typstorage;
@@ -467,6 +468,7 @@ ConstructTupleDescriptor(Relation heapRelation,

             to->atttypid = keyType;
             to->atttypmod = -1;
+            to->attndims = IsTrueArrayType(typeTup) ? 1 : 0;
             to->attlen = typeTup->typlen;
             to->attbyval = typeTup->typbyval;
             to->attalign = typeTup->typalign;
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 23cecd99c9..13fb6acfcf 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -183,7 +183,8 @@ create_ctas_nodata(List *tlist, IntoClause *into)
             col = makeColumnDef(colname,
                                 exprType((Node *) tle->expr),
                                 exprTypmod((Node *) tle->expr),
-                                exprCollation((Node *) tle->expr));
+                                exprCollation((Node *) tle->expr),
+                                -1 /* detect array-ness */ );

             /*
              * It's possible that the column is of a collatable type but the
@@ -492,10 +493,17 @@ intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
         else
             colname = NameStr(attribute->attname);

+        /*
+         * Note: we pass ndims as -1 not attribute->attndims because (a) the
+         * tupledesc we are given may not have accurate attndims, and (b) it
+         * seems best for this path to give results matching
+         * create_ctas_nodata.
+         */
         col = makeColumnDef(colname,
                             attribute->atttypid,
                             attribute->atttypmod,
-                            attribute->attcollation);
+                            attribute->attcollation,
+                            -1 /* detect array-ness */ );

         /*
          * It's possible that the column is of a collatable type but the
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index b13ee2b745..d5ca4ed5cb 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -178,15 +178,18 @@ DefineSequence(ParseState *pstate, CreateSeqStmt *seq)
         switch (i)
         {
             case SEQ_COL_LASTVAL:
-                coldef = makeColumnDef("last_value", INT8OID, -1, InvalidOid);
+                coldef = makeColumnDef("last_value", INT8OID, -1,
+                                       InvalidOid, 0);
                 value[i - 1] = Int64GetDatumFast(seqdataform.last_value);
                 break;
             case SEQ_COL_LOG:
-                coldef = makeColumnDef("log_cnt", INT8OID, -1, InvalidOid);
+                coldef = makeColumnDef("log_cnt", INT8OID, -1,
+                                       InvalidOid, 0);
                 value[i - 1] = Int64GetDatum((int64) 0);
                 break;
             case SEQ_COL_CALLED:
-                coldef = makeColumnDef("is_called", BOOLOID, -1, InvalidOid);
+                coldef = makeColumnDef("is_called", BOOLOID, -1,
+                                       InvalidOid, 0);
                 value[i - 1] = BoolGetDatum(false);
                 break;
         }
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d8f0a99ad9..c0b1de326c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -2741,7 +2741,9 @@ MergeAttributes(List *columns, const List *supers, char relpersistence,
              * Create new column definition
              */
             newdef = makeColumnDef(attributeName, attribute->atttypid,
-                                   attribute->atttypmod, attribute->attcollation);
+                                   attribute->atttypmod,
+                                   attribute->attcollation,
+                                   attribute->attndims);
             newdef->storage = attribute->attstorage;
             newdef->generated = attribute->attgenerated;
             if (CompressionMethodIsValid(attribute->attcompression))
diff --git a/src/backend/commands/view.c b/src/backend/commands/view.c
index 6f0301555e..cd0919a935 100644
--- a/src/backend/commands/view.c
+++ b/src/backend/commands/view.c
@@ -64,7 +64,8 @@ DefineVirtualRelation(RangeVar *relation, List *tlist, bool replace,
             ColumnDef  *def = makeColumnDef(tle->resname,
                                             exprType((Node *) tle->expr),
                                             exprTypmod((Node *) tle->expr),
-                                            exprCollation((Node *) tle->expr));
+                                            exprCollation((Node *) tle->expr),
+                                            -1 /* detect array-ness */ );

             /*
              * It's possible that the column is of a collatable type but the
diff --git a/src/backend/nodes/makefuncs.c b/src/backend/nodes/makefuncs.c
index 007612563c..aad0e78e3d 100644
--- a/src/backend/nodes/makefuncs.c
+++ b/src/backend/nodes/makefuncs.c
@@ -512,15 +512,29 @@ makeTypeNameFromOid(Oid typeOid, int32 typmod)
  *    build a ColumnDef node to represent a simple column definition.
  *
  * Type and collation are specified by OID.
+ * Typmod must be given in numeric form.
+ *
+ * ndims can be positive to select that number of array dimensions,
+ * or 0 if it's not an array, or -1 for this function to detect whether
+ * it's an array type.  (In that case we will declare the column as having a
+ * single array dimension.  This might not accurately reproduce the source of,
+ * say, CREATE TABLE AS; but we don't have the information to do better.)
+ *
  * Other properties are all basic to start with.
  */
 ColumnDef *
-makeColumnDef(const char *colname, Oid typeOid, int32 typmod, Oid collOid)
+makeColumnDef(const char *colname, Oid typeOid, int32 typmod,
+              Oid collOid, int ndims)
 {
     ColumnDef  *n = makeNode(ColumnDef);

     n->colname = pstrdup(colname);
     n->typeName = makeTypeNameFromOid(typeOid, typmod);
+    if (ndims < 0)
+        ndims = OidIsValid(get_element_type(typeOid)) ? 1 : 0;
+    while (ndims-- > 0)
+        n->typeName->arrayBounds = lappend(n->typeName->arrayBounds,
+                                           makeInteger(-1));
     n->inhcount = 0;
     n->is_local = true;
     n->is_not_null = false;
diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c
index ca028d2a66..d03bc06f3e 100644
--- a/src/backend/parser/parse_utilcmd.c
+++ b/src/backend/parser/parse_utilcmd.c
@@ -1198,7 +1198,8 @@ transformTableLikeClause(CreateStmtContext *cxt, TableLikeClause *table_like_cla
          * Create a new column definition
          */
         def = makeColumnDef(NameStr(attribute->attname), attribute->atttypid,
-                            attribute->atttypmod, attribute->attcollation);
+                            attribute->atttypmod, attribute->attcollation,
+                            attribute->attndims);

         /*
          * Add to column list
@@ -1637,7 +1638,8 @@ transformOfType(CreateStmtContext *cxt, TypeName *ofTypename)
             continue;

         n = makeColumnDef(NameStr(attr->attname), attr->atttypid,
-                          attr->atttypmod, attr->attcollation);
+                          attr->atttypmod, attr->attcollation,
+                          attr->attndims);
         n->is_from_type = true;

         cxt->columns = lappend(cxt->columns, n);
diff --git a/src/include/nodes/makefuncs.h b/src/include/nodes/makefuncs.h
index 5473ce9a28..3747805f69 100644
--- a/src/include/nodes/makefuncs.h
+++ b/src/include/nodes/makefuncs.h
@@ -75,7 +75,8 @@ extern TypeName *makeTypeNameFromNameList(List *names);
 extern TypeName *makeTypeNameFromOid(Oid typeOid, int32 typmod);

 extern ColumnDef *makeColumnDef(const char *colname,
-                                Oid typeOid, int32 typmod, Oid collOid);
+                                Oid typeOid, int32 typmod, Oid collOid,
+                                int ndims);

 extern FuncExpr *makeFuncExpr(Oid funcid, Oid rettype, List *args,
                               Oid funccollid, Oid inputcollid, CoercionForm fformat);
diff --git a/src/test/regress/expected/create_table_like.out b/src/test/regress/expected/create_table_like.out
index e061389135..798448c82b 100644
--- a/src/test/regress/expected/create_table_like.out
+++ b/src/test/regress/expected/create_table_like.out
@@ -558,8 +558,54 @@ CREATE TABLE ctlt11 (LIKE ctlv1);
 CREATE TABLE ctlt11a (LIKE ctlv1 INCLUDING ALL);
 CREATE TYPE ctlty1 AS (a int, b text);
 CREATE TABLE ctlt12 (LIKE ctlty1);
+/* Check that LIKE propagates attndims correctly */
+CREATE TABLE ctlarr1 (a int, b int[], c int[][]);
+CREATE TABLE ctlt13 (LIKE ctlarr1);
+SELECT attname, atttypid::regtype, attndims
+  FROM pg_attribute
+ WHERE attrelid = 'ctlarr1'::regclass AND attnum > 0
+ORDER BY attnum;
+ attname | atttypid  | attndims
+---------+-----------+----------
+ a       | integer   |        0
+ b       | integer[] |        1
+ c       | integer[] |        2
+(3 rows)
+
+SELECT attname, atttypid::regtype, attndims
+  FROM pg_attribute
+ WHERE attrelid = 'ctlt13'::regclass AND attnum > 0
+ORDER BY attnum;
+ attname | atttypid  | attndims
+---------+-----------+----------
+ a       | integer   |        0
+ b       | integer[] |        1
+ c       | integer[] |        2
+(3 rows)
+
+/*
+ * While we're here, check attndims consistency globally.
+ * Since this runs relatively late in the regression order, this will
+ * catch problems in system views and many types of indexes.
+ */
+SELECT relname, attname, attndims, typname
+  FROM pg_attribute a JOIN pg_type t ON (a.atttypid = t.oid)
+       JOIN pg_class c ON (a.attrelid = c.oid)
+ WHERE typsubscript = 'array_subscript_handler'::regproc AND attndims = 0;
+ relname | attname | attndims | typname
+---------+---------+----------+---------
+(0 rows)
+
+SELECT relname, attname, attndims, typname
+  FROM pg_attribute a JOIN pg_type t ON (a.atttypid = t.oid)
+       JOIN pg_class c ON (a.attrelid = c.oid)
+ WHERE typsubscript != 'array_subscript_handler'::regproc AND attndims != 0;
+ relname | attname | attndims | typname
+---------+---------+----------+---------
+(0 rows)
+
 DROP SEQUENCE ctlseq1;
 DROP TYPE ctlty1;
 DROP VIEW ctlv1;
-DROP TABLE IF EXISTS ctlt4, ctlt10, ctlt11, ctlt11a, ctlt12;
+DROP TABLE IF EXISTS ctlt4, ctlt10, ctlt11, ctlt11a, ctlt12, ctlarr1, ctlt13;
 NOTICE:  table "ctlt10" does not exist, skipping
diff --git a/src/test/regress/sql/create_table_like.sql b/src/test/regress/sql/create_table_like.sql
index a41f8b83d7..6432ebb0e5 100644
--- a/src/test/regress/sql/create_table_like.sql
+++ b/src/test/regress/sql/create_table_like.sql
@@ -221,7 +221,37 @@ CREATE TABLE ctlt11a (LIKE ctlv1 INCLUDING ALL);
 CREATE TYPE ctlty1 AS (a int, b text);
 CREATE TABLE ctlt12 (LIKE ctlty1);

+
+/* Check that LIKE propagates attndims correctly */
+
+CREATE TABLE ctlarr1 (a int, b int[], c int[][]);
+CREATE TABLE ctlt13 (LIKE ctlarr1);
+SELECT attname, atttypid::regtype, attndims
+  FROM pg_attribute
+ WHERE attrelid = 'ctlarr1'::regclass AND attnum > 0
+ORDER BY attnum;
+SELECT attname, atttypid::regtype, attndims
+  FROM pg_attribute
+ WHERE attrelid = 'ctlt13'::regclass AND attnum > 0
+ORDER BY attnum;
+
+/*
+ * While we're here, check attndims consistency globally.
+ * Since this runs relatively late in the regression order, this will
+ * catch problems in system views and many types of indexes.
+ */
+SELECT relname, attname, attndims, typname
+  FROM pg_attribute a JOIN pg_type t ON (a.atttypid = t.oid)
+       JOIN pg_class c ON (a.attrelid = c.oid)
+ WHERE typsubscript = 'array_subscript_handler'::regproc AND attndims = 0;
+
+SELECT relname, attname, attndims, typname
+  FROM pg_attribute a JOIN pg_type t ON (a.atttypid = t.oid)
+       JOIN pg_class c ON (a.attrelid = c.oid)
+ WHERE typsubscript != 'array_subscript_handler'::regproc AND attndims != 0;
+
+
 DROP SEQUENCE ctlseq1;
 DROP TYPE ctlty1;
 DROP VIEW ctlv1;
-DROP TABLE IF EXISTS ctlt4, ctlt10, ctlt11, ctlt11a, ctlt12;
+DROP TABLE IF EXISTS ctlt4, ctlt10, ctlt11, ctlt11a, ctlt12, ctlarr1, ctlt13;

Re: attndims, typndims still not enforced, but make the value within a sane threshold

От
Bruce Momjian
Дата:
On Tue, Jan 21, 2025 at 03:23:31PM -0500, Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > I did write a patch in Novemer 2023 to pass the dimension to the layers
> > that needed it, but it was considered too much code compared to its
> > value:
> >     https://www.postgresql.org/message-id/ZVwI_ozT8z9MCnIZ@momjian.us
> 
> Ah, I'd forgotten that discussion.  It looks like the main differences
> between your patch and mine are
> 
> (1) you did nothing about getting a nonzero attndims value for views
> and CREATE TABLE AS cases.
> 
> (2) you found a way to get pg_dump to preserve attndims, which you
> then also applied to psql's \d.

Yes, I was more focused on perserving the value, rather than computing
zero/non-zero from the data type.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.