Обсуждение: PostgreSQL domains and NOT NULL constraint

Поиск
Список
Период
Сортировка

PostgreSQL domains and NOT NULL constraint

От
Erki Eessaar
Дата:
Hello

PostgreSQL's CREATE DOMAIN documentation (section Notes) describes a way how one can add NULL's to a column that has a domain with the NOT NULL constraint.

To me it seems very strange and amounts to a bug because it defeats the purpose of domains (to be a reusable assets) and constraints (to avoid any bypassing of these).

Oracle 23c added the support of domains (https://docs.oracle.com/en/database/oracle/oracle-database/23/sqlrf/create-domain.html). I tested the same scenario both in PostgreSQL and Oracle (https://www.oracle.com/database/free/) and found out that in these situations Oracle does not allow NULL's to be added to the column. I do not know as to whether the behaviour that is implemented in PostgreSQL is specified by the standard. However, if it is not the case, then how it could be that Oracle can but PostgreSQL cannot.

Best regards
Erki Eessaar

The scenario that I tested both in PostgreSQL (16) and Oracle (23c).
***********************************
/*PostgreSQL 16*/

CREATE DOMAIN d_name VARCHAR(50) NOT NULL;

CREATE TABLE Product_state_type (product_state_type_code SMALLINT NOT NULL,
name d_name,
CONSTRAINT pk_product_state_type PRIMARY KEY (product_state_type_code),
CONSTRAINT ak_product_state_type_name UNIQUE (name));

CREATE TABLE Product (product_code INTEGER NOT NULL,
name d_name,
product_state_type_code SMALLINT NOT NULL,
CONSTRAINT pk_product PRIMARY KEY (product_code),
CONSTRAINT fk_product_product_state_type FOREIGN KEY (product_state_type_code)
REFERENCES Product_state_type(product_state_type_code) ON UPDATE CASCADE);

INSERT INTO Product_state_type (product_state_type_code, name)
VALUES (1, (SELECT name FROM Product_state_type WHERE FALSE));
/*Insertion succeeds, name is NULL!*/

INSERT INTO Product (product_code, name, product_state_type_code)
SELECT 1 AS product_code, Product.name, 1 AS product_state_type_code
FROM Product_state_type LEFT JOIN Product USING (product_state_type_code);
/*Insertion succeeds, name is NULL!*/

/*Oracle 23c*/

CREATE DOMAIN d_name AS VARCHAR2(50) NOT NULL;

CREATE TABLE Product_state_type (product_state_type_code NUMBER(4) NOT NULL,
name d_name,
CONSTRAINT pk_product_state_type PRIMARY KEY (product_state_type_code),
CONSTRAINT ak_product_state_type_name UNIQUE (name));

CREATE TABLE Product (product_code NUMBER(8) NOT NULL,
name d_name,
product_state_type_code NUMBER(4) NOT NULL,
CONSTRAINT pk_product PRIMARY KEY (product_code),
CONSTRAINT fk_product_product_state_type FOREIGN KEY (product_state_type_code)
REFERENCES Product_state_type(product_state_type_code));


INSERT INTO Product_state_type (product_state_type_code, name)
VALUES (1, (SELECT name FROM Product_state_type WHERE FALSE));
/*Fails.
Error report -
SQL Error: ORA-01400: cannot insert NULL into
("SYSTEM"."PRODUCT_STATE_TYPE"."NAME")
Help: https://docs.oracle.com/error-help/db/ora-01400/
01400. 00000 -  "cannot insert NULL into (%s)"
*Cause:    An attempt was made to insert NULL into previously listed objects.
*Action:   These objects cannot accept NULL values.*/

INSERT INTO Product_state_type (product_state_type_code, name)
VALUES (1, 'Active');

INSERT INTO Product (product_code, name, product_state_type_code)
SELECT 1 AS product_code, Product.name, 1 AS product_state_type_code
FROM Product_state_type LEFT JOIN Product USING (product_state_type_code);
/*Fails.
SQL Error: ORA-01400: cannot insert NULL into
("SYSTEM"."PRODUCT"."NAME")
Help: https://docs.oracle.com/error-help/db/ora-01400/
01400. 00000 -  "cannot insert NULL into (%s)"
*Cause:    An attempt was made to insert NULL into previously listed objects.
*Action:   These objects cannot accept NULL values.*/

Re: PostgreSQL domains and NOT NULL constraint

От
Tom Lane
Дата:
Erki Eessaar <erki.eessaar@taltech.ee> writes:
> PostgreSQL's CREATE DOMAIN documentation (section Notes) describes a way how one can add NULL's to a column that has
adomain with the NOT NULL constraint. 
> https://www.postgresql.org/docs/current/sql-createdomain.html
> To me it seems very strange and amounts to a bug because it defeats the purpose of domains (to be a reusable assets)
andconstraints (to avoid any bypassing of these). 

I doubt we'd consider doing anything about that.  The whole business
of domains with NOT NULL constraints is arguably a defect of the SQL
standard, because there are multiple ways to produce a value that
is NULL and yet must be considered to be of the domain type.
The subselect-with-no-output case that you show isn't even the most
common one; I'd say that outer joins where there are domain columns
on the nullable side are the biggest problem.

There's been some discussion of treating the output of such a join,
subselect, etc as being of the domain's base type not the domain
proper.  That'd solve this particular issue since then we'd decide
we have to cast the base type back up to the domain type (and hence
check its constraints) before inserting the row.  But that choice
just moves the surprise factor somewhere else, in that queries that
used to produce one data type now produce another one.  There are
applications that this would break.  Moreover, I do not think there's
any justification for it in the SQL spec.

Our general opinion about this is what is stated in the NOTES
section of our CREATE DOMAIN reference page [1]:

    Best practice therefore is to design a domain's constraints so that a
    null value is allowed, and then to apply column NOT NULL constraints
    to columns of the domain type as needed, rather than directly to the
    domain type.

            regards, tom lane

[1] https://www.postgresql.org/docs/current/sql-createdomain.html



Re: PostgreSQL domains and NOT NULL constraint

От
Vik Fearing
Дата:
On 10/12/23 15:54, Tom Lane wrote:
> Erki Eessaar <erki.eessaar@taltech.ee> writes:
>> PostgreSQL's CREATE DOMAIN documentation (section Notes) describes a way how one can add NULL's to a column that has
adomain with the NOT NULL constraint.
 
>> https://www.postgresql.org/docs/current/sql-createdomain.html
>> To me it seems very strange and amounts to a bug because it defeats the purpose of domains (to be a reusable assets)
andconstraints (to avoid any bypassing of these).
 
> 
> I doubt we'd consider doing anything about that.  The whole business
> of domains with NOT NULL constraints is arguably a defect of the SQL
> standard, because there are multiple ways to produce a value that
> is NULL and yet must be considered to be of the domain type.
> The subselect-with-no-output case that you show isn't even the most
> common one; I'd say that outer joins where there are domain columns
> on the nullable side are the biggest problem.
> 
> There's been some discussion of treating the output of such a join,
> subselect, etc as being of the domain's base type not the domain
> proper.  That'd solve this particular issue since then we'd decide
> we have to cast the base type back up to the domain type (and hence
> check its constraints) before inserting the row.  But that choice
> just moves the surprise factor somewhere else, in that queries that
> used to produce one data type now produce another one.  There are
> applications that this would break.  Moreover, I do not think there's
> any justification for it in the SQL spec.


I do not believe this is a defect of the SQL standard at all. 
SQL:2023-2 Section 4.14 "Domains" clearly states "The purpose of a 
domain is to constrain the set of valid values that can be stored in a 
column of a base table by various operations."

That seems very clear to me that *storing* a value in a base table must 
respect the domain's constraints, even if *operations* on those values 
might not respect all of the domain's constraints.

Whether or not it is practical to implement that is a different story, 
but allowing the null value to be stored in a column of a base table 
whose domain specifies NOT NULL is frankly a bug.
-- 
Vik Fearing




Re: PostgreSQL domains and NOT NULL constraint

От
Tom Lane
Дата:
Vik Fearing <vik@postgresfriends.org> writes:
> On 10/12/23 15:54, Tom Lane wrote:
>> There's been some discussion of treating the output of such a join,
>> subselect, etc as being of the domain's base type not the domain
>> proper.  That'd solve this particular issue since then we'd decide
>> we have to cast the base type back up to the domain type (and hence
>> check its constraints) before inserting the row.  But that choice
>> just moves the surprise factor somewhere else, in that queries that
>> used to produce one data type now produce another one.  There are
>> applications that this would break.  Moreover, I do not think there's
>> any justification for it in the SQL spec.

> I do not believe this is a defect of the SQL standard at all.
> SQL:2023-2 Section 4.14 "Domains" clearly states "The purpose of a
> domain is to constrain the set of valid values that can be stored in a
> column of a base table by various operations."

So I wonder what is the standard's interpretation of

regression=# create domain dpos as integer not null check (value > 0);
CREATE DOMAIN
regression=# create table t1 (x int, d dpos);
CREATE TABLE
regression=# create view v1 as select ty.d from t1 tx left join t1 ty using (x);
CREATE VIEW
regression=# \d+ v1
                            View "public.v1"
 Column | Type | Collation | Nullable | Default | Storage | Description
--------+------+-----------+----------+---------+---------+-------------
 d      | dpos |           |          |         | plain   |
View definition:
 SELECT ty.d
   FROM t1 tx
     LEFT JOIN t1 ty USING (x);

If we are incorrect in ascribing the type "dpos" to v1.d, where
in the spec contradicts that?  (Or in other words, 4.14 might lay
out some goals for the feature, but that's just empty words if
it's not supported by accurate details in other places.)

            regards, tom lane



Re: PostgreSQL domains and NOT NULL constraint

От
Vik Fearing
Дата:
On 10/13/23 02:44, Tom Lane wrote:
> Vik Fearing <vik@postgresfriends.org> writes:
>> On 10/12/23 15:54, Tom Lane wrote:
>>> There's been some discussion of treating the output of such a join,
>>> subselect, etc as being of the domain's base type not the domain
>>> proper.  That'd solve this particular issue since then we'd decide
>>> we have to cast the base type back up to the domain type (and hence
>>> check its constraints) before inserting the row.  But that choice
>>> just moves the surprise factor somewhere else, in that queries that
>>> used to produce one data type now produce another one.  There are
>>> applications that this would break.  Moreover, I do not think there's
>>> any justification for it in the SQL spec.
> 
>> I do not believe this is a defect of the SQL standard at all.
>> SQL:2023-2 Section 4.14 "Domains" clearly states "The purpose of a
>> domain is to constrain the set of valid values that can be stored in a
>> column of a base table by various operations."
> 
> So I wonder what is the standard's interpretation of
> 
> regression=# create domain dpos as integer not null check (value > 0);
> CREATE DOMAIN
> regression=# create table t1 (x int, d dpos);
> CREATE TABLE
> regression=# create view v1 as select ty.d from t1 tx left join t1 ty using (x);
> CREATE VIEW
> regression=# \d+ v1
>                              View "public.v1"
>   Column | Type | Collation | Nullable | Default | Storage | Description
> --------+------+-----------+----------+---------+---------+-------------
>   d      | dpos |           |          |         | plain   |
> View definition:
>   SELECT ty.d
>     FROM t1 tx
>       LEFT JOIN t1 ty USING (x);
> 
> If we are incorrect in ascribing the type "dpos" to v1.d, where
> in the spec contradicts that?  (Or in other words, 4.14 might lay
> out some goals for the feature, but that's just empty words if
> it's not supported by accurate details in other places.)
Objection, Your Honor: Relevance.

Regardless of what the spec may or may not say about v1.d, it still 
remains that nulls should not be allowed in a *base table* if the domain 
says nulls are not allowed.  Not mentioned in this thread but the 
constraints are also applied when CASTing to the domain.

Now, to answer your straw man, this might be helpful:

SQL:2023-2 Section 11.4 <column definition> Syntax Rule 9, "If the 
descriptor of D includes any domain constraint descriptors, then T shall 
be a persistent base table.".  Your v1 is not that and therefore 
arguably illegal.

As you know, I am more than happy to (try to) amend the spec where 
needed, but Erki's complaint of a null value being allowed in a base 
table is clearly a bug in our implementation regardless of what we do 
with views.
-- 
Vik Fearing




Re: PostgreSQL domains and NOT NULL constraint

От
Tom Lane
Дата:
Vik Fearing <vik@postgresfriends.org> writes:
> Regardless of what the spec may or may not say about v1.d, it still 
> remains that nulls should not be allowed in a *base table* if the domain 
> says nulls are not allowed.  Not mentioned in this thread but the 
> constraints are also applied when CASTing to the domain.

Hmph.  The really basic problem here, I think, is that the spec
wants to claim that a domain is a data type, but then it backs
off and limits where the domain's constraints need to hold.
That's fundamentally inconsistent.  It's like claiming that
'foobarbaz' is a valid value of type numeric as long as it's
only in flight within a query and you haven't tried to store it
into a table.

Practical problems with this include:

* If a function declares its argument as being of a domain type,
can it expect that the passed value obeys the constraints?

* If a function declares its result as being of a domain type,
is it required to return a result that obeys the constraints?
(This has particular force for RETURNS NULL ON NULL INPUT
functions, for which we just automatically return NULL given
a NULL input without any consideration of whether the result
type nominally prohibits that.)

* If a plpgsql function has a variable that is declared to be of
domain type, do we enforce the domain's constraints when assigning?

* If a composite type has a column of a domain type, do we enforce
the domain's constraints when assigning or casting to that?

AFAICS, the spec's position leaves all of these as judgment calls,
or else you might claim that none of the above cases are even allowed
to be declared per spec.  I don't find either of those satisfactory,
so I reiterate my position that the committee hasn't thought this
through.

> As you know, I am more than happy to (try to) amend the spec where 
> needed, but Erki's complaint of a null value being allowed in a base 
> table is clearly a bug in our implementation regardless of what we do 
> with views.

I agree it's not a good behavior, but I still say it's traceable
to schizophenia in the spec.  If the result of a sub-select is
nominally of a domain type, we should not have to recheck the
domain constraints in order to assign it to a domain-typed target.
If it's not nominally of a domain type, please cite chapter and
verse that says it isn't.

            regards, tom lane



Re: PostgreSQL domains and NOT NULL constraint

От
Erki Eessaar
Дата:
Hello

Equaling a domain with a type is really confusing because why, for instance, in this case the following is possible without defining any additional operators.

CREATE DOMAIN d_name VARCHAR(50) NOT NULL;
CREATE DOMAIN d_description VARCHAR(1000) NOT NULL;
CREATE TABLE x(name d_name, description d_description);
SELECT *
FROM x
WHERE name=description;

Isn't it so that domains are not types and for this reason there are separate CREATE DOMAIN and CREATE TYPE statements?!

In my opinion the Notes section of CREATE DOMAIN documentation should offer better examples. The two examples that I provided in my demonstration seemed very far fetched and artificial. Frankly, I have difficulties in imagining why someone would like to write statements like that in a production environment and how the proper enforcement of NOT NULL constraints of domains could break things. 

Lets say I have a column that I have declared mandatory by using a domain, but somehow I have added NULLs to the column, and if it is not possible any more, then things break down.

If I want to permit NULLs, then ALTER DOMAIN d DROP NOT NULL; will fix it with one stroke. If I do not want to permit NULLs but I have registered NULLs, then this is a data quality issue that has to be addressed.

Currently there is a feature (NOT NULL of domain) that the documentation explicitly suggests not to use. Isn't it in this case better to remove this feature completely?! If this would break something, then it would mean that systems actually rely on this constraint.

Best regards
Erki Eessaar


From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Friday, October 13, 2023 08:37
To: Vik Fearing <vik@postgresfriends.org>
Cc: Erki Eessaar <erki.eessaar@taltech.ee>; pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
Subject: Re: PostgreSQL domains and NOT NULL constraint
 
Vik Fearing <vik@postgresfriends.org> writes:
> Regardless of what the spec may or may not say about v1.d, it still
> remains that nulls should not be allowed in a *base table* if the domain
> says nulls are not allowed.  Not mentioned in this thread but the
> constraints are also applied when CASTing to the domain.

Hmph.  The really basic problem here, I think, is that the spec
wants to claim that a domain is a data type, but then it backs
off and limits where the domain's constraints need to hold.
That's fundamentally inconsistent.  It's like claiming that
'foobarbaz' is a valid value of type numeric as long as it's
only in flight within a query and you haven't tried to store it
into a table.

Practical problems with this include:

* If a function declares its argument as being of a domain type,
can it expect that the passed value obeys the constraints?

* If a function declares its result as being of a domain type,
is it required to return a result that obeys the constraints?
(This has particular force for RETURNS NULL ON NULL INPUT
functions, for which we just automatically return NULL given
a NULL input without any consideration of whether the result
type nominally prohibits that.)

* If a plpgsql function has a variable that is declared to be of
domain type, do we enforce the domain's constraints when assigning?

* If a composite type has a column of a domain type, do we enforce
the domain's constraints when assigning or casting to that?

AFAICS, the spec's position leaves all of these as judgment calls,
or else you might claim that none of the above cases are even allowed
to be declared per spec.  I don't find either of those satisfactory,
so I reiterate my position that the committee hasn't thought this
through.

> As you know, I am more than happy to (try to) amend the spec where
> needed, but Erki's complaint of a null value being allowed in a base
> table is clearly a bug in our implementation regardless of what we do
> with views.

I agree it's not a good behavior, but I still say it's traceable
to schizophenia in the spec.  If the result of a sub-select is
nominally of a domain type, we should not have to recheck the
domain constraints in order to assign it to a domain-typed target.
If it's not nominally of a domain type, please cite chapter and
verse that says it isn't.

                        regards, tom lane

Re: PostgreSQL domains and NOT NULL constraint

От
Vik Fearing
Дата:
On 10/13/23 06:37, Tom Lane wrote:
> Vik Fearing <vik@postgresfriends.org> writes:
>> Regardless of what the spec may or may not say about v1.d, it still
>> remains that nulls should not be allowed in a *base table* if the domain
>> says nulls are not allowed.  Not mentioned in this thread but the
>> constraints are also applied when CASTing to the domain.
> 
> Hmph.  The really basic problem here, I think, is that the spec
> wants to claim that a domain is a data type, but then it backs
> off and limits where the domain's constraints need to hold.


I don't think that is an accurate depiction of domains.

First of all, I am not seeing where it says that a domain is a data 
type.  It allows domains to be used in some places where a data type is 
used, but that is not equivalent to a domain /being/ a data type.

Section 4.14 says, "A domain is a set of permissible values." and then 
goes on to say that that is a combination of a predefined type and zero 
or more search conditions.  It can also have a default value, but it 
does not seem relevant to talk about that in this discussion.

Section 4.25.4, "Domain constraints" has this to say (emphasis mine):

- A domain constraint is satisfied by SQL-data *if and only if*, for 
every *table* T that has a column named C based on that domain, the 
applicable <search condition> recorded in the appropriate domain 
constraint usage evaluates to True or Unknown.

- A domain constraint is satisfied by the result of a <cast 
specification> if and only if the specified template <search condition>, 
with each occurrence of the <general value specification> VALUE replaced 
by that result, evaluates to True or Unknown.

This tells me that the constraints should only be checked at those two 
points.

Secondly, why are you so concerned about outer join nulls here and not 
for any other column marked NOT NULL?


> That's fundamentally inconsistent.  It's like claiming that
> 'foobarbaz' is a valid value of type numeric as long as it's
> only in flight within a query and you haven't tried to store it
> into a table.


It's like claiming that null is a valid value of type numeric as long as 
it's only in flight within a query and you haven't tried to store it 
into a table with that column marked NOT NULL.


> Practical problems with this include:
> 
> * If a function declares its argument as being of a domain type,
> can it expect that the passed value obeys the constraints?
> 
> * If a function declares its result as being of a domain type,
> is it required to return a result that obeys the constraints?
> (This has particular force for RETURNS NULL ON NULL INPUT
> functions, for which we just automatically return NULL given
> a NULL input without any consideration of whether the result
> type nominally prohibits that.)
> 
> * If a plpgsql function has a variable that is declared to be of
> domain type, do we enforce the domain's constraints when assigning?


Routines are not allowed to have domains in their parameters or result 
types.

I am all for PostgreSQL expanding the spec wherever we can, but in the 
above cases we have to define things ourselves.


> * If a composite type has a column of a domain type, do we enforce
> the domain's constraints when assigning or casting to that?


I don't see that a composite type is able to have a member of a domain. 
As for what PostgreSQL should do in this case, my opinion is "yes".


> AFAICS, the spec's position leaves all of these as judgment calls,
> or else you might claim that none of the above cases are even allowed
> to be declared per spec.  I don't find either of those satisfactory,
> so I reiterate my position that the committee hasn't thought this
> through.


My claim is indeed that these cases are not allowed per-spec and 
therefore the spec doesn't *need* to think about them.  We do.


>> As you know, I am more than happy to (try to) amend the spec where
>> needed, but Erki's complaint of a null value being allowed in a base
>> table is clearly a bug in our implementation regardless of what we do
>> with views.
> 
> I agree it's not a good behavior, but I still say it's traceable
> to schizophenia in the spec.  If the result of a sub-select is
> nominally of a domain type, we should not have to recheck the
> domain constraints in order to assign it to a domain-typed target.


Well, yes, we should.

Allowing a null to be stored in a column where the user has specified 
NOT NULL, no matter how the user did that, is unacceptable and I am 
frankly surprised that you are defending it.


> If it's not nominally of a domain type, please cite chapter and
> verse that says it isn't.

I don't see anything for or against this, I just see that the domain 
constraints are only checked on storage or casting.

And therefore, I think with these definitions:

CREATE DOMAIN dom AS INTEGER CHECK (VALUE >= 0);
CREATE TABLE t (d dom);
INSERT INTO t (d) VALUES (1);

this should be valid according to the spec:

SELECT -d FROM t;

and this should error:

SELECT CAST(-d AS dom) FROM t;
-- 
Vik Fearing




Re: PostgreSQL domains and NOT NULL constraint

От
Erki Eessaar
Дата:
>I doubt we'd consider doing anything about that.  
>The whole business of domains with NOT NULL constraints 
>is arguably a defect of the SQL standard, because 
>there are multiple ways to produce a value that
>is NULL and yet must be considered to be of the domain type.

In my opinion it is inconsistent and illogical if a type sometimes contains a value and sometimes not.

CREATE DOMAIN d_int INTEGER NOT NULL;

All the following statements fail (and correctly so in my opinion).

SELECT (NULL)::d_int;
/*ERROR:  domain d_int does not allow null values*/

SELECT Cast(NULL AS d_int);
/*ERROR:  domain d_int does not allow null values*/

WITH val (v) AS (VALUES (1), (NULL))
SELECT Cast(v AS d_int) AS v
FROM Val;
/*ERROR:  domain d_int does not allow null values*/

In my opinion the confusion and related problems arise from the widespread practice of sometimes treating a domain as a type (which it is not) and sometimes treating NULL as  a value (which it is not).

Best regards
Erki Eessaar

Re: PostgreSQL domains and NOT NULL constraint

От
Vik Fearing
Дата:
On 10/13/23 06:37, Tom Lane wrote:
> If it's not nominally of a domain type, please cite chapter and
> verse that says it isn't.

Okay, I found it.


SQL:2023-2 6.7 <column reference>

Syntax Rules

5) Let C be the column that is referenced by CR. The declared type of CR is
    Case:
    a) If the column descriptor of C includes a data type, then that 
data type.

    b) Otherwise, the data type identified in the domain descriptor that 
describes the domain that is identified by the <domain name> that is 
included in the column descriptor of C.


So the domain should not be carried into a query expression (including 
views) and the data type should be the one specified in the domain.
-- 
Vik Fearing




Re: PostgreSQL domains and NOT NULL constraint

От
Tom Lane
Дата:
Vik Fearing <vik@postgresfriends.org> writes:
> On 10/13/23 06:37, Tom Lane wrote:
>> Hmph.  The really basic problem here, I think, is that the spec
>> wants to claim that a domain is a data type, but then it backs
>> off and limits where the domain's constraints need to hold.

> I don't think that is an accurate depiction of domains.
> First of all, I am not seeing where it says that a domain is a data 
> type.  It allows domains to be used in some places where a data type is 
> used, but that is not equivalent to a domain /being/ a data type.

Hmm, you are right.  This is something I'd never paid attention to
before, but they do seem to exclude domains from being the declared
type of any expression.  Most notably, not even a CAST to a domain
type produces the domain type.  Per SQL:2021 6.13 <cast specification>
syntax rules:

    1) Case:
        a) If a <domain name> is specified, then let TD be the data
        type of the specified domain.

        b) If a <data type> is specified, then let TD be the data type
        identified by <data type>. <data type> shall not contain a
        <collate clause>.

    2) The declared type of the result of the <cast specification> is TD.

Even more amusingly for our current purposes, CAST does not enforce
NOT NULL.  <cast specification> general rule 2:

    2) Case:
        a) If the <cast operand> specifies NULL, then the result of CS
        is the null value and no further General Rules of this
        Subclause are applied.

        b) If the <cast operand> specifies an <empty specification>,
        then the result of CS is an empty collection of declared type
        TD and no further General Rules of this Subclause are applied.

        c) If SV is the null value, then the result of CS is the null
        value and no further General Rules of this Subclause are
        applied.

So for a null value the spec never reaches GR 23 that says to apply
the domain's constraints.

This is already a sufficient intellectual muddle that I'm not sure
we want to follow it slavishly.  If not-null can be ignored here,
why not elsewhere?

But anyway, yeah, the spec's notion of a domain bears only passing
resemblance to what we've actually implemented.  I'm not really sure
that we want to switch, because AFAICS the spec's model doesn't
include any of these things:

* Domains over other domains

* Domains over arrays, composite types, etc

* Functions accepting or returning domain types

If we were to try to do something closer to what the spec has in mind,
how would we do it without ripping out a ton of functionality that
people have requested and come to depend on?

> Section 4.25.4, "Domain constraints" has this to say (emphasis mine):
>
> - A domain constraint is satisfied by SQL-data *if and only if*, for 
> every *table* T that has a column named C based on that domain, the 
> applicable <search condition> recorded in the appropriate domain 
> constraint usage evaluates to True or Unknown.

I think that isn't particularly relevant, because I believe that by
SQL-data they mean the static contents of a database, so of course
only table contents matter.  What we are concerned about is dynamic
behavior within queries and functions.

> Secondly, why are you so concerned about outer join nulls here and not 
> for any other column marked NOT NULL?

Primarily because that's an easy way for a column that was marked
NOT NULL to read out as NULL.

>> That's fundamentally inconsistent.  It's like claiming that
>> 'foobarbaz' is a valid value of type numeric as long as it's
>> only in flight within a query and you haven't tried to store it
>> into a table.

> It's like claiming that null is a valid value of type numeric as long as 
> it's only in flight within a query and you haven't tried to store it 
> into a table with that column marked NOT NULL.

And?  NULL *is* a valid value of type numeric, as well as all other
base types.

> Allowing a null to be stored in a column where the user has specified 
> NOT NULL, no matter how the user did that, is unacceptable and I am 
> frankly surprised that you are defending it.

What I'm trying to hold onto is the notion that a domain can
meaningfully be considered to be a data type (that is, that a value in
flight can be considered to be of a domain type).  We've been building
the system on that assumption for over twenty years now, and I think
it's pretty deeply ingrained.  I don't understand the consequences
of abandoning it, and I'm not convinced that the spec's model is
sufficiently intellectually rigorous that we can just say "oh, we'll
follow the spec instead of what we've been doing, and it'll be fine".

As a trivial example: our implementation assumes that enforcing a
domain's constraints is to be done by casting the base type value
to the domain type.  Per the above reading of <6.13>, this should
fail to reject nulls, so we'd have to understand and implement
checking of domain constraints in some other way.

Given the exception the spec makes for CAST, I wonder if we shouldn't
just say "NULL is a valid value of every domain type, as well as every
base type.  If you don't like it, too bad; write a separate NOT NULL
constraint for your table column."

            regards, tom lane



Re: PostgreSQL domains and NOT NULL constraint

От
Erki Eessaar
Дата:
Hello

Similarly, PostgreSQL does not enforce CHECK constraints of domains that try to enforce NOT NULL in the same situations where it does not enforce NOT NULL constraints - see example in the end.

Thus, in my base tables can be rows that violate domain NOT NULL and CHECK constraints. For me, it is not a "feature", it is a bug.

By the way, my small applications use domain NOT NULL constraints. This was the reason why I asked are there any other examples in addition to those that I provided that allow NULL's to NOT NULL columns.

Best regards
Erki Eessaar

****************************
DROP TABLE IF EXISTS Product;
DROP TABLE IF EXISTS Product_state_type;
DROP DOMAIN IF EXISTS d_name;

CREATE DOMAIN d_name VARCHAR(50)
CONSTRAINT chk_d_name CHECK (VALUE IS NOT NULL);

CREATE TABLE Product_state_type (product_state_type_code SMALLINT NOT NULL,
name d_name,
CONSTRAINT pk_product_state_type PRIMARY KEY (product_state_type_code),
CONSTRAINT ak_product_state_type_name UNIQUE (name));

CREATE TABLE Product (product_code INTEGER NOT NULL,
name d_name,
product_state_type_code SMALLINT NOT NULL,
CONSTRAINT pk_product PRIMARY KEY (product_code),
CONSTRAINT fk_product_product_state_type FOREIGN KEY (product_state_type_code)
REFERENCES Product_state_type(product_state_type_code) ON UPDATE CASCADE);

INSERT INTO Product_state_type (product_state_type_code, name)
VALUES (1, (SELECT name FROM Product_state_type WHERE FALSE));
/*Insertion succeeds, name is NULL!*/

INSERT INTO Product (product_code, name, product_state_type_code)
SELECT 1 AS product_code, Product.name, 1 AS product_state_type_code
FROM Product_state_type LEFT JOIN Product USING (product_state_type_code);
/*Insertion succeeds, name is NULL!*/

DROP TABLE IF EXISTS Product;
DROP TABLE IF EXISTS Product_state_type;
DROP DOMAIN IF EXISTS d_name;

CREATE DOMAIN d_name VARCHAR(50)
CONSTRAINT chk_d_name CHECK (coalesce(VALUE,'')<>'');

CREATE TABLE Product_state_type (product_state_type_code SMALLINT NOT NULL,
name d_name,
CONSTRAINT pk_product_state_type PRIMARY KEY (product_state_type_code),
CONSTRAINT ak_product_state_type_name UNIQUE (name));

CREATE TABLE Product (product_code INTEGER NOT NULL,
name d_name,
product_state_type_code SMALLINT NOT NULL,
CONSTRAINT pk_product PRIMARY KEY (product_code),
CONSTRAINT fk_product_product_state_type FOREIGN KEY (product_state_type_code)
REFERENCES Product_state_type(product_state_type_code) ON UPDATE CASCADE);

INSERT INTO Product_state_type (product_state_type_code, name)
VALUES (1, (SELECT name FROM Product_state_type WHERE FALSE));
/*Insertion succeeds, name is NULL!*/

INSERT INTO Product (product_code, name, product_state_type_code)
SELECT 1 AS product_code, Product.name, 1 AS product_state_type_code
FROM Product_state_type LEFT JOIN Product USING (product_state_type_code);
/*Insertion succeeds, name is NULL!*/


From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Saturday, October 14, 2023 19:09
To: Vik Fearing <vik@postgresfriends.org>
Cc: Erki Eessaar <erki.eessaar@taltech.ee>; pgsql-hackers@lists.postgresql.org <pgsql-hackers@lists.postgresql.org>
Subject: Re: PostgreSQL domains and NOT NULL constraint
 
Vik Fearing <vik@postgresfriends.org> writes:
> On 10/13/23 06:37, Tom Lane wrote:
>> Hmph.  The really basic problem here, I think, is that the spec
>> wants to claim that a domain is a data type, but then it backs
>> off and limits where the domain's constraints need to hold.

> I don't think that is an accurate depiction of domains.
> First of all, I am not seeing where it says that a domain is a data
> type.  It allows domains to be used in some places where a data type is
> used, but that is not equivalent to a domain /being/ a data type.

Hmm, you are right.  This is something I'd never paid attention to
before, but they do seem to exclude domains from being the declared
type of any expression.  Most notably, not even a CAST to a domain
type produces the domain type.  Per SQL:2021 6.13 <cast specification>
syntax rules:

    1) Case:
        a) If a <domain name> is specified, then let TD be the data
        type of the specified domain.

        b) If a <data type> is specified, then let TD be the data type
        identified by <data type>. <data type> shall not contain a
        <collate clause>.

    2) The declared type of the result of the <cast specification> is TD.

Even more amusingly for our current purposes, CAST does not enforce
NOT NULL.  <cast specification> general rule 2:

    2) Case:
        a) If the <cast operand> specifies NULL, then the result of CS
        is the null value and no further General Rules of this
        Subclause are applied.

        b) If the <cast operand> specifies an <empty specification>,
        then the result of CS is an empty collection of declared type
        TD and no further General Rules of this Subclause are applied.

        c) If SV is the null value, then the result of CS is the null
        value and no further General Rules of this Subclause are
        applied.

So for a null value the spec never reaches GR 23 that says to apply
the domain's constraints.

This is already a sufficient intellectual muddle that I'm not sure
we want to follow it slavishly.  If not-null can be ignored here,
why not elsewhere?

But anyway, yeah, the spec's notion of a domain bears only passing
resemblance to what we've actually implemented.  I'm not really sure
that we want to switch, because AFAICS the spec's model doesn't
include any of these things:

* Domains over other domains

* Domains over arrays, composite types, etc

* Functions accepting or returning domain types

If we were to try to do something closer to what the spec has in mind,
how would we do it without ripping out a ton of functionality that
people have requested and come to depend on?

> Section 4.25.4, "Domain constraints" has this to say (emphasis mine):
>
> - A domain constraint is satisfied by SQL-data *if and only if*, for
> every *table* T that has a column named C based on that domain, the
> applicable <search condition> recorded in the appropriate domain
> constraint usage evaluates to True or Unknown.

I think that isn't particularly relevant, because I believe that by
SQL-data they mean the static contents of a database, so of course
only table contents matter.  What we are concerned about is dynamic
behavior within queries and functions.

> Secondly, why are you so concerned about outer join nulls here and not
> for any other column marked NOT NULL?

Primarily because that's an easy way for a column that was marked
NOT NULL to read out as NULL.

>> That's fundamentally inconsistent.  It's like claiming that
>> 'foobarbaz' is a valid value of type numeric as long as it's
>> only in flight within a query and you haven't tried to store it
>> into a table.

> It's like claiming that null is a valid value of type numeric as long as
> it's only in flight within a query and you haven't tried to store it
> into a table with that column marked NOT NULL.

And?  NULL *is* a valid value of type numeric, as well as all other
base types.

> Allowing a null to be stored in a column where the user has specified
> NOT NULL, no matter how the user did that, is unacceptable and I am
> frankly surprised that you are defending it.

What I'm trying to hold onto is the notion that a domain can
meaningfully be considered to be a data type (that is, that a value in
flight can be considered to be of a domain type).  We've been building
the system on that assumption for over twenty years now, and I think
it's pretty deeply ingrained.  I don't understand the consequences
of abandoning it, and I'm not convinced that the spec's model is
sufficiently intellectually rigorous that we can just say "oh, we'll
follow the spec instead of what we've been doing, and it'll be fine".

As a trivial example: our implementation assumes that enforcing a
domain's constraints is to be done by casting the base type value
to the domain type.  Per the above reading of <6.13>, this should
fail to reject nulls, so we'd have to understand and implement
checking of domain constraints in some other way.

Given the exception the spec makes for CAST, I wonder if we shouldn't
just say "NULL is a valid value of every domain type, as well as every
base type.  If you don't like it, too bad; write a separate NOT NULL
constraint for your table column."

                        regards, tom lane

Re: PostgreSQL domains and NOT NULL constraint

От
Tom Lane
Дата:
I wrote:
> Given the exception the spec makes for CAST, I wonder if we shouldn't
> just say "NULL is a valid value of every domain type, as well as every
> base type.  If you don't like it, too bad; write a separate NOT NULL
> constraint for your table column."

After ruminating on this for awhile, here's a straw-man proposal:

1. Domains are data types, with the proviso that NULL is always
a valid value no matter what the domain constraints might say.
Implementation-wise, this'd just require that CoerceToDomain
immediately return any null input without checking the constraints.
This has two big attractions:

(1A) It satisfies the plain language of the SQL spec about how
CAST to a domain type behaves.

(1B) It legitimizes our behavior of allowing nullable outer join
columns, sub-SELECT outputs, etc to be considered to be of the
source column's domain type and not just the base type.

2. In INSERT and UPDATE queries, thumb through the constraints of
any domain-typed target columns to see if any of them are NOT NULL
or CHECK(VALUE IS NOT NULL).  If so, act as though there's a table
NOT NULL constraint on that column.

The idea of point #2 is to have a cheap check that 99% satisfies
what the spec says about not-null constraints on domains.  If we
don't do #2, I think we have to fully recheck all the domain's
constraints during column assignment.  I find that ugly as well
as expensive performance-wise.  It does mean that if you have
some domain constraint that would act to reject NULLs, but it's
spelled in some weird way, it won't reject NULLs.  I don't find
that possibility compelling enough to justify the performance hit
of recomputing every constraint just in case it acts like that.

3. Left unsaid here is whether we should treat assignments to,
e.g., plpgsql variables as acting like assignments to table
columns.  I'm inclined not to, because

(3A) I'm lazy, and I'm also worried that we'd miss places where
this arguably should happen.

(3B) I don't think the SQL spec contemplates any such thing
happening.

(3C) Not doing that means we have a pretty consistent view of
what the semantics are for "values in flight" within a query.
Anything that's not stored in a table is "in flight" and so
can be NULL.

(3D) Again, if you don't like it, there's already ways to attach
a separate NOT NULL constraint to plpgsql variables.


Documenting this in an intelligible fashion might be tricky,
but explaining the exact spec-mandated behavior wouldn't be
much fun either.

Thoughts?

            regards, tom lane



Re: PostgreSQL domains and NOT NULL constraint

От
Isaac Morland
Дата:
On Mon, 23 Oct 2023 at 13:40, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wrote:
> Given the exception the spec makes for CAST, I wonder if we shouldn't
> just say "NULL is a valid value of every domain type, as well as every
> base type.  If you don't like it, too bad; write a separate NOT NULL
> constraint for your table column."

After ruminating on this for awhile, here's a straw-man proposal:
 
[....]
 
3. Left unsaid here is whether we should treat assignments to,
e.g., plpgsql variables as acting like assignments to table
columns.  I'm inclined not to, because

(3A) I'm lazy, and I'm also worried that we'd miss places where
this arguably should happen.

(3B) I don't think the SQL spec contemplates any such thing
happening.

(3C) Not doing that means we have a pretty consistent view of
what the semantics are for "values in flight" within a query.
Anything that's not stored in a table is "in flight" and so
can be NULL.

(3D) Again, if you don't like it, there's already ways to attach
a separate NOT NULL constraint to plpgsql variables.


Documenting this in an intelligible fashion might be tricky,
but explaining the exact spec-mandated behavior wouldn't be
much fun either.

This sounds pretty good.

I'd be OK with only running the CHECK clause on non-NULL values. This would imply that "CHECK (VALUE NOT NULL)" would have exactly the same effect as "CHECK (TRUE)" (i.e., no effect). This might seem insane but it avoids a special case and in any event if somebody wants the NOT NULL behaviour, they can get it by specifying NOT NULL in the CREATE DOMAIN command.

Then domain CHECK constraints are checked anytime a non-NULL value is turned into a domain value, and NOT NULL ones are checked only when storing to a table. CHECK constraints would be like STRICT functions; if the input is NULL, the implementation is not run and the result is NULL (which for a CHECK means accept the input).

Whether I actually think the above is a good idea would require me to read carefully the relevant section of the SQL spec. If it agrees that CHECK () is for testing non-NULL values and NOT NULL is for saying that columns of actual tables can't be NULL, then I would probably agree with my own idea, otherwise perhaps not depending on exactly what it said.

Some possible documentation wording to consider for the CREATE DOMAIN page:

Under "NOT NULL": "Table columns whose data type is this domain may not be NULL, exactly as if NOT NULL had been given in the column specification."

Under "NULL": "This is a noise word indicating the default, which is that the domain does not restrict NULL from occurring in table columns whose data type is this domain."

Under "CHECK (expression)", replacing the first sentence: "CHECK clauses specify integrity constraints or tests which non-NULL values of the domain must satisfy; NULLs are never checked by domain CHECK clauses. To use a domain to prevent a NULL from occurring in a table column, use the NOT NULL clause."

Also, where it says "Expressions evaluating to TRUE or UNKNOWN succeed": Do we really mean "Expressions evaluating to TRUE or NULL succeed"?

It would be nice if we had universally agreed terminology so that we would have one word for the non-NULL things of various data types, and another word for the possibly NULL things that might occur in variable or column.

If we decide we do want "CHECK (VALUE NOT NULL)" to work, then I wonder if we could pass NULL to the constraint at CREATE DOMAIN time, and if it returns FALSE, do exactly what we would have done (set pg_type.typnotnull) if an actual NOT NULL clause had been specified? Then when actually processing domain constraints during a query, we could use the above procedure. I'm thinking about more complicated constraints that evaluate to FALSE for NULL but which are not simply "CHECK (VALUE NOT NULL)".

Is it an error to specify both NULL and NOT NULL? What about CHECK (VALUE NOT NULL) and NULL?

Re: PostgreSQL domains and NOT NULL constraint

От
Pavel Stehule
Дата:


po 23. 10. 2023 v 19:34 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
I wrote:
> Given the exception the spec makes for CAST, I wonder if we shouldn't
> just say "NULL is a valid value of every domain type, as well as every
> base type.  If you don't like it, too bad; write a separate NOT NULL
> constraint for your table column."

After ruminating on this for awhile, here's a straw-man proposal:

1. Domains are data types, with the proviso that NULL is always
a valid value no matter what the domain constraints might say.
Implementation-wise, this'd just require that CoerceToDomain
immediately return any null input without checking the constraints.
This has two big attractions:

(1A) It satisfies the plain language of the SQL spec about how
CAST to a domain type behaves.

(1B) It legitimizes our behavior of allowing nullable outer join
columns, sub-SELECT outputs, etc to be considered to be of the
source column's domain type and not just the base type.

2. In INSERT and UPDATE queries, thumb through the constraints of
any domain-typed target columns to see if any of them are NOT NULL
or CHECK(VALUE IS NOT NULL).  If so, act as though there's a table
NOT NULL constraint on that column.

+1

I think only this interpretation makes sense.


The idea of point #2 is to have a cheap check that 99% satisfies
what the spec says about not-null constraints on domains.  If we
don't do #2, I think we have to fully recheck all the domain's
constraints during column assignment.  I find that ugly as well
as expensive performance-wise.  It does mean that if you have
some domain constraint that would act to reject NULLs, but it's
spelled in some weird way, it won't reject NULLs.  I don't find
that possibility compelling enough to justify the performance hit
of recomputing every constraint just in case it acts like that.

3. Left unsaid here is whether we should treat assignments to,
e.g., plpgsql variables as acting like assignments to table
columns.  I'm inclined not to, because

(3A) I'm lazy, and I'm also worried that we'd miss places where
this arguably should happen.

(3B) I don't think the SQL spec contemplates any such thing
happening.

(3C) Not doing that means we have a pretty consistent view of
what the semantics are for "values in flight" within a query.
Anything that's not stored in a table is "in flight" and so
can be NULL.

(3D) Again, if you don't like it, there's already ways to attach
a separate NOT NULL constraint to plpgsql variables.

Although I don't fully like it, I think ignoring the NOT NULL constraint for plpgsql's variables is a better way, then apply it. Elsewhere there can be issues related to variable's initialization.

Regards

Pavel




 


Documenting this in an intelligible fashion might be tricky,
but explaining the exact spec-mandated behavior wouldn't be
much fun either.

Thoughts?

                        regards, tom lane


Re: PostgreSQL domains and NOT NULL constraint

От
Vik Fearing
Дата:
On 10/23/23 20:36, Isaac Morland wrote:
> Also, where it says "Expressions evaluating to TRUE or UNKNOWN succeed": 
> Do we really mean "Expressions evaluating to TRUE or NULL succeed"?

No, UNKNOWN is the correct nomenclature for booleans.
-- 
Vik Fearing




Re: PostgreSQL domains and NOT NULL constraint

От
Vik Fearing
Дата:
On 10/23/23 18:53, Tom Lane wrote:
> 1. Domains are data types, with the proviso that NULL is always a valid 
> value no matter what the domain constraints might say. 
> Implementation-wise, this'd just require that CoerceToDomain immediately 
> return any null input without checking the constraints. This has two big 
> attractions:


> (1A) It satisfies the plain language of the SQL spec about 
> how CAST to a domain type behaves.


I agree with all of your proposal, except for this part.  I think the 
shortcut in the General Rules of <cast specification> is an oversight 
and I plan on submitting a paper to fix it.  The intention is, in my 
view, clearly to check the constraints upon casting.  What other 
explanation is there since the result type is still the domain's base 
type[*]?


[*] In the standard, not in our superior implementation of it.
-- 
Vik Fearing




Re: PostgreSQL domains and NOT NULL constraint

От
Tom Lane
Дата:
Vik Fearing <vik@postgresfriends.org> writes:
> On 10/23/23 18:53, Tom Lane wrote:
>> (1A) It satisfies the plain language of the SQL spec about 
>> how CAST to a domain type behaves.

> I agree with all of your proposal, except for this part.  I think the 
> shortcut in the General Rules of <cast specification> is an oversight 
> and I plan on submitting a paper to fix it.

Yeah, it might be a bug in the spec, but if so the bug has been there
since SQL92 without anyone noticing.  SQL92 has GR2 as

         2) Case:

            a) If the <cast operand> specifies NULL or if SV is the null
              value, then the result of the <cast specification> is the
              null value.

SQL99 revised the text some, but without changing that outcome.
Then in SQL:2003 they doubled down on the point:

    a) If the <cast operand> specifies NULL, then TV is the null value and
    no further General Rules of this Subclause are applied.

    b) If the <cast operand> specifies an <empty specification>, then TV
    is an empty collection of declared type TD and no further General
    Rules of this Subclause are applied.

    c) If SV is the null value, then TV is the null value and no further
    General Rules of this Subclause are applied.

You're suggesting that nobody noticed that this wording requires NULLs
to skip the domain checks?  Maybe, but I think it must be intentional.
I'll await the committee's reaction with interest.

            regards, tom lane



Re: PostgreSQL domains and NOT NULL constraint

От
Tom Lane
Дата:
Isaac Morland <isaac.morland@gmail.com> writes:
> Then domain CHECK constraints are checked anytime a non-NULL value is
> turned into a domain value, and NOT NULL ones are checked only when storing
> to a table. CHECK constraints would be like STRICT functions; if the input
> is NULL, the implementation is not run and the result is NULL (which for a
> CHECK means accept the input).

Right.

> Whether I actually think the above is a good idea would require me to read
> carefully the relevant section of the SQL spec. If it agrees that CHECK ()
> is for testing non-NULL values and NOT NULL is for saying that columns of
> actual tables can't be NULL, then I would probably agree with my own idea,
> otherwise perhaps not depending on exactly what it said.

The spec doesn't actually allow bare NOT NULL as a domain constraint;
it only has CHECK constraints.  Of course you can write CHECK(VALUE
IS NOT NULL), or more-complicated things that will reject a NULL,
but they're effectively ignored during CAST and applied only when
storing to a table column.

I think we decided to implement NOT NULL because it seemed like an
odd wart not to have it if you could do the CHECK equivalent.
In the light of this new understanding, though, I bet they omitted
it deliberately because it'd be too-obviously-inconsistent behavior.

In any case, we can't drop the NOT NULL option now without breaking
apps.  I think it should continue to behave exactly the same as
"CHECK(VALUE IS NOT NULL)".

> If we decide we do want "CHECK (VALUE NOT NULL)" to work, then I wonder if
> we could pass NULL to the constraint at CREATE DOMAIN time, and if it
> returns FALSE, do exactly what we would have done (set pg_type.typnotnull)
> if an actual NOT NULL clause had been specified?

Maybe, but then ALTER DOMAIN would have to be prepared to update that
flag when adding or dropping constraints.  Perhaps that's better than
checking on-the-fly during DML commands, though.

            regards, tom lane



Re: PostgreSQL domains and NOT NULL constraint

От
Tom Lane
Дата:
I wrote:
> Isaac Morland <isaac.morland@gmail.com> writes:
>> If we decide we do want "CHECK (VALUE NOT NULL)" to work, then I wonder if
>> we could pass NULL to the constraint at CREATE DOMAIN time, and if it
>> returns FALSE, do exactly what we would have done (set pg_type.typnotnull)
>> if an actual NOT NULL clause had been specified?

> Maybe, but then ALTER DOMAIN would have to be prepared to update that
> flag when adding or dropping constraints.  Perhaps that's better than
> checking on-the-fly during DML commands, though.

After further thought I like that idea a lot, but we can't simply
overwrite pg_type.typnotnull without losing track of whether the user
had given a bare NOT NULL constraint.  Instead I think the details
should be like this:

1. Add a bool column "connotnull" (or some such name) to pg_constraint.
Set this to true when the constraint is a domain CHECK constraint that
returns FALSE for NULL input.  (In future we could maintain the flag
for table CHECK constraints too, perhaps, but I don't see value in
that right now.)  This requires assuming that the constraint is
immutable (which we assume already) and that it's okay to evaluate it
on a NULL immediately during CREATE DOMAIN or ALTER DOMAIN ADD
CONSTRAINT.  It seems possible that that could fail, but only with
rather questionable choices of constraints.

2. INSERT/UPDATE enforce not-nullness if pg_type.typnotnull is set
or there is any domain constraint with pg_constraint.connotnull
set.  This still requires thumbing through the constraints at
query start, but the check is cheaper and a good deal more bulletproof
than my previous suggestion of a purely-syntactic check.

We could make query start still cheaper by adding another pg_type
column that is the OR of the associated constraints' connotnull
flags, but I suspect it's not worth the trouble.  The typcache
can probably maintain that info with epsilon extra cost.

A variant approach could be to omit the catalog changes and have
this state be tracked entirely by the typcache.  That'd result in
rather more trial evaluations of the domain constraints on NULLs,
but it would have the advantage of not requiring any constraint
evaluations to occur during CREATE/ALTER DOMAIN, only during startup
of a query that's likely to evaluate them anyway.  That'd reduce
the odds of breaking things thanks to search_path dependencies
and suchlike.

            regards, tom lane



Re: PostgreSQL domains and NOT NULL constraint

От
Matthias van de Meent
Дата:
On Mon, 23 Oct 2023, 19:34 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
>
> I wrote:
> > Given the exception the spec makes for CAST, I wonder if we shouldn't
> > just say "NULL is a valid value of every domain type, as well as every
> > base type.  If you don't like it, too bad; write a separate NOT NULL
> > constraint for your table column."
>
> After ruminating on this for awhile, here's a straw-man proposal:
>
> 1. Domains are data types, with the proviso that NULL is always
> a valid value no matter what the domain constraints might say.
> Implementation-wise, this'd just require that CoerceToDomain
> immediately return any null input without checking the constraints.
> This has two big attractions:

Agreed.

> 2. In INSERT and UPDATE queries, thumb through the constraints of
> any domain-typed target columns to see if any of them are NOT NULL
> or CHECK(VALUE IS NOT NULL).  If so, act as though there's a table
> NOT NULL constraint on that column.

How does this work w.r.t. concurrently created tables that contain the
domain? Right now, you can do something along the lines of the
following due to a lack of locking on domains for new columns/tables
that use said domain, and I believe that this is the main source of
domain constraint violations:

CREATE DOMAIN mydomain text;
CREATE TABLE c (d mydomain);

S1: BEGIN; INSERT INTO c VALUES (''); CREATE TABLE t (d mydomain);
INSERT INTO t VALUES (NULL);

S2: BEGIN; ALTER DOMAIN mydomain SET NOT NULL;
-- waits for S1 to release lock on c

S1: COMMIT;
-- S2's ALTER DOMAIN gets unblocked and succeeds, despite the NULL
value in "t" because that table is invisible to the transaction of
ALTER DOMAIN.

So my base question is, should we then require e.g. SHARE locks on
types that depend on domains when we do DDL that depends on the type,
and SHARE UPDATE EXCLUSIVE when we modify the type?

> The idea of point #2 is to have a cheap check that 99% satisfies
> what the spec says about not-null constraints on domains.  If we
> don't do #2, I think we have to fully recheck all the domain's
> constraints during column assignment.  I find that ugly as well
> as expensive performance-wise.  It does mean that if you have
> some domain constraint that would act to reject NULLs, but it's
> spelled in some weird way, it won't reject NULLs.  I don't find
> that possibility compelling enough to justify the performance hit
> of recomputing every constraint just in case it acts like that.

Makes sense.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



Re: PostgreSQL domains and NOT NULL constraint

От
Tom Lane
Дата:
Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> On Mon, 23 Oct 2023, 19:34 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
>> After ruminating on this for awhile, here's a straw-man proposal:
>> ...

> How does this work w.r.t. concurrently created tables that contain the
> domain?

It wouldn't change that at all I think.  I had noticed that we'd
probably need to tweak validateDomainConstraint() to ensure it applies
the same semantics that INSERT/UPDATE do --- although with Isaac's
idea to enable better tracking of which constraints will fail on NULL,
maybe just a blind application of the constraint expression will still
be close enough.

I agree that concurrent transactions can create violations of the new
constraint, but (a) that's true now, (b) I have no good ideas about
how to improve it, and (c) it seems like an independent problem.

            regards, tom lane