Re: COPY WHERE clause generated/system column reference
| От | Masahiko Sawada | 
|---|---|
| Тема | Re: COPY WHERE clause generated/system column reference | 
| Дата | |
| Msg-id | CAD21AoB_X9e1zcfONnZ-nfzOX==9FULD7DX03y35r+bfggfKpA@mail.gmail.com обсуждение исходный текст  | 
		
| Ответ на | COPY WHERE clause generated/system column reference (jian he <jian.universality@gmail.com>) | 
| Ответы | 
                	
            		Re: COPY WHERE clause generated/system column reference
            		
            		 | 
		
| Список | pgsql-hackers | 
On Mon, Oct 27, 2025 at 1:21 AM jian he <jian.universality@gmail.com> wrote:
>
> hi.
>
> CREATE TABLE gtest0 (a int, b int GENERATED ALWAYS AS (a + 1) VIRTUAL);
> copy gtest0 from stdin where (b <> 1);
> 0
> \.
>
> ERROR:  unexpected virtual generated column reference
> CONTEXT:  COPY gtest0, line 1: "0"
>
> We need to apply expand_generated_columns_in_expr to the whereClause in DoCopy.
> However, handling STORED generated columns appears to be less straightforward.
>
> currently:
> ExecQual(cstate->qualexpr, econtext))
> happen before
> ExecComputeStoredGenerated.
>
> when calling ExecQual, the stored generated column values have not been
> populated yet. so we may need ExecComputeStoredGenerated beforehand.
> Since ExecComputeStoredGenerated is likely expensive, I added logic to detect
> whether the WHERE clause actually have stored generated columns reference before
> calling it.
While I agree we can improve the error message in that case as the
message "unexpected virtual generated column reference" isn't helpful
much, I'm not sure that generated column values should be considered
when filtering rows by the WHERE clause. The documentation[1] says:
A stored generated column is computed when it is written (inserted or
updated) and occupies storage as if it were a normal column. A virtual
generated column occupies no storage and is computed when it is read.
The proposed patch (the 0002 patch) allows COPY FROM ... WHERE to
filter rows by checking tuples including generated column values but
it's somewhat odd as it seems not to be the time of reading tuples
from a table.
Also, the patch calls ExecComputeStoredGenerated() before ExecQual(),
which is also before we trigger the BEFORE INSERT trigger. It clearly
violates what the documentation describes[1]:
Generated columns are, conceptually, updated after BEFORE triggers
have run. Therefore, changes made to base columns in a BEFORE trigger
will be reflected in generated columns. But conversely, it is not
allowed to access generated columns in BEFORE triggers.
For example, the tuples passed to a BEFORE INSERT trigger varies
depending on the WHERE clause as follows:
-- preparation
create table t (a int, s int generated always as (a + 10) stored);
create table tt (a int, s int);
create function trig_fn() returns trigger as
$$
begin
    insert into tt select NEW.*;
    return NEW;
end;
$$ language plpgsql;
create trigger trig before insert on t for each row execute function trig_fn();
-- copy a row without the WHERE clause.
copy t from program 'echo 1';
table tt;
 a | s
---+---
 1 |
(1 row)
-- copy a row with the where clause
copy t from program 'echo 1' where s > 0;
table tt;
 a | s
---+----
 1 |
 1 | 11
> generated column allow tableoid system column reference, COPY WHERE clause also
> allow tableoid column reference, should be fine.
>
> please check the attached file:
> v1-0001 fix COPY WHERE with system column reference
It seems to make sense to disallow users to specify system columns in
the WHERE clause of COPY FROM. But why do we need to have an exception
for tableoid? In the context of COPY FROM, specifying tableoid doesn't
not make sense to me as tuples don't come from any relations. If we
accept tableoid, I think it's better to explain why here.
Regards,
[1] https://www.postgresql.org/docs/devel/ddl-generated-columns.html#DDL-GENERATED-COLUMNS
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
		
	В списке pgsql-hackers по дате отправления: