Обсуждение: SERIAL datatype column skipping values.
Hi all,
Please check the below behavior for the "SERIAL" datatype.
postgres=# CREATE TABLE t1(c1 int, c2 serial);
CREATE TABLE
postgres=# insert into t1 values (generate_series(1,3));
CREATE TABLE
postgres=# insert into t1 values (generate_series(1,3));
INSERT 0 3
postgres=# insert into t1 values (generate_series(4,6));
INSERT 0 3
postgres=# select * from t1;
c1 | c2
----+----
1 | 1
2 | 2
3 | 3
4 | 5
5 | 6
6 | 7
(6 rows)
postgres=# insert into t1 values (generate_series(4,6));
INSERT 0 3
postgres=# select * from t1;
c1 | c2
----+----
1 | 1
2 | 2
3 | 3
4 | 5
5 | 6
6 | 7
(6 rows)
In this above case, the serial column "c2" is skipping the value "4" in select output.
Is this an expected behavior?
With Regards,
Prabhat Kumar Sahu
EnterpriseDB: http://www.enterprisedb.com
On 3/11/20 11:15 AM, Prabhat Sahu wrote:
> Hi all,
> Please check the below behavior for the "SERIAL" datatype.
>
> [...]
>
> In this above case, the serial column "c2" is skipping the value "4" in
> select output.
> Is this an expected behavior?
Curious, it seems like DEFAULT expressions of a table are executed an
extra time if a set returning function is used like in your example. And
the SERIAL type is implemented using DEFAULT.
On the other hand if you use "INSERT ... SELECT" the DEFAULT expression
is only executed once per row inserted.
# CREATE FUNCTION test_default() RETURNS int LANGUAGE plpgsql AS $$
BEGIN
RAISE NOTICE 'Ran test_default()';
RETURN 42;
END
$$;
CREATE FUNCTION
# CREATE TABLE t2 (c1 int, c2 int DEFAULT test_default());
CREATE TABLE
# INSERT INTO t2 VALUES (generate_series(1,2));
NOTICE: Ran test_default()
NOTICE: Ran test_default()
NOTICE: Ran test_default()
INSERT 0 2
# INSERT INTO t2 SELECT generate_series(1,2);
NOTICE: Ran test_default()
NOTICE: Ran test_default()
INSERT 0 2
Andreas
Andreas Karlsson <andreas@proxel.se> writes:
> On 3/11/20 11:15 AM, Prabhat Sahu wrote:
>> Is this an expected behavior?
> Curious, it seems like DEFAULT expressions of a table are executed an
> extra time if a set returning function is used like in your example. And
> the SERIAL type is implemented using DEFAULT.
Yeah, it's the same as if you do
regression=# select generate_series(1,2), test_default();
NOTICE: Ran test_default()
NOTICE: Ran test_default()
NOTICE: Ran test_default()
generate_series | test_default
-----------------+--------------
1 | 42
2 | 42
(2 rows)
The generated plan is
regression=# explain verbose select generate_series(1,2), test_default();
QUERY PLAN
-------------------------------------------------
ProjectSet (cost=0.00..0.28 rows=2 width=8)
Output: generate_series(1, 2), test_default()
-> Result (cost=0.00..0.01 rows=1 width=0)
(3 rows)
and if you read nodeProjectSet.c you'll see that it needs to evaluate
the target list three times. On the third iteration, generate_series()
returns isdone == ExprEndResult indicating that it has no more results,
so we don't emit an output tuple --- but we still run test_default()
while scanning the tlist.
Possibly the planner should try to avoid putting volatile expressions
into ProjectSet's tlist. On the other hand, it's worked this way for
an awfully long time, so I wonder if anyone is relying on the behavior.
Even in versions before we used ProjectSet nodes, you still see three
calls to the volatile function.
Anyway, to get back to the OP's implied question, no you should never
assume that a SERIAL column's values won't have holes in the sequence.
Rolled-back transactions will have that effect in any case.
regards, tom lane