Обсуждение: Missing semicolumn in anonymous plpgsql block does not raise syntax error

Поиск
Список
Период
Сортировка

Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
Mor Lehr
Дата:
Hi,

I would like to report a potential bug in postgres 15.4, also reproduced on 15.6.

The exact sequence of steps:
Connect to a postgres 15.4 database and run the following statements:

CREATE TABLE foo3(id serial PRIMARY key, txt text);

INSERT INTO foo3 (txt) VALUES ('aaa'),('bbb');

DO $$

DECLARE

l_cnt int;

BEGIN

l_cnt := 1

DELETE FROM foo3 WHERE id=1;

END; $$;


The output you got:

1. The script passes (no error message) even though there's a missing semicolon (;) after "l_cnt := 1"
2. The script doesn't actually delete the record from foo3


This caused us a production issue where we thought changes were applied (script passed successfully) but changes weren't actually applied.


If I move the line "l_cnt := 1" to after the DELETE statement like so:

DO $$

DECLARE

l_cnt int;

BEGIN

DELETE FROM foo3 WHERE id=1;

l_cnt := 1

END; $$;

I get the error - as expected:

SQL Error [42601]: ERROR: syntax error at end of input
  Position: 89


Furthermore, replacing the DELETE statement with an UPDATE statement in the original code does raise an error:

DO $$

DECLARE

l_cnt int;

BEGIN

l_cnt := 1

UPDATE foo3 SET txt='ccc' WHERE id=1;

END; $$;

SQL Error [42601]: ERROR: syntax error at or near "foo3"
  Position: 62

But adding the semicolon - it works correctly with either UPDATE or DELETE.


I ran the original code using the following clients to make sure it's not a client problem:

1. psql

2. DBeaver using standard JDBC drivers

3. Flyway using JDBC drivers




Versions:

PostgreSQL 15.6 (Homebrew) on x86_64-apple-darwin23.2.0, compiled by Apple clang version 15.0.0 (clang-1500.1.0.2.5), 64-bit l - running locally on my MacBook


PostgreSQL 15.4 on aarch64-unknown-linux-gnu, compiled by aarch64-unknown-linux-gnu-gcc (GCC) 9.5.0, 64-bit - running on AWS RDS Aurora


Installed Extensions (On AWS RDS):

 oid       |extname                  |extowner|extnamespace|extrelocatable|extversion|extconfig   |extcondition|
----------+-------------------------+--------+------------+--------------+----------+------------+------------+
     16463|btree_gist               |      10|        2200|true          |1.7       |NULL        |NULL        |
1463651797|deel_password_check_rules|   16399|        2200|false         |1.0       |NULL        |NULL        |
     16464|fuzzystrmatch            |      10|        2200|true          |1.1       |NULL        |NULL        |
 958297705|pg_repack                |      10|        2200|false         |1.4.8     |NULL        |NULL        |
     16465|pg_stat_statements       |      10|        2200|true          |1.9       |NULL        |NULL        |
1463506085|pg_tle                   |      10|  1463506084|false         |1.1.1     |{1463506117}|{""}        |
     16467|pg_trgm                  |      10|        2200|true          |1.6       |NULL        |NULL        |
     16468|pgcrypto                 |      10|        2200|true          |1.3       |NULL        |NULL        |
     14498|plpgsql                  |      10|          11|false         |1.0       |NULL        |NULL        |
     16469|postgres_fdw             |      10|        2200|true          |1.1       |NULL        |NULL        |
     16470|tablefunc                |      10|        2200|true          |1.0       |NULL        |NULL        |
     16471|unaccent                 |      10|        2200|true          |1.1       |NULL        |NULL        |
     16472|uuid-ossp                |      10|        2200|true          |1.1       |NULL        |NULL        |


Please let me know what other information I can provide.


Thanks,

Mor

Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
"David G. Johnston"
Дата:
On Sunday, June 2, 2024, Mor Lehr <mor.lehr@deel.com> wrote:
Hi,

I would like to report a potential bug in postgres 15.4, also reproduced on 15.6.

The exact sequence of steps:
Connect to a postgres 15.4 database and run the following statements:

CREATE TABLE foo3(id serial PRIMARY key, txt text);

INSERT INTO foo3 (txt) VALUES ('aaa'),('bbb');

DO $$

DECLARE

l_cnt int;

BEGIN

l_cnt := 1

DELETE FROM foo3 WHERE id=1;

END; $$;


The output you got:

1. The script passes (no error message) even though there's a missing semicolon (;) after "l_cnt := 1"
2. The script doesn't actually delete the record from foo3



I think you just wrote the equivalent of:

l_cnt := (select 1 as delete from foo3 where id=1);

Which is a valid query.

David J.


Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
Mor Lehr
Дата:
Thanks for the prompt reply.
Can you please refer me to the section in the documentation that describes this behavior?
This (automatically interperting 1 as select 1) is totally an unexpected behavior for me.

Thanks again.
-Mor.


On Sun, Jun 2, 2024, 18:19 David G. Johnston <david.g.johnston@gmail.com> wrote:
On Sunday, June 2, 2024, Mor Lehr <mor.lehr@deel.com> wrote:
Hi,

I would like to report a potential bug in postgres 15.4, also reproduced on 15.6.

The exact sequence of steps:
Connect to a postgres 15.4 database and run the following statements:

CREATE TABLE foo3(id serial PRIMARY key, txt text);

INSERT INTO foo3 (txt) VALUES ('aaa'),('bbb');

DO $$

DECLARE

l_cnt int;

BEGIN

l_cnt := 1

DELETE FROM foo3 WHERE id=1;

END; $$;


The output you got:

1. The script passes (no error message) even though there's a missing semicolon (;) after "l_cnt := 1"
2. The script doesn't actually delete the record from foo3



I think you just wrote the equivalent of:

l_cnt := (select 1 as delete from foo3 where id=1);

Which is a valid query.

David J.


Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
"David G. Johnston"
Дата:
On Sunday, June 2, 2024, Mor Lehr <mor.lehr@deel.com> wrote:
Thanks for the prompt reply.
Can you please refer me to the section in the documentation that describes this behavior?
This (automatically interperting 1 as select 1) is totally an unexpected behavior for me.


“As explained previously, the expression in such a statement is evaluated by means of an SQL SELECT command sent to the main database engine.”

David J.

Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
Mor Lehr
Дата:
Thanks for the reference.
We learn new stuff every day.

You can close the case.
Thanks, Mor


On Sun, Jun 2, 2024, 18:31 David G. Johnston <david.g.johnston@gmail.com> wrote:
On Sunday, June 2, 2024, Mor Lehr <mor.lehr@deel.com> wrote:
Thanks for the prompt reply.
Can you please refer me to the section in the documentation that describes this behavior?
This (automatically interperting 1 as select 1) is totally an unexpected behavior for me.


“As explained previously, the expression in such a statement is evaluated by means of an SQL SELECT command sent to the main database engine.”

David J.

Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
Tom Lane
Дата:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> I think you just wrote the equivalent of:
> l_cnt := (select 1 as delete from foo3 where id=1);
> Which is a valid query.

Still another example of the folly of letting AS be optional.
I don't suppose we can ever undo that though.

            regards, tom lane



Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
Erik Wienhold
Дата:
On 2024-06-03 00:18 +0200, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > I think you just wrote the equivalent of:
> > l_cnt := (select 1 as delete from foo3 where id=1);
> > Which is a valid query.
> 
> Still another example of the folly of letting AS be optional.
> I don't suppose we can ever undo that though.

How about inventing an opt-in strict mode (like in Perl or JavaScript)
that prevents certain footguns?  For example, disallowing bare column
labels.

That could be enabled for the current session or transaction:

    SET strict_parsing = { on | off };

Or just for individual routines:

    CREATE PROCEDURE myproc()
        SET strict_parsing = { on | off }
        LANGUAGE plpgsql ...

-- 
Erik



Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
Pavel Stehule
Дата:


po 3. 6. 2024 v 18:46 odesílatel Erik Wienhold <ewie@ewie.name> napsal:
On 2024-06-03 00:18 +0200, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > I think you just wrote the equivalent of:
> > l_cnt := (select 1 as delete from foo3 where id=1);
> > Which is a valid query.
>
> Still another example of the folly of letting AS be optional.
> I don't suppose we can ever undo that though.

How about inventing an opt-in strict mode (like in Perl or JavaScript)
that prevents certain footguns?  For example, disallowing bare column
labels.

That could be enabled for the current session or transaction:

    SET strict_parsing = { on | off };

Or just for individual routines:

    CREATE PROCEDURE myproc()
        SET strict_parsing = { on | off }
        LANGUAGE plpgsql ...


Probably it is not bad idea - it can be generally useful

But I think it is better to introduce a new entry for plpgsql expressions in gram.y.

Unfortunately it is not a compatible change. Years ago was popular to use a pattern

a := tab.a FROM tab

instead correct

a := (SELECT tab.a FROM tab)

or

SELECT tab.a FROM tab INTO a;

Regards

Pavel

 
--
Erik


Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
Mor Lehr
Дата:
How about inventing an opt-in strict mode
 
That can be useful at the session level, because we use anonymous blocks quite often.
I assume if such a setting existed - we would have used it in the original scenario.

Thanks again,
-Mor

On Mon, Jun 3, 2024 at 9:12 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


po 3. 6. 2024 v 18:46 odesílatel Erik Wienhold <ewie@ewie.name> napsal:
On 2024-06-03 00:18 +0200, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > I think you just wrote the equivalent of:
> > l_cnt := (select 1 as delete from foo3 where id=1);
> > Which is a valid query.
>
> Still another example of the folly of letting AS be optional.
> I don't suppose we can ever undo that though.

How about inventing an opt-in strict mode (like in Perl or JavaScript)
that prevents certain footguns?  For example, disallowing bare column
labels.

That could be enabled for the current session or transaction:

    SET strict_parsing = { on | off };

Or just for individual routines:

    CREATE PROCEDURE myproc()
        SET strict_parsing = { on | off }
        LANGUAGE plpgsql ...


Probably it is not bad idea - it can be generally useful

But I think it is better to introduce a new entry for plpgsql expressions in gram.y.

Unfortunately it is not a compatible change. Years ago was popular to use a pattern

a := tab.a FROM tab

instead correct

a := (SELECT tab.a FROM tab)

or

SELECT tab.a FROM tab INTO a;

Regards

Pavel

 
--
Erik


Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
Pavel Stehule
Дата:
Hi

út 4. 6. 2024 v 8:39 odesílatel Mor Lehr <mor.lehr@deel.com> napsal:
How about inventing an opt-in strict mode
 
That can be useful at the session level, because we use anonymous blocks quite often.
I assume if such a setting existed - we would have used it in the original scenario.

Thanks again,
-Mor

On Mon, Jun 3, 2024 at 9:12 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


po 3. 6. 2024 v 18:46 odesílatel Erik Wienhold <ewie@ewie.name> napsal:
On 2024-06-03 00:18 +0200, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > I think you just wrote the equivalent of:
> > l_cnt := (select 1 as delete from foo3 where id=1);
> > Which is a valid query.
>
> Still another example of the folly of letting AS be optional.
> I don't suppose we can ever undo that though.

How about inventing an opt-in strict mode (like in Perl or JavaScript)
that prevents certain footguns?  For example, disallowing bare column
labels.

That could be enabled for the current session or transaction:

    SET strict_parsing = { on | off };

Or just for individual routines:

    CREATE PROCEDURE myproc()
        SET strict_parsing = { on | off }
        LANGUAGE plpgsql ...


Probably it is not bad idea - it can be generally useful

But I think it is better to introduce a new entry for plpgsql expressions in gram.y.

Unfortunately it is not a compatible change. Years ago was popular to use a pattern

a := tab.a FROM tab

instead correct

a := (SELECT tab.a FROM tab)

or

SELECT tab.a FROM tab INTO a;

Regards

Pavel

I wrote an experimental patch that enforces strict syntax for PL/pgSQL expressions. This patch is a proof concept and shows impacts of the change (check-world tests passed)

Unfortunately it introduces break compatibility - with this patch the odd syntax (that is undocumented) is not supported. Something like

DECLARE _x int := x FROM foo WHERE id = _arg;

should not be written in strict mode;

This patch very well fix reported issue:

(2024-06-10 12:06:43) postgres=# CREATE TABLE foo3(id serial PRIMARY key, txt text);
CREATE TABLE
(2024-06-10 12:07:13) postgres=# INSERT INTO foo3 (txt) VALUES ('aaa'),('bbb');
INSERT 0 2
(2024-06-10 12:07:33) postgres=# DO $$
DECLARE
    l_cnt int;
BEGIN
    l_cnt := 1
    DELETE FROM foo3 WHERE id=1;
END; $$;
ERROR:  syntax error at or near "DELETE"
LINE 11:     DELETE FROM foo3 WHERE id=1;
             ^

Personally, I think it can be a strong step forward (we can use deeper integration of plpgsql with SQL parser which was not possible before).

Because it introduces a compatibility break I don't propose to use it by default. I see two possible ways, how this check can be used:

1. we can use plpgsql.extra_errors ( https://www.postgresql.org/docs/current/plpgsql-development-tips.html#PLPGSQL-EXTRA-CHECKS ) and introduce a check named (maybe) strict_expr_syntax. Because in this case the warning cannot be raised, then this check cannot be used in plpgsql.extra_warnings. I think it can work nicely.

2. if @1 will not be possible, the minimalist implementation can be based on a new public entry to SQL parser.  In this case, I can do the proposed check at least from plpgsql_check.

Do you see any other possibilities?

This patch is just a proof concept. I didn't implement any mechanism to switch from default mode to strict mode (I don't propose strict mode as default) now.

I think it can increase the robustness of plpgsql, on the other hand it introduces compatibility breaks and I understand related problems.

The change of definition of PLpgSQL_Expr has an impact on OPEN and CASE
PLpgSQL statements that use multicolumn results.

Regards

Pavel
 

 
--
Erik


Вложения

Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
Pavel Stehule
Дата:


po 10. 6. 2024 v 13:56 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:
Hi

út 4. 6. 2024 v 8:39 odesílatel Mor Lehr <mor.lehr@deel.com> napsal:
How about inventing an opt-in strict mode
 
That can be useful at the session level, because we use anonymous blocks quite often.
I assume if such a setting existed - we would have used it in the original scenario.

Thanks again,
-Mor

On Mon, Jun 3, 2024 at 9:12 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


po 3. 6. 2024 v 18:46 odesílatel Erik Wienhold <ewie@ewie.name> napsal:
On 2024-06-03 00:18 +0200, Tom Lane wrote:
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > I think you just wrote the equivalent of:
> > l_cnt := (select 1 as delete from foo3 where id=1);
> > Which is a valid query.
>
> Still another example of the folly of letting AS be optional.
> I don't suppose we can ever undo that though.

How about inventing an opt-in strict mode (like in Perl or JavaScript)
that prevents certain footguns?  For example, disallowing bare column
labels.

That could be enabled for the current session or transaction:

    SET strict_parsing = { on | off };

Or just for individual routines:

    CREATE PROCEDURE myproc()
        SET strict_parsing = { on | off }
        LANGUAGE plpgsql ...


Probably it is not bad idea - it can be generally useful

But I think it is better to introduce a new entry for plpgsql expressions in gram.y.

Unfortunately it is not a compatible change. Years ago was popular to use a pattern

a := tab.a FROM tab

instead correct

a := (SELECT tab.a FROM tab)

or

SELECT tab.a FROM tab INTO a;

Regards

Pavel

I wrote an experimental patch that enforces strict syntax for PL/pgSQL expressions. This patch is a proof concept and shows impacts of the change (check-world tests passed)

Unfortunately it introduces break compatibility - with this patch the odd syntax (that is undocumented) is not supported. Something like

DECLARE _x int := x FROM foo WHERE id = _arg;

should not be written in strict mode;

This patch very well fix reported issue:

(2024-06-10 12:06:43) postgres=# CREATE TABLE foo3(id serial PRIMARY key, txt text);
CREATE TABLE
(2024-06-10 12:07:13) postgres=# INSERT INTO foo3 (txt) VALUES ('aaa'),('bbb');
INSERT 0 2
(2024-06-10 12:07:33) postgres=# DO $$
DECLARE
    l_cnt int;
BEGIN
    l_cnt := 1
    DELETE FROM foo3 WHERE id=1;
END; $$;
ERROR:  syntax error at or near "DELETE"
LINE 11:     DELETE FROM foo3 WHERE id=1;
             ^

Personally, I think it can be a strong step forward (we can use deeper integration of plpgsql with SQL parser which was not possible before).

Because it introduces a compatibility break I don't propose to use it by default. I see two possible ways, how this check can be used:

1. we can use plpgsql.extra_errors ( https://www.postgresql.org/docs/current/plpgsql-development-tips.html#PLPGSQL-EXTRA-CHECKS ) and introduce a check named (maybe) strict_expr_syntax. Because in this case the warning cannot be raised, then this check cannot be used in plpgsql.extra_warnings. I think it can work nicely.

2. if @1 will not be possible, the minimalist implementation can be based on a new public entry to SQL parser.  In this case, I can do the proposed check at least from plpgsql_check.

Do you see any other possibilities?

This patch is just a proof concept. I didn't implement any mechanism to switch from default mode to strict mode (I don't propose strict mode as default) now.

I think it can increase the robustness of plpgsql, on the other hand it introduces compatibility breaks and I understand related problems.

The change of definition of PLpgSQL_Expr has an impact on OPEN and CASE
PLpgSQL statements that use multicolumn results.

Note - the SQL/PSM standard allow syntax

SET var = (SELECT col FROM tab) 

as shorter variant of 

SELECT col INTO var FROM tab ;

so

var := (SELECT col FROM tab)

is +/- ANSI/SQL syntax. It is not my invention. The subquery is used as a guard against returning multiple rows.

The proprietary Postgre syntax is a little bit faster - 80us x 95 us, doesn't raise an exception when returning more rows (I am not sure if this is any benefit or not - it is possibly dangerous, but it can reduce the necessity of subtransaction in some patterns). Instead of proprietary syntax, SELECT INTO can be used for these cases.

Regards

Pavel



Regards

Pavel
 

 
--
Erik


Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
Pavel Stehule
Дата:
Hi

ne 2. 6. 2024 v 18:25 odesílatel Mor Lehr <mor.lehr@deel.com> napsal:
Thanks for the reference.
We learn new stuff every day.

You can close the case.
Thanks, Mor


plpgsql_check can now detect this issue


Regards

Pavel


On Sun, Jun 2, 2024, 18:31 David G. Johnston <david.g.johnston@gmail.com> wrote:
On Sunday, June 2, 2024, Mor Lehr <mor.lehr@deel.com> wrote:
Thanks for the prompt reply.
Can you please refer me to the section in the documentation that describes this behavior?
This (automatically interperting 1 as select 1) is totally an unexpected behavior for me.


“As explained previously, the expression in such a statement is evaluated by means of an SQL SELECT command sent to the main database engine.”

David J.

Re: Missing semicolumn in anonymous plpgsql block does not raise syntax error

От
Mor Lehr
Дата:
Thanks for not letting this be forgotten :)
Unfortunately, plpgsql_check does not help me because it's not supported on AWS.
I also saw you wrote a patch which is still pending, I will probably have to wait for that.

Thanks again,
Mor

On Thu, Feb 6, 2025 at 7:48 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

ne 2. 6. 2024 v 18:25 odesílatel Mor Lehr <mor.lehr@deel.com> napsal:
Thanks for the reference.
We learn new stuff every day.

You can close the case.
Thanks, Mor


plpgsql_check can now detect this issue


Regards

Pavel


On Sun, Jun 2, 2024, 18:31 David G. Johnston <david.g.johnston@gmail.com> wrote:
On Sunday, June 2, 2024, Mor Lehr <mor.lehr@deel.com> wrote:
Thanks for the prompt reply.
Can you please refer me to the section in the documentation that describes this behavior?
This (automatically interperting 1 as select 1) is totally an unexpected behavior for me.


“As explained previously, the expression in such a statement is evaluated by means of an SQL SELECT command sent to the main database engine.”

David J.