Обсуждение: Avoid corrupting DefElem nodes when parsing publication_names and publish options

Поиск
Список
Период
Сортировка

Avoid corrupting DefElem nodes when parsing publication_names and publish options

От
sunil s
Дата:
Hello Hackers,

I noticed that two call sites of SplitIdentifierString() were not
following the API contract documented in varlena.c, which states:
  • rawstring: the input string; must be overwritable! On return, it's
               been modified to contain the separated identifiers.
  • namelist: filled with a palloc'd list of pointers to identifiers within
              rawstring.

The function modifies the input string in-place (replacing separators
with null terminators) and returns a list containing pointers directly
into that modified string.
The two problematic call sites were:
1. parse_publication_options() in publicationcmds.c - passed the result
   of defGetString() directly without copying.
2. pgoutput_startup() in pgoutput.c - passed strVal(defel->arg) directly
   without copying.
All other callers of SplitIdentifierString() in the codebase correctly
use pstrdup() or equivalent (like text_to_cstring()) before calling
the function.
While these two cases happen to work in practice because:
- In publicationcmds.c, the parsed list is used immediately within the
  same loop iteration and then discarded
- In pgoutput.c, the DefElem nodes remain valid for the connection lifetime
...it's still incorrect to rely on this, and the code would break if
someone later tried to use the results after freeing the options list,
or if the memory management changed.
The attached patch adds pstrdup() to both call sites for consistency
and correctness.

Thanks,
Sunil Seetharama
Вложения

Re: Avoid corrupting DefElem nodes when parsing publication_names and publish options

От
sunil s
Дата:
Here is the reproduction of this issue.

As per the official documentation, creating a publication with the following syntax will corrupt the option list('insert, update, delete')
>  CREATE PUBLICATION mypublication FOR ALL TABLES WITH (publish ='insert, update, delete');

By attaching a debugger to parse_publication_options(), we can verify that the option list is modified after the call to splitIdentifierString().

NOTE: Using double quotes (" "), the functionality works correctly.

Thanks & Regards,
Sunil S

Re:Avoid corrupting DefElem nodes when parsing publication_names and publish options

От
"zengman"
Дата:
Hi,

I have checked the relevant invocations of `SplitIdentifierString`, 
and there are indeed issues with these two places – presumably an oversight during the initial addition. 
I approve of your modification; perhaps you could set the status to `Ready for Committer`.

--
Regards,
Man Zeng
www.openhalo.org

Re: Avoid corrupting DefElem nodes when parsing publication_names and publish options

От
Henson Choi
Дата:
Analysis of CF 6339: DefElem corruption in publication option parsing

The original report showed that DefElem is corrupted after SplitIdentifierString()
using a debugger. I looked further to see if this corruption actually matters,
and found that the publicationcmds.c case is a real bug—not just an API
contract violation.


The problem
-----------

The original commit message says both cases "happen to work in practice".
This is true for pgoutput.c, but not for publicationcmds.c.

In publicationcmds.c, after parse_publication_options() corrupts the DefElem
string, EventTriggerCollectSimpleCommand() copies the statement using
copyObject(). Since copyObject() uses pstrdup() internally, it only copies
up to the first NULL byte—losing everything after the first comma.

Here's the call sequence in AlterPublicationOptions():

    parse_publication_options()    <- corrupts defel->arg: "insert\0update\0delete"
        ...
    EventTriggerCollectSimpleCommand(stmt)  <- copyObject(stmt) sees truncated string

Event trigger functions then receive "insert" instead of "insert,update,delete".


Memory lifetime
---------------

publicationcmds.c:
- DefElem: allocated in MessageContext, freed when command completes
- Copy: allocated in EventTriggerContext by copyObject(), freed when trigger completes
- Corruption happens before copyObject(), so the copy is already truncated

pgoutput.c:
- DefElem: allocated in cmd_context (child of TopMemoryContext)
- cmd_context is not freed until walsender exits, but the allocation is small
  and happens only once per session, so no memory leak concern
- DefElem itself is never accessed after parsing, so no problem


Reproducing the bug
-------------------

I wrote a small extension that demonstrates this. It provides two functions:
get_publish_option_with_bug() simulates the corruption, get_publish_option_fixed()
shows the correct behavior.

See attached bug_simulator.tgz.

To test:

    $ tar xzf bug_simulator.tgz
    $ cd bug_simulator
    $ make && sudo make install

    $ initdb -D /tmp/pgdata
    $ pg_ctl -D /tmp/pgdata start
    $ psql -d postgres

    CREATE EXTENSION bug_simulator;

    CREATE TABLE bug_test_log (
        id SERIAL PRIMARY KEY,
        command_tag TEXT,
        buggy_result TEXT,
        fixed_result TEXT
    );

    CREATE TABLE test_table (id INT PRIMARY KEY);

    CREATE FUNCTION test_bug_behavior()
    RETURNS event_trigger LANGUAGE plpgsql AS $$
    DECLARE obj RECORD;
    BEGIN
        FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
        LOOP
            INSERT INTO bug_test_log (command_tag, buggy_result, fixed_result)
            VALUES (
                obj.command_tag,
                get_publish_option_with_bug(obj.command),
                get_publish_option_fixed(obj.command)
            );
        END LOOP;
    END;
    $$;

    CREATE EVENT TRIGGER bug_test_trigger ON ddl_command_end
        WHEN TAG IN ('CREATE PUBLICATION', 'ALTER PUBLICATION')
        EXECUTE FUNCTION test_bug_behavior();

    CREATE PUBLICATION test_pub FOR TABLE test_table
        WITH (publish = 'insert,update,delete');

    ALTER PUBLICATION test_pub SET (publish = 'insert,update');

    SELECT command_tag, buggy_result, fixed_result FROM bug_test_log;

Result:

        command_tag     | buggy_result |    fixed_result
    --------------------+--------------+----------------------
     CREATE PUBLICATION | insert       | insert,update,delete
     ALTER PUBLICATION  | insert       | insert,update


Conclusion
----------

- publicationcmds.c: real bug, affects event triggers
- pgoutput.c: harmless, just API cleanup

The bug in publicationcmds.c is minor in practice—it only affects users who
have event triggers on publication DDL and use extensions that extract option
values from pg_ddl_command. This is an uncommon case, but it is a real bug
nonetheless.

The patch should be applied.

2025년 12월 24일 (수) PM 8:42, sunil s <sunilfeb26@gmail.com>님이 작성:
Here is the reproduction of this issue.

As per the official documentation, creating a publication with the following syntax will corrupt the option list('insert, update, delete')
>  CREATE PUBLICATION mypublication FOR ALL TABLES WITH (publish ='insert, update, delete');

By attaching a debugger to parse_publication_options(), we can verify that the option list is modified after the call to splitIdentifierString().

NOTE: Using double quotes (" "), the functionality works correctly.

Thanks & Regards,
Sunil S
Вложения

Re: Avoid corrupting DefElem nodes when parsing publication_names and publish options

От
Henson Choi
Дата:
Re: Analysis of CF 6339: DefElem corruption in publication option parsing

My earlier analysis used incorrect debugging data. Here is a corrected version.

The original report showed that DefElem is corrupted after SplitIdentifierString()
using a debugger. I looked further to see if this corruption actually matters,
and found that the publicationcmds.c case is a real bug—not just an API
contract violation.


The problem
-----------

The original commit message says both cases "happen to work in practice".
This is true for pgoutput.c, but not for publicationcmds.c.

In publicationcmds.c, after parse_publication_options() corrupts the DefElem
string, EventTriggerCollectSimpleCommand() copies the statement using
copyObject(). Since copyObject() uses pstrdup() internally, it only copies
up to the first NULL byte—losing everything after the first comma.

Here's the call sequence in AlterPublicationOptions():

    parse_publication_options()    <- corrupts defel->arg: "insert\0update\0delete"
        ...
    EventTriggerCollectSimpleCommand(stmt)  <- copyObject(stmt) sees truncated string

Event trigger functions then receive "insert" instead of "insert,update,delete".


Memory lifetime
---------------

publicationcmds.c:
- DefElem: allocated in MessageContext, freed when command completes
- Copy: allocated in EventTriggerContext by copyObject(), freed when trigger completes
- Corruption happens before copyObject(), so the copy is already truncated

pgoutput.c:
- DefElem: allocated in cmd_context (child of TopMemoryContext)
- cmd_context is not freed until walsender exits, but the allocation is small
  and happens only once per session, so no memory leak concern
- DefElem itself is never accessed after parsing, so no problem


Reproducing the bug
-------------------

I wrote a small extension that reads the publish option from pg_ddl_command
in an event trigger. On unpatched servers, the value is truncated.

See attached bug_simulator.tgz.

To test on an unpatched server:

    $ tar xzf bug_simulator.tgz
    $ cd bug_simulator
    $ make && sudo make install

    $ psql -d postgres

    CREATE EXTENSION bug_simulator;

    CREATE TABLE test_log (id SERIAL, command_tag TEXT, publish_value TEXT);
    CREATE TABLE test_table (id INT PRIMARY KEY);

    CREATE FUNCTION log_publish_option()
    RETURNS event_trigger LANGUAGE plpgsql AS $$
    DECLARE obj RECORD;
    BEGIN
      FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        INSERT INTO test_log (command_tag, publish_value)
        VALUES (obj.command_tag, get_publish_option(obj.command));
      END LOOP;
    END;
    $$;

    CREATE EVENT TRIGGER pub_trigger ON ddl_command_end
        WHEN TAG IN ('CREATE PUBLICATION', 'ALTER PUBLICATION')
        EXECUTE FUNCTION log_publish_option();

    CREATE PUBLICATION test_pub FOR TABLE test_table
        WITH (publish = 'insert,update,delete');

    SELECT command_tag, publish_value FROM test_log;

Result on unpatched server:

        command_tag     | publish_value
    --------------------+---------------
     CREATE PUBLICATION | insert

Result on patched server:

        command_tag     | publish_value
    --------------------+-----------------------
     CREATE PUBLICATION | insert,update,delete


Conclusion
----------

- publicationcmds.c: real bug, affects event triggers
- pgoutput.c: harmless, just API cleanup

The bug in publicationcmds.c is minor in practice—it only affects users who
have event triggers on publication DDL and use extensions that extract option
values from pg_ddl_command. This is an uncommon case, but it is a real bug
nonetheless.

The patch should be applied.

2025년 12월 28일 (일) PM 9:18, Henson Choi <assam258@gmail.com>님이 작성:
Analysis of CF 6339: DefElem corruption in publication option parsing

The original report showed that DefElem is corrupted after SplitIdentifierString()
using a debugger. I looked further to see if this corruption actually matters,
and found that the publicationcmds.c case is a real bug—not just an API
contract violation.


The problem
-----------

The original commit message says both cases "happen to work in practice".
This is true for pgoutput.c, but not for publicationcmds.c.

In publicationcmds.c, after parse_publication_options() corrupts the DefElem
string, EventTriggerCollectSimpleCommand() copies the statement using
copyObject(). Since copyObject() uses pstrdup() internally, it only copies
up to the first NULL byte—losing everything after the first comma.

Here's the call sequence in AlterPublicationOptions():

    parse_publication_options()    <- corrupts defel->arg: "insert\0update\0delete"
        ...
    EventTriggerCollectSimpleCommand(stmt)  <- copyObject(stmt) sees truncated string

Event trigger functions then receive "insert" instead of "insert,update,delete".


Memory lifetime
---------------

publicationcmds.c:
- DefElem: allocated in MessageContext, freed when command completes
- Copy: allocated in EventTriggerContext by copyObject(), freed when trigger completes
- Corruption happens before copyObject(), so the copy is already truncated

pgoutput.c:
- DefElem: allocated in cmd_context (child of TopMemoryContext)
- cmd_context is not freed until walsender exits, but the allocation is small
  and happens only once per session, so no memory leak concern
- DefElem itself is never accessed after parsing, so no problem


Reproducing the bug
-------------------

I wrote a small extension that demonstrates this. It provides two functions:
get_publish_option_with_bug() simulates the corruption, get_publish_option_fixed()
shows the correct behavior.

See attached bug_simulator.tgz.

To test:

    $ tar xzf bug_simulator.tgz
    $ cd bug_simulator
    $ make && sudo make install

    $ initdb -D /tmp/pgdata
    $ pg_ctl -D /tmp/pgdata start
    $ psql -d postgres

    CREATE EXTENSION bug_simulator;

    CREATE TABLE bug_test_log (
        id SERIAL PRIMARY KEY,
        command_tag TEXT,
        buggy_result TEXT,
        fixed_result TEXT
    );

    CREATE TABLE test_table (id INT PRIMARY KEY);

    CREATE FUNCTION test_bug_behavior()
    RETURNS event_trigger LANGUAGE plpgsql AS $$
    DECLARE obj RECORD;
    BEGIN
        FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
        LOOP
            INSERT INTO bug_test_log (command_tag, buggy_result, fixed_result)
            VALUES (
                obj.command_tag,
                get_publish_option_with_bug(obj.command),
                get_publish_option_fixed(obj.command)
            );
        END LOOP;
    END;
    $$;

    CREATE EVENT TRIGGER bug_test_trigger ON ddl_command_end
        WHEN TAG IN ('CREATE PUBLICATION', 'ALTER PUBLICATION')
        EXECUTE FUNCTION test_bug_behavior();

    CREATE PUBLICATION test_pub FOR TABLE test_table
        WITH (publish = 'insert,update,delete');

    ALTER PUBLICATION test_pub SET (publish = 'insert,update');

    SELECT command_tag, buggy_result, fixed_result FROM bug_test_log;

Result:

        command_tag     | buggy_result |    fixed_result
    --------------------+--------------+----------------------
     CREATE PUBLICATION | insert       | insert,update,delete
     ALTER PUBLICATION  | insert       | insert,update


Conclusion
----------

- publicationcmds.c: real bug, affects event triggers
- pgoutput.c: harmless, just API cleanup

The bug in publicationcmds.c is minor in practice—it only affects users who
have event triggers on publication DDL and use extensions that extract option
values from pg_ddl_command. This is an uncommon case, but it is a real bug
nonetheless.

The patch should be applied.

2025년 12월 24일 (수) PM 8:42, sunil s <sunilfeb26@gmail.com>님이 작성:
Here is the reproduction of this issue.

As per the official documentation, creating a publication with the following syntax will corrupt the option list('insert, update, delete')
>  CREATE PUBLICATION mypublication FOR ALL TABLES WITH (publish ='insert, update, delete');

By attaching a debugger to parse_publication_options(), we can verify that the option list is modified after the call to splitIdentifierString().

NOTE: Using double quotes (" "), the functionality works correctly.

Thanks & Regards,
Sunil S
Вложения