logical decoding issue with concurrent ALTER TYPE

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема logical decoding issue with concurrent ALTER TYPE
Дата
Msg-id CAD21AoAenVqiMjpN-PvGHL1N9DWnHSq673bfgr6phmBUzx=kLQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: logical decoding issue with concurrent ALTER TYPE  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
Hi all,

A colleague Drew Callahan (in CC) has discovered that the logical
decoding doesn't handle syscache invalidation messages properly that
are generated by other transactions. Here is example (I've attached a
patch for isolation test),

-- Setup
CREATE TYPE comp AS (f1 int, f2 text);
CREATE TABLE tbl3(val1 int, val2 comp);
SELECT pg_create_logical_replication_slot('s', 'test_decoding');

-- Session 1
BEGIN;
INSERT INTO tbl3 (val1, val2) VALUES (1, (1, '2'));

    -- Session 2
    ALTER TYPE comp ADD ATTRIBUTE f3 int;

INSERT INTO tbl3 (val1, val2) VALUES (1, (1, '2', 3));
COMMIT;

pg_logical_slot_get_changes() returns:

BEGIN
table public.tbl3: INSERT: val1[integer]:1 val2[comp]:'(1,2)'
table public.tbl3: INSERT: val1[integer]:1 val2[comp]:'(1,2)'
COMMIT

However, the logical decoding should reflect the result of ALTER TYPE
and the val2 of the second INSERT output should be '(1,2,3)'.

The root cause of this behavior is that while ALTER TYPE can be run
concurrently to INSERT, the logical decoding doesn't handle cache
invalidation properly, and it got a cache hit of stale data (of
typecache in this case). Unlike snapshots that are stored in the
transaction’s reorder buffer changes, the invalidation messages of
other transactions are not distributed. As a result, the snapshot
becomes moot when we get a cache hit of stale data due to not
processing the invalidation messages again. This is not an issue for
ALTER TABLE and the like due to 2 phase locking and taking an
AccessExclusive lock. The issue with DMLs and ALTER TYPE has been
discovered but there might be other similar cases.

As far as I tested, this issue has existed since v9.4, where the
logical decoding was introduced, so it seems to be a long-standing
issue.

The simplest fix would be to invalidate all caches when setting a new
historical snapshot (i.e. applying CHANGE_INTERNAL_SNAPSHOT) but we
end up invalidating other caches unnecessarily too.

A better fix would be that when decoding the commit of a catalog
changing transaction, we distribute invalidation messages to other
concurrent transactions, like we do for snapshots. But we might not
need to distribute all types of invalidation messages, though.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Julien Rouhaud
Дата:
Сообщение: Re: How to add a new operator for parser?
Следующее
От: jacktby jacktby
Дата:
Сообщение: Re: How to add a new operator for parser?