Обсуждение: BUG #17231: ERROR: tuple concurrently updated

Поиск
Список
Период
Сортировка

BUG #17231: ERROR: tuple concurrently updated

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      17231
Logged by:          Eugene Morozov
Email address:      jmv@emorozov.net
PostgreSQL version: 14.0
Operating system:   Arch Linux Linux 5.14.12-arch1-1 x86_64
Description:

Hello,
I'm working on an application that interacts with PostgreSQL in a very
peculiar way, but anyway, I think this shouldn't lead to errors like this. I
managed to reproduce it reliably on PostgreSQL 12.4, 13.4 and 14.0 using the
following steps:
1. Create some database (jmv in this case) and populate it using the
following statements:

create table a(id serial primary key, created timestamp default
current_timestamp, val integer);
create table b(id serial primary key, a_id integer references a(id), val
integer);
insert into a(val) select generate_series(0, 600000);
insert into b(a_id, val) select generate_series(1, 600000),
generate_series(300000, 900000);
create materialized view ab as select a.id as aid, b.id as bid, b.val as val
from a join b on b.a_id = a.id;
create unique index unique_aid_idx on ab(aid);

2. Create 3 roles: jmv, django, web_admin

3. Create two python scripts:

# crash.py
import psycopg2

conn = psycopg2.connect("dbname='jmv' user='jmv' host='/tmp'")

for i in range(10000):
    with conn.cursor() as c:
        c.execute('REFRESH MATERIALIZED VIEW CONCURRENTLY ab')
        conn.commit()
conn.close()

# crash1.py
import psycopg2

conn = psycopg2.connect("dbname='jmv' user='jmv' host='/tmp'")

for i in range(10000):
    with conn.cursor() as c:
        c.execute('GRANT ALL ON ALL TABLES IN SCHEMA public TO web_admin')
        c.execute('GRANT ALL ON ALL TABLES IN SCHEMA public TO django')
        conn.commit()
conn.close()

4. Run both python scripts simultaneously.

Now, one or another script will soon fail with the 'tuple concurrently
updated' error.

I've managed to compile PosgreSQL 14.0 with debug information and get
tracebacks from both backends when this happens:

Breakpoint 1, simple_heap_update (relation=relation@entry=0x7f544eb41e08,
otid=otid@entry=0x5595c1176a9c, tup=tup@entry=0x5595c1176a98) at
heapam.c:4160
4160    {
(gdb) bt
#0  simple_heap_update (relation=relation@entry=0x7f544eb41e08,
otid=otid@entry=0x5595c1176a9c, tup=tup@entry=0x5595c1176a98) at
heapam.c:4160
#1  0x00005595bee8deac in CatalogTupleUpdate (heapRel=0x7f544eb41e08,
otid=0x5595c1176a9c, tup=0x5595c1176a98) at indexing.c:309
#2  0x00005595bee7d3bb in ExecGrant_Relation (istmt=0x7ffdfc9c9540) at
aclchk.c:1997
#3  0x00005595bee7dc8b in ExecGrantStmt_oids (istmt=0x7ffdfc9c9540) at
aclchk.c:570
#4  0x00005595bee7f669 in ExecuteGrantStmt (stmt=stmt@entry=0x5595c1153130)
at aclchk.c:555
#5  0x00005595bf1015ce in ProcessUtilitySlow (pstate=0x5595c1174300,
pstmt=0x5595c11534a0,
    queryString=0x5595c1152570 "GRANT ALL ON ALL TABLES IN SCHEMA public TO
web_admin", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
qc=0x7ffdfc9c9bf0,
    dest=<optimized out>) at utility.c:1788
#6  0x00005595bf100393 in standard_ProcessUtility (pstmt=0x5595c11534a0,
queryString=0x5595c1152570 "GRANT ALL ON ALL TABLES IN SCHEMA public TO
web_admin",
    readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL,
params=0x0, queryEnv=0x0, dest=0x5595c1153590, qc=0x7ffdfc9c9bf0) at
utility.c:1066
#7  0x00005595bf0feac1 in PortalRunUtility
(portal=portal@entry=0x5595c11b46c0, pstmt=pstmt@entry=0x5595c11534a0,
isTopLevel=isTopLevel@entry=true,
    setHoldSnapshot=setHoldSnapshot@entry=false,
dest=dest@entry=0x5595c1153590, qc=qc@entry=0x7ffdfc9c9bf0) at
pquery.c:1147
#8  0x00005595bf0febfd in PortalRunMulti
(portal=portal@entry=0x5595c11b46c0, isTopLevel=isTopLevel@entry=true,
setHoldSnapshot=setHoldSnapshot@entry=false,
    dest=dest@entry=0x5595c1153590, altdest=altdest@entry=0x5595c1153590,
qc=qc@entry=0x7ffdfc9c9bf0) at pquery.c:1304
#9  0x00005595bf0ff291 in PortalRun (portal=portal@entry=0x5595c11b46c0,
count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true,
    run_once=run_once@entry=true, dest=dest@entry=0x5595c1153590,
altdest=altdest@entry=0x5595c1153590, qc=0x7ffdfc9c9bf0) at pquery.c:786
#10 0x00005595bf0fb01b in exec_simple_query (query_string=0x5595c1152570
"GRANT ALL ON ALL TABLES IN SCHEMA public TO web_admin") at
postgres.c:1214
#11 0x00005595bf0fcc1d in PostgresMain (argc=argc@entry=1,
argv=argv@entry=0x7ffdfc9ca060, dbname=<optimized out>, username=<optimized
out>) at postgres.c:4486
#12 0x00005595bf069f0d in BackendRun (port=0x5595c1175cd0,
port=0x5595c1175cd0) at postmaster.c:4506
#13 BackendStartup (port=0x5595c1175cd0) at postmaster.c:4228
#14 ServerLoop () at postmaster.c:1745
#15 0x00005595bf06aeb0 in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x5595c114cd10) at postmaster.c:1417
#16 0x00005595beda5b8e in main (argc=3, argv=0x5595c114cd10) at main.c:209

Breakpoint 3, simple_heap_update (relation=relation@entry=0x7f544eb75ef0,
otid=otid@entry=0x5595c11767cc, tup=tup@entry=0x5595c11767c8) at
heapam.c:4160
4160    {
(gdb) bt
#0  simple_heap_update (relation=relation@entry=0x7f544eb75ef0,
otid=otid@entry=0x5595c11767cc, tup=tup@entry=0x5595c11767c8) at
heapam.c:4160
#1  0x00005595bee8deac in CatalogTupleUpdate (heapRel=0x7f544eb75ef0,
otid=0x5595c11767cc, tup=0x5595c11767c8) at indexing.c:309
#2  0x00005595bef18948 in SetMatViewPopulatedState (relation=0x7f544eac2bd8,
newstate=<optimized out>) at matview.c:105
#3  0x00005595bef18ad0 in ExecRefreshMatView
(stmt=stmt@entry=0x5595c1153000,
    queryString=queryString@entry=0x5595c1152570 "REFRESH MATERIALIZED VIEW
CONCURRENTLY ab", params=params@entry=0x0,
    qc=qc@entry=0x7ffdfc9c9bf0) at matview.c:269
#4  0x00005595bf10132e in ProcessUtilitySlow (pstate=0x5595c1174300,
pstmt=0x5595c1153340,
    queryString=0x5595c1152570 "REFRESH MATERIALIZED VIEW CONCURRENTLY ab",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
    qc=0x7ffdfc9c9bf0, dest=<optimized out>) at utility.c:1675
#5  0x00005595bf100393 in standard_ProcessUtility (pstmt=0x5595c1153340,
    queryString=0x5595c1152570 "REFRESH MATERIALIZED VIEW CONCURRENTLY ab",
readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL,
    params=0x0, queryEnv=0x0, dest=0x5595c1153430, qc=0x7ffdfc9c9bf0) at
utility.c:1066
#6  0x00005595bf0feac1 in PortalRunUtility
(portal=portal@entry=0x5595c11b46c0, pstmt=pstmt@entry=0x5595c1153340,
    isTopLevel=isTopLevel@entry=true,
setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x5595c1153430,
qc=qc@entry=0x7ffdfc9c9bf0)
    at pquery.c:1147
#7  0x00005595bf0febfd in PortalRunMulti
(portal=portal@entry=0x5595c11b46c0, isTopLevel=isTopLevel@entry=true,
    setHoldSnapshot=setHoldSnapshot@entry=false,
dest=dest@entry=0x5595c1153430, altdest=altdest@entry=0x5595c1153430,
    qc=qc@entry=0x7ffdfc9c9bf0) at pquery.c:1304
#8  0x00005595bf0ff291 in PortalRun (portal=portal@entry=0x5595c11b46c0,
count=count@entry=9223372036854775807,
    isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true,
dest=dest@entry=0x5595c1153430, altdest=altdest@entry=0x5595c1153430,
    qc=0x7ffdfc9c9bf0) at pquery.c:786
#9  0x00005595bf0fb01b in exec_simple_query (query_string=0x5595c1152570
"REFRESH MATERIALIZED VIEW CONCURRENTLY ab") at postgres.c:1214
#10 0x00005595bf0fcc1d in PostgresMain (argc=argc@entry=1,
argv=argv@entry=0x7ffdfc9ca060, dbname=<optimized out>, username=<optimized
out>)
    at postgres.c:4486
#11 0x00005595bf069f0d in BackendRun (port=0x5595c1173b00,
port=0x5595c1173b00) at postmaster.c:4506
#12 BackendStartup (port=0x5595c1173b00) at postmaster.c:4228
#13 ServerLoop () at postmaster.c:1745
#14 0x00005595bf06aeb0 in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x5595c114cd10) at postmaster.c:1417
#15 0x00005595beda5b8e in main (argc=3, argv=0x5595c114cd10) at main.c:209


Yes, the code is a bit weird, but I would expect a readable error or
deadlock, not the 'tuple concurrently updated' error that looks like
PostgreSQL internal error

Thank you,
Eugene


Re: BUG #17231: ERROR: tuple concurrently updated

От
Tom Lane
Дата:
PG Bug reporting form <noreply@postgresql.org> writes:
> [ this: ]
>         c.execute('REFRESH MATERIALIZED VIEW CONCURRENTLY ab')
> [ concurrently with this: ]
>         c.execute('GRANT ALL ON ALL TABLES IN SCHEMA public TO web_admin')
> [ results in "tuple concurrently updated" ]

Yeah, that's not very surprising, because both commands need to update
ab's pg_class row.

We could only really prevent this by having GRANT take an exclusive
lock on the object(s) it's altering the ACLs of.  That cure would
likely be worse than the disease, especially for variants like
GRANT ON ALL TABLES, which'd need to ex-lock quite a lot of objects,
probably causing deadlock failures.  So I'm not sure that there's
any good way to improve this.

(There are, BTW, probably plenty of ways to cause this type of
failure when you do concurrent DDL commands on a single object.)

            regards, tom lane