Обсуждение: Major Version Upgrade failure due to orphan roles entries in catalog
Hi,
We have identified an issue causing upgrade failures. The following steps detail how to reproduce the issue:
Create an orphan role entry
/* Postgres version:: PostgreSQL 16.6 */
/* The same can be reproduced in version 17 as well */
create role my_group;
create role dropped_member;
begin;
grant my_group to dropped_member;
OTHER SESSION: drop role dropped_member;
BACK IN ORIGINAL SESSION:
commit;
create role dropped_member;
begin;
grant my_group to dropped_member;
OTHER SESSION: drop role dropped_member;
BACK IN ORIGINAL SESSION:
commit;
Upgrade to Postgres v17
And the upgrade fails with an error :
GRANT "my_group" TO "" WITH INHERIT TRUE GRANTED BY "postgres";
ERROR: zero-length delimited identifier at or near """"
ERROR: zero-length delimited identifier at or near """"
The issue seems to be coming from pg_dumpall for building grants during pg_upgrade.
-Virender
On Tue, 2025-02-11 at 15:32 +0530, Virender Singla wrote: > We have identified an issue causing upgrade failures. The following steps detail how to reproduce the issue: > > Create an orphan role entry > > /* Postgres version:: PostgreSQL 16.6 */ > /* The same can be reproduced in version 17 as well */ > > create role my_group; > create role dropped_member; > begin; > grant my_group to dropped_member; > OTHER SESSION: drop role dropped_member; > BACK IN ORIGINAL SESSION: > commit; > > Upgrade to Postgres v17 > > And the upgrade fails with an error : > > GRANT "my_group" TO "" WITH INHERIT TRUE GRANTED BY "postgres"; > ERROR: zero-length delimited identifier at or near """" > > The issue seems to be coming from pg_dumpall for building grants during pg_upgrade. > > https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_dumpall.c#L992 I agree that that is a bug. I guess the GRANT statement should put a FOR KEY SHARE lock on the pg_authid row for "dropped_member", similar to what we do for foreign keys. Yours, Laurenz Albe -- *E-Mail Disclaimer* Der Inhalt dieser E-Mail ist ausschliesslich fuer den bezeichneten Adressaten bestimmt. Wenn Sie nicht der vorgesehene Adressat dieser E-Mail oder dessen Vertreter sein sollten, so beachten Sie bitte, dass jede Form der Kenntnisnahme, Veroeffentlichung, Vervielfaeltigung oder Weitergabe des Inhalts dieser E-Mail unzulaessig ist. Wir bitten Sie, sich in diesem Fall mit dem Absender der E-Mail in Verbindung zu setzen. *CONFIDENTIALITY NOTICE & DISCLAIMER *This message and any attachment are confidential and may be privileged or otherwise protected from disclosure and solely for the use of the person(s) or entity to whom it is intended. If you have received this message in error and are not the intended recipient, please notify the sender immediately and delete this message and any attachment from your system. If you are not the intended recipient, be advised that any use of this message is prohibited and may be unlawful, and you must not copy this message or attachment or disclose the contents to any other person.
BTW a similar variant has been fixed in PG 16 but does not fix the above case.
create role dropped_group;
create role member;
begin;
grant dropped_group to member;
OTHER SESSION: drop role dropped_group;
BACK IN ORIGINAL SESSION:
commit;
On Tue, Feb 11, 2025 at 7:50 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2025-02-11 at 15:32 +0530, Virender Singla wrote:
> We have identified an issue causing upgrade failures. The following steps detail how to reproduce the issue:
>
> Create an orphan role entry
>
> /* Postgres version:: PostgreSQL 16.6 */
> /* The same can be reproduced in version 17 as well */
>
> create role my_group;
> create role dropped_member;
> begin;
> grant my_group to dropped_member;
> OTHER SESSION: drop role dropped_member;
> BACK IN ORIGINAL SESSION:
> commit;
>
> Upgrade to Postgres v17
>
> And the upgrade fails with an error :
>
> GRANT "my_group" TO "" WITH INHERIT TRUE GRANTED BY "postgres";
> ERROR: zero-length delimited identifier at or near """"
>
> The issue seems to be coming from pg_dumpall for building grants during pg_upgrade.
>
> https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_dumpall.c#L992
I agree that that is a bug.
I guess the GRANT statement should put a FOR KEY SHARE lock on the pg_authid row
for "dropped_member", similar to what we do for foreign keys.
Yours,
Laurenz Albe
--
*E-Mail Disclaimer*
Der Inhalt dieser E-Mail ist ausschliesslich fuer den
bezeichneten Adressaten bestimmt. Wenn Sie nicht der vorgesehene Adressat
dieser E-Mail oder dessen Vertreter sein sollten, so beachten Sie bitte,
dass jede Form der Kenntnisnahme, Veroeffentlichung, Vervielfaeltigung oder
Weitergabe des Inhalts dieser E-Mail unzulaessig ist. Wir bitten Sie, sich
in diesem Fall mit dem Absender der E-Mail in Verbindung zu setzen.
*CONFIDENTIALITY NOTICE & DISCLAIMER
*This message and any attachment are
confidential and may be privileged or otherwise protected from disclosure
and solely for the use of the person(s) or entity to whom it is intended.
If you have received this message in error and are not the intended
recipient, please notify the sender immediately and delete this message and
any attachment from your system. If you are not the intended recipient, be
advised that any use of this message is prohibited and may be unlawful, and
you must not copy this message or attachment or disclose the contents to
any other person.
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Tue, 2025-02-11 at 15:32 +0530, Virender Singla wrote:
>> /* Postgres version:: PostgreSQL 16.6 */
>> /* The same can be reproduced in version 17 as well */
>>
>> create role my_group;
>> create role dropped_member;
>> begin;
>> grant my_group to dropped_member;
>> OTHER SESSION: drop role dropped_member;
>> BACK IN ORIGINAL SESSION:
>> commit;
[ leaves a dangling pg_auth_members entry behind ]
> I guess the GRANT statement should put a FOR KEY SHARE lock on the pg_authid row
> for "dropped_member", similar to what we do for foreign keys.
There is a lock taken already, which is why the DROP ROLE blocks.
What there is not is a recheck that the role still exists once
we have its lock.
I poked into this, and in code terms it seems like the problem is
that AddRoleMems adds a pg_auth_members entry but makes it depend
on only one of the three roles listed in the entry. That seems
wrong on its face. The connection to this problem is that the
dependency-adding code would take care of the lock+recheck, if
it only knew that the pg_auth_members entry ought to be treated
as depending on dropped_member.
The code the attached patch is touching is from commit 6566133c5.
I'm not going to blame the bug on that commit, because before that
we had dependencies on *zero* of the three roles; at least it added
one on the grantor. But I'm wondering whether Robert thought about
the other two roles and explicitly rejected making dependencies
for them, and if so why.
In addition to fixing the described race condition, this patch
means that DROP OWNED BY on either the granted role or the member
will drop the grant. This seems consistent with the goals of
6566133c5 and with our general approach to dropping privileges
during DROP OWNED BY; but it didn't happen that way before.
I'm wondering a little bit if we could simplify/remove some of the
code in user.c by relying on processing of these dependency entries
to carry the load instead during DROP ROLE. But it passes check-world
as-is, so maybe leaving well enough alone is best.
regards, tom lane
diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index 0db174e6f1..0c84886e82 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -30,6 +30,7 @@
#include "commands/defrem.h"
#include "commands/seclabel.h"
#include "commands/user.h"
+#include "lib/qunique.h"
#include "libpq/crypt.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
@@ -489,7 +490,7 @@ CreateRole(ParseState *pstate, CreateRoleStmt *stmt)
* Advance command counter so we can see new record; else tests in
* AddRoleMems may fail.
*/
- if (addroleto || adminmembers || rolemembers)
+ if (addroleto || adminmembers || rolemembers || !superuser())
CommandCounterIncrement();
/* Default grant. */
@@ -1904,7 +1905,8 @@ AddRoleMems(Oid currentUserId, const char *rolename, Oid roleid,
else
{
Oid objectId;
- Oid *newmembers = palloc(sizeof(Oid));
+ Oid *newmembers = (Oid *) palloc(3 * sizeof(Oid));
+ int nnewmembers;
/*
* The values for these options can be taken directly from 'popt'.
@@ -1946,12 +1948,22 @@ AddRoleMems(Oid currentUserId, const char *rolename, Oid roleid,
new_record, new_record_nulls);
CatalogTupleInsert(pg_authmem_rel, tuple);
- /* updateAclDependencies wants to pfree array inputs */
- newmembers[0] = grantorId;
+ /*
+ * Record dependencies on the roleid, member, and grantor, as if a
+ * pg_auth_members entry were an object ACL.
+ * updateAclDependencies() requires an input array that is
+ * palloc'd (it will free it), sorted, and de-duped.
+ */
+ newmembers[0] = roleid;
+ newmembers[1] = memberid;
+ newmembers[2] = grantorId;
+ qsort(newmembers, 3, sizeof(Oid), oid_cmp);
+ nnewmembers = qunique(newmembers, 3, sizeof(Oid), oid_cmp);
+
updateAclDependencies(AuthMemRelationId, objectId,
0, InvalidOid,
0, NULL,
- 1, newmembers);
+ nnewmembers, newmembers);
}
/* CCI after each change, in case there are duplicates in list */
On 2025-Feb-11, Virender Singla wrote:
> And the upgrade fails with an error :
>
>
> *GRANT "my_group" TO "" WITH INHERIT TRUE GRANTED BY "postgres";ERROR:
> zero-length delimited identifier at or near """"*
>
> The issue seems to be coming from pg_dumpall for building grants during
> pg_upgrade.
Hmm, I think fixing the bug as Tom suggests downthread is probably a
good idea, but I think we should in addition change pg_dumpall to avoid
printing a GRANT line if there's no grantee. Maybe turning one of these LEFT
JOINs into a regular inner join is a sufficient fix for that:
/* Generate and execute query. */
printfPQExpBuffer(buf, "SELECT ur.rolname AS role, "
"um.rolname AS member, "
"ug.oid AS grantorid, "
"ug.rolname AS grantor, "
"a.admin_option");
if (dump_grant_options)
appendPQExpBufferStr(buf, ", a.inherit_option, a.set_option");
appendPQExpBuffer(buf, " FROM pg_auth_members a "
"LEFT JOIN %s ur on ur.oid = a.roleid "
"LEFT JOIN %s um on um.oid = a.member "
"LEFT JOIN %s ug on ug.oid = a.grantor "
"WHERE NOT (ur.rolname ~ '^pg_' AND um.rolname ~ '^pg_')"
"ORDER BY 1,2,4", role_catalog, role_catalog, role_catalog);
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"I am amazed at [the pgsql-sql] mailing list for the wonderful support, and
lack of hesitasion in answering a lost soul's question, I just wished the rest
of the mailing list could be like this." (Fotis)
https://postgr.es/m/200606261359.k5QDxE2p004593@auth-smtp.hol.gr
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@alvh.no-ip.org> writes:
> Hmm, I think fixing the bug as Tom suggests downthread is probably a
> good idea, but I think we should in addition change pg_dumpall to avoid
> printing a GRANT line if there's no grantee.
On the one hand, my proposed patch can do nothing to fix existing
dangling entries in pg_auth_members, so hacking pg_dump seems like
a good workaround if the problem already exists. On the other hand,
if we make pg_dump do that then we won't detect future problems of
the same ilk.
> Maybe turning one of these LEFT
> JOINs into a regular inner join is a sufficient fix for that:
Probably change all three, if we're to do this at all.
regards, tom lane
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@alvh.no-ip.org> writes:
> Hmm, I think fixing the bug as Tom suggests downthread is probably a
> good idea, but I think we should in addition change pg_dumpall to avoid
> printing a GRANT line if there's no grantee. Maybe turning one of these LEFT
> JOINs into a regular inner join is a sufficient fix for that:
After looking at this I thought it was worth a little more code to warn
about the dangling role OID, instead of just silently ignoring it.
Here's a couple of more-polished patches.
I'm unsure whether to back-patch the 0001 patch, as it does imply
more pg_shdepend entries than we have today, so it's sort of a
backdoor catalog change. But we're mostly interested in the
transient behavior of having a lock+recheck during entry insertion,
so maybe it's fine. 0002 should be back-patched in any case.
(BTW, I was distressed to learn from the code coverage report
that we have zero test coverage of the hardly-trivial logic in
dumpRoleMembership. I didn't try to address that here. I did
test this new logic by dint of manually deleting from pg_authid.)
regards, tom lane
From b8f0dff492cfe689e8516b0353d004f483838640 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 20 Feb 2025 15:34:55 -0500
Subject: [PATCH v1 1/2] Avoid race condition between "GRANT role" and "DROP
ROLE".
Concurrently dropping either the granted role or the grantee
does not stop GRANT from completing, instead resulting in a
dangling role reference in pg_auth_members. That's relatively
harmless in the short run, but it does result in garbage
output from pg_dumpall and therefore problems for pg_upgrade.
This patch deals with the problem by adding the granted and grantee
roles as explicit shared dependencies of the pg_auth_members entry.
That's a bit indirect, but it works because the pg_shdepend code
applies the necessary locking and rechecking.
Commit 6566133c5 previously established similar handling for
the grantor column of pg_auth_members; it's not clear why it
didn't cover the other two role OID columns.
A side-effect of this approach is that DROP OWNED BY will now
drop pg_auth_members entries that mention the target role as
either the granted or grantee role. That's clearly appropriate
for the grantee, and it doesn't seem too far out of line for
the granted role. (One could argue that this makes DropRole's
code to auto-drop pg_auth_members entries unnecessary, but
I chose to leave it in place since perhaps some people's
workflows expect that to work without a DROP OWNED BY.)
Reported-by: Virender Singla <virender.cse@gmail.com>
Discussion: https://postgr.es/m/CAM6Zo8woa62ZFHtMKox6a4jb8qQ=w87R2L0K8347iE-juQL2EA@mail.gmail.com
Backpatch-through: TBD
---
doc/src/sgml/ref/drop_owned.sgml | 2 +-
src/backend/commands/user.c | 22 +++++++++++++----
src/test/regress/expected/privileges.out | 30 ++++++++++++++++++++++++
src/test/regress/sql/privileges.sql | 15 ++++++++++++
4 files changed, 63 insertions(+), 6 deletions(-)
diff --git a/doc/src/sgml/ref/drop_owned.sgml b/doc/src/sgml/ref/drop_owned.sgml
index efda01a39e..46e1c229ec 100644
--- a/doc/src/sgml/ref/drop_owned.sgml
+++ b/doc/src/sgml/ref/drop_owned.sgml
@@ -33,7 +33,7 @@ DROP OWNED BY { <replaceable class="parameter">name</replaceable> | CURRENT_ROLE
database that are owned by one of the specified roles. Any
privileges granted to the given roles on objects in the current
database or on shared objects (databases, tablespaces, configuration
- parameters) will also be revoked.
+ parameters, or other roles) will also be revoked.
</para>
</refsect1>
diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c
index 0db174e6f1..0c84886e82 100644
--- a/src/backend/commands/user.c
+++ b/src/backend/commands/user.c
@@ -30,6 +30,7 @@
#include "commands/defrem.h"
#include "commands/seclabel.h"
#include "commands/user.h"
+#include "lib/qunique.h"
#include "libpq/crypt.h"
#include "miscadmin.h"
#include "storage/lmgr.h"
@@ -489,7 +490,7 @@ CreateRole(ParseState *pstate, CreateRoleStmt *stmt)
* Advance command counter so we can see new record; else tests in
* AddRoleMems may fail.
*/
- if (addroleto || adminmembers || rolemembers)
+ if (addroleto || adminmembers || rolemembers || !superuser())
CommandCounterIncrement();
/* Default grant. */
@@ -1904,7 +1905,8 @@ AddRoleMems(Oid currentUserId, const char *rolename, Oid roleid,
else
{
Oid objectId;
- Oid *newmembers = palloc(sizeof(Oid));
+ Oid *newmembers = (Oid *) palloc(3 * sizeof(Oid));
+ int nnewmembers;
/*
* The values for these options can be taken directly from 'popt'.
@@ -1946,12 +1948,22 @@ AddRoleMems(Oid currentUserId, const char *rolename, Oid roleid,
new_record, new_record_nulls);
CatalogTupleInsert(pg_authmem_rel, tuple);
- /* updateAclDependencies wants to pfree array inputs */
- newmembers[0] = grantorId;
+ /*
+ * Record dependencies on the roleid, member, and grantor, as if a
+ * pg_auth_members entry were an object ACL.
+ * updateAclDependencies() requires an input array that is
+ * palloc'd (it will free it), sorted, and de-duped.
+ */
+ newmembers[0] = roleid;
+ newmembers[1] = memberid;
+ newmembers[2] = grantorId;
+ qsort(newmembers, 3, sizeof(Oid), oid_cmp);
+ nnewmembers = qunique(newmembers, 3, sizeof(Oid), oid_cmp);
+
updateAclDependencies(AuthMemRelationId, objectId,
0, InvalidOid,
0, NULL,
- 1, newmembers);
+ nnewmembers, newmembers);
}
/* CCI after each change, in case there are duplicates in list */
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index 6b01313101..a76256405f 100644
--- a/src/test/regress/expected/privileges.out
+++ b/src/test/regress/expected/privileges.out
@@ -113,6 +113,36 @@ CREATE USER regress_priv_user2;
CREATE USER regress_priv_user3;
CREATE USER regress_priv_user4;
CREATE USER regress_priv_user5;
+-- DROP OWNED should also act on granted and granted-to roles
+GRANT regress_priv_user1 TO regress_priv_user2;
+GRANT regress_priv_user2 TO regress_priv_user3;
+SELECT roleid::regrole, member::regrole FROM pg_auth_members
+ WHERE roleid IN ('regress_priv_user1'::regrole,'regress_priv_user2'::regrole)
+ ORDER BY roleid::regrole::text;
+ roleid | member
+--------------------+--------------------
+ regress_priv_user1 | regress_priv_user2
+ regress_priv_user2 | regress_priv_user3
+(2 rows)
+
+REASSIGN OWNED BY regress_priv_user2 TO regress_priv_user4; -- no effect
+SELECT roleid::regrole, member::regrole FROM pg_auth_members
+ WHERE roleid IN ('regress_priv_user1'::regrole,'regress_priv_user2'::regrole)
+ ORDER BY roleid::regrole::text;
+ roleid | member
+--------------------+--------------------
+ regress_priv_user1 | regress_priv_user2
+ regress_priv_user2 | regress_priv_user3
+(2 rows)
+
+DROP OWNED BY regress_priv_user2; -- removes both grants
+SELECT roleid::regrole, member::regrole FROM pg_auth_members
+ WHERE roleid IN ('regress_priv_user1'::regrole,'regress_priv_user2'::regrole)
+ ORDER BY roleid::regrole::text;
+ roleid | member
+--------+--------
+(0 rows)
+
GRANT pg_read_all_data TO regress_priv_user6;
GRANT pg_write_all_data TO regress_priv_user7;
GRANT pg_read_all_settings TO regress_priv_user8 WITH ADMIN OPTION;
diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
index 60e7443bf5..d195aaf137 100644
--- a/src/test/regress/sql/privileges.sql
+++ b/src/test/regress/sql/privileges.sql
@@ -90,6 +90,21 @@ CREATE USER regress_priv_user3;
CREATE USER regress_priv_user4;
CREATE USER regress_priv_user5;
+-- DROP OWNED should also act on granted and granted-to roles
+GRANT regress_priv_user1 TO regress_priv_user2;
+GRANT regress_priv_user2 TO regress_priv_user3;
+SELECT roleid::regrole, member::regrole FROM pg_auth_members
+ WHERE roleid IN ('regress_priv_user1'::regrole,'regress_priv_user2'::regrole)
+ ORDER BY roleid::regrole::text;
+REASSIGN OWNED BY regress_priv_user2 TO regress_priv_user4; -- no effect
+SELECT roleid::regrole, member::regrole FROM pg_auth_members
+ WHERE roleid IN ('regress_priv_user1'::regrole,'regress_priv_user2'::regrole)
+ ORDER BY roleid::regrole::text;
+DROP OWNED BY regress_priv_user2; -- removes both grants
+SELECT roleid::regrole, member::regrole FROM pg_auth_members
+ WHERE roleid IN ('regress_priv_user1'::regrole,'regress_priv_user2'::regrole)
+ ORDER BY roleid::regrole::text;
+
GRANT pg_read_all_data TO regress_priv_user6;
GRANT pg_write_all_data TO regress_priv_user7;
GRANT pg_read_all_settings TO regress_priv_user8 WITH ADMIN OPTION;
--
2.43.5
From 937b1ebb7c8eeb022385c51ec13ea206bd97d2a3 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Thu, 20 Feb 2025 17:07:34 -0500
Subject: [PATCH v1 2/2] Fix pg_dumpall to cope with dangling OIDs in
pg_auth_members.
In view of the bug fixed by the previous patch, we need to allow
for the possibility of role OIDs in pg_auth_members that don't
exist in pg_authid. As pg_dumpall stands, it will emit invalid
commands like 'GRANT foo TO ""', which causes pg_upgrade to fail.
Fix it to emit warnings and skip those GRANTs, instead.
There was some discussion of removing the problem by changing
dumpRoleMembership's query to use JOIN not LEFT JOIN, but that
would result in silently ignoring such entries. It seems better
to produce a warning.
Reported-by: Virender Singla <virender.cse@gmail.com>
Discussion: https://postgr.es/m/CAM6Zo8woa62ZFHtMKox6a4jb8qQ=w87R2L0K8347iE-juQL2EA@mail.gmail.com
Backpatch-through: TBD
---
src/bin/pg_dump/pg_dumpall.c | 66 +++++++++++++++++++++++++++++++-----
1 file changed, 57 insertions(+), 9 deletions(-)
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index b993b05cc2..b9b22c47be 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -986,6 +986,13 @@ dumpRoleMembership(PGconn *conn)
total;
bool dump_grantors;
bool dump_grant_options;
+ int i_role;
+ int i_member;
+ int i_grantor;
+ int i_roleid;
+ int i_memberid;
+ int i_grantorid;
+ int i_admin_option;
int i_inherit_option;
int i_set_option;
@@ -995,6 +1002,10 @@ dumpRoleMembership(PGconn *conn)
* they didn't have ADMIN OPTION on the role, or a user that no longer
* existed. To avoid dump and restore failures, don't dump the grantor
* when talking to an old server version.
+ *
+ * Also, in older versions the roleid and/or member could be role OIDs
+ * that no longer exist. If we find such cases, print a warning and skip
+ * the entry.
*/
dump_grantors = (PQserverVersion(conn) >= 160000);
@@ -1006,8 +1017,10 @@ dumpRoleMembership(PGconn *conn)
/* Generate and execute query. */
printfPQExpBuffer(buf, "SELECT ur.rolname AS role, "
"um.rolname AS member, "
- "ug.oid AS grantorid, "
"ug.rolname AS grantor, "
+ "a.roleid AS roleid, "
+ "a.member AS memberid, "
+ "a.grantor AS grantorid, "
"a.admin_option");
if (dump_grant_options)
appendPQExpBufferStr(buf, ", a.inherit_option, a.set_option");
@@ -1016,8 +1029,15 @@ dumpRoleMembership(PGconn *conn)
"LEFT JOIN %s um on um.oid = a.member "
"LEFT JOIN %s ug on ug.oid = a.grantor "
"WHERE NOT (ur.rolname ~ '^pg_' AND um.rolname ~ '^pg_')"
- "ORDER BY 1,2,4", role_catalog, role_catalog, role_catalog);
+ "ORDER BY 1,2,3", role_catalog, role_catalog, role_catalog);
res = executeQuery(conn, buf->data);
+ i_role = PQfnumber(res, "role");
+ i_member = PQfnumber(res, "member");
+ i_grantor = PQfnumber(res, "grantor");
+ i_roleid = PQfnumber(res, "roleid");
+ i_memberid = PQfnumber(res, "memberid");
+ i_grantorid = PQfnumber(res, "grantorid");
+ i_admin_option = PQfnumber(res, "admin_option");
i_inherit_option = PQfnumber(res, "inherit_option");
i_set_option = PQfnumber(res, "set_option");
@@ -1041,24 +1061,32 @@ dumpRoleMembership(PGconn *conn)
total = PQntuples(res);
while (start < total)
{
- char *role = PQgetvalue(res, start, 0);
+ char *role = PQgetvalue(res, start, i_role);
int i;
bool *done;
int remaining;
int prev_remaining = 0;
rolename_hash *ht;
+ /* If we hit a null roleid, we're done (nulls sort to the end). */
+ if (PQgetisnull(res, start, i_role))
+ {
+ /* translator: %s represents a numeric role OID */
+ pg_log_warning("found dangling pg_auth_members entry for role %s",
+ PQgetvalue(res, start, i_roleid));
+ break;
+ }
+
/* All memberships for a single role should be adjacent. */
for (end = start; end < total; ++end)
{
char *otherrole;
- otherrole = PQgetvalue(res, end, 0);
+ otherrole = PQgetvalue(res, end, i_role);
if (strcmp(role, otherrole) != 0)
break;
}
- role = PQgetvalue(res, start, 0);
remaining = end - start;
done = pg_malloc0(remaining * sizeof(bool));
ht = rolename_create(remaining, NULL);
@@ -1098,10 +1126,30 @@ dumpRoleMembership(PGconn *conn)
if (done[i - start])
continue;
- member = PQgetvalue(res, i, 1);
- grantorid = PQgetvalue(res, i, 2);
- grantor = PQgetvalue(res, i, 3);
- admin_option = PQgetvalue(res, i, 4);
+ /* Complain about, then ignore, entries with dangling OIDs. */
+ if (PQgetisnull(res, i, i_member))
+ {
+ /* translator: %s represents a numeric role OID */
+ pg_log_warning("found dangling pg_auth_members entry for role %s",
+ PQgetvalue(res, i, i_memberid));
+ done[i - start] = true;
+ --remaining;
+ continue;
+ }
+ if (PQgetisnull(res, i, i_grantor))
+ {
+ /* translator: %s represents a numeric role OID */
+ pg_log_warning("found dangling pg_auth_members entry for role %s",
+ PQgetvalue(res, i, i_grantorid));
+ done[i - start] = true;
+ --remaining;
+ continue;
+ }
+
+ member = PQgetvalue(res, i, i_member);
+ grantor = PQgetvalue(res, i, i_grantor);
+ grantorid = PQgetvalue(res, i, i_grantorid);
+ admin_option = PQgetvalue(res, i, i_admin_option);
if (dump_grant_options)
set_option = PQgetvalue(res, i, i_set_option);
--
2.43.5
On Thu, 2025-02-20 at 17:19 -0500, Tom Lane wrote: > After looking at this I thought it was worth a little more code to warn > about the dangling role OID, instead of just silently ignoring it. > Here's a couple of more-polished patches. > > I'm unsure whether to back-patch the 0001 patch, as it does imply > more pg_shdepend entries than we have today, so it's sort of a > backdoor catalog change. But we're mostly interested in the > transient behavior of having a lock+recheck during entry insertion, > so maybe it's fine. 0002 should be back-patched in any case. I'd say that adding new catalog entries in a way that is compatible shouldn't be a problem, but I still wouldn't backpatch the 0001 patch, because it is not necessary. The orphaned pg_auth_members entry didn't cause any harm, and a few warnings more during an upgrade shouldn't be a big problem. I have one question about the first patch: > diff --git a/src/backend/commands/user.c b/src/backend/commands/user.c > index 0db174e6f1..0c84886e82 100644 > --- a/src/backend/commands/user.c > +++ b/src/backend/commands/user.c > @@ -489,7 +490,7 @@ CreateRole(ParseState *pstate, CreateRoleStmt *stmt) > * Advance command counter so we can see new record; else tests in > * AddRoleMems may fail. > */ > - if (addroleto || adminmembers || rolemembers) > + if (addroleto || adminmembers || rolemembers || !superuser()) > CommandCounterIncrement(); > > /* Default grant. */ That change seems unrelated to the problem at hand, and I don't see it mentioned in the commit message. Is that an oversight you fixed on the fly? Apart from that, the patches look fine. Yours, Laurenz Albe -- *E-Mail Disclaimer* Der Inhalt dieser E-Mail ist ausschliesslich fuer den bezeichneten Adressaten bestimmt. Wenn Sie nicht der vorgesehene Adressat dieser E-Mail oder dessen Vertreter sein sollten, so beachten Sie bitte, dass jede Form der Kenntnisnahme, Veroeffentlichung, Vervielfaeltigung oder Weitergabe des Inhalts dieser E-Mail unzulaessig ist. Wir bitten Sie, sich in diesem Fall mit dem Absender der E-Mail in Verbindung zu setzen. *CONFIDENTIALITY NOTICE & DISCLAIMER *This message and any attachment are confidential and may be privileged or otherwise protected from disclosure and solely for the use of the person(s) or entity to whom it is intended. If you have received this message in error and are not the intended recipient, please notify the sender immediately and delete this message and any attachment from your system. If you are not the intended recipient, be advised that any use of this message is prohibited and may be unlawful, and you must not copy this message or attachment or disclose the contents to any other person.
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> I have one question about the first patch:
>> - if (addroleto || adminmembers || rolemembers)
>> + if (addroleto || adminmembers || rolemembers || !superuser())
>> CommandCounterIncrement();
> That change seems unrelated to the problem at hand, and I don't see it
> mentioned in the commit message. Is that an oversight you fixed on the
> fly?
Well, kinda, because the patch doesn't work without it. The
problematic case is where none of those 3 flags are set and also
!superuser, so that we decide we need the default grant implemented a
few lines further down. That grant now has an explicit reference to
the new roleid, and if we haven't CommandCounterIncrement'ed, the
pg_shdepend code will error out because it doesn't see the catalog
entry for roleid.
regards, tom lane
On Fri, 2025-02-21 at 09:41 -0500, Tom Lane wrote: > Laurenz Albe <laurenz.albe@cybertec.at> writes: > > I have one question about the first patch: > > > > - if (addroleto || adminmembers || rolemembers) > > > + if (addroleto || adminmembers || rolemembers || !superuser()) > > > CommandCounterIncrement(); > > > That change seems unrelated to the problem at hand, and I don't see it > > mentioned in the commit message. Is that an oversight you fixed on the > > fly? > > Well, kinda, because the patch doesn't work without it. The > problematic case is where none of those 3 flags are set and also > !superuser, so that we decide we need the default grant implemented a > few lines further down. That grant now has an explicit reference to > the new roleid, and if we haven't CommandCounterIncrement'ed, the > pg_shdepend code will error out because it doesn't see the catalog > entry for roleid. Thanks for the explanation. That might be worth a comment. Yours, Laurenz Albe -- *E-Mail Disclaimer* Der Inhalt dieser E-Mail ist ausschliesslich fuer den bezeichneten Adressaten bestimmt. Wenn Sie nicht der vorgesehene Adressat dieser E-Mail oder dessen Vertreter sein sollten, so beachten Sie bitte, dass jede Form der Kenntnisnahme, Veroeffentlichung, Vervielfaeltigung oder Weitergabe des Inhalts dieser E-Mail unzulaessig ist. Wir bitten Sie, sich in diesem Fall mit dem Absender der E-Mail in Verbindung zu setzen. *CONFIDENTIALITY NOTICE & DISCLAIMER *This message and any attachment are confidential and may be privileged or otherwise protected from disclosure and solely for the use of the person(s) or entity to whom it is intended. If you have received this message in error and are not the intended recipient, please notify the sender immediately and delete this message and any attachment from your system. If you are not the intended recipient, be advised that any use of this message is prohibited and may be unlawful, and you must not copy this message or attachment or disclose the contents to any other person.
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> Thanks for the explanation. That might be worth a comment.
The adjacent comment already says
/*
* Advance command counter so we can see new record; else tests in
* AddRoleMems may fail.
*/
so I didn't see anything to add there. Maybe "We can skip this in
cases where we will not call AddRoleMems"? Or maybe the better answer
is to conclude that the whole idea of not calling
CommandCounterIncrement unconditionally is too fragile and not worth
expending brain cells on, and just rip out the if-test.
regards, tom lane
On Fri, 2025-02-21 at 11:31 -0500, Tom Lane wrote: > Laurenz Albe <laurenz.albe@cybertec.at> writes: > > Thanks for the explanation. That might be worth a comment. > > The adjacent comment already says > > /* > * Advance command counter so we can see new record; else tests in > * AddRoleMems may fail. > */ > > so I didn't see anything to add there. Maybe "We can skip this in > cases where we will not call AddRoleMems"? Or maybe the better answer > is to conclude that the whole idea of not calling > CommandCounterIncrement unconditionally is too fragile and not worth > expending brain cells on, and just rip out the if-test. Both the extra sentence and the simplification feel like an improvement. I am fine with either. Yours, Laurenz Albe -- *E-Mail Disclaimer* Der Inhalt dieser E-Mail ist ausschliesslich fuer den bezeichneten Adressaten bestimmt. Wenn Sie nicht der vorgesehene Adressat dieser E-Mail oder dessen Vertreter sein sollten, so beachten Sie bitte, dass jede Form der Kenntnisnahme, Veroeffentlichung, Vervielfaeltigung oder Weitergabe des Inhalts dieser E-Mail unzulaessig ist. Wir bitten Sie, sich in diesem Fall mit dem Absender der E-Mail in Verbindung zu setzen. *CONFIDENTIALITY NOTICE & DISCLAIMER *This message and any attachment are confidential and may be privileged or otherwise protected from disclosure and solely for the use of the person(s) or entity to whom it is intended. If you have received this message in error and are not the intended recipient, please notify the sender immediately and delete this message and any attachment from your system. If you are not the intended recipient, be advised that any use of this message is prohibited and may be unlawful, and you must not copy this message or attachment or disclose the contents to any other person.
Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Fri, 2025-02-21 at 11:31 -0500, Tom Lane wrote:
>> ... Or maybe the better answer
>> is to conclude that the whole idea of not calling
>> CommandCounterIncrement unconditionally is too fragile and not worth
>> expending brain cells on, and just rip out the if-test.
> Both the extra sentence and the simplification feel like an improvement.
> I am fine with either.
The more I think about it the more I like just getting rid of the
test. It'll likely break with every future change to this logic,
until somebody finally gives up on it; so why not now?
I'll make it so. Thanks for reviewing!
regards, tom lane
On Thu, Feb 13, 2025 at 12:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > The code the attached patch is touching is from commit 6566133c5. > I'm not going to blame the bug on that commit, because before that > we had dependencies on *zero* of the three roles; at least it added > one on the grantor. But I'm wondering whether Robert thought about > the other two roles and explicitly rejected making dependencies > for them, and if so why. For some reason, I came to the conclusion that it wasn't needed. Apparently, that was incorrect. -- Robert Haas EDB: http://www.enterprisedb.com
On Thu, Feb 20, 2025 at 5:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm unsure whether to back-patch the 0001 patch, as it does imply
> more pg_shdepend entries than we have today, so it's sort of a
> backdoor catalog change. But we're mostly interested in the
> transient behavior of having a lock+recheck during entry insertion,
> so maybe it's fine. 0002 should be back-patched in any case.
I recently learned of a case where this commit caused role grants to
be erroneously emitted from the output of pg_dumpall. In the case in
question, a v16 pg_dumpall was used against an older server. Hence,
dump_grantors was false, and any generated GRANT commands would not
have included in the grantor anyway. Nevertheless, this logic caused
those grants to be skipped altogether:
+ if (PQgetisnull(res, i, i_grantor))
+ {
+ /* translator: %s represents a numeric role OID */
+ pg_log_warning("found orphaned pg_auth_members
entry for role %s",
+ PQgetvalue(res, i, i_grantorid));
+ done[i - start] = true;
+ --remaining;
+ continue;
+ }
I don't think this logic makes sense. In pre-16 releases, we don't
even try to maintain the grantor field properly. Consider this test
case:
create role foo;
create role bar;
create role baz createrole;
set role baz;
grant foo to bar;
reset role;
drop role baz;
If you do this on v15 and then run v15's pg_dumpall, it will dump
"GRANT foo to bar", with no GRANTOR clause due to the PQgetisnull()
gating that logic. v16's pg_dumpall will dump nothing and emit a
warning instead. Arguably, pre-v16 pg_dumpall shouldn't EVER be
dumping the grantor since the grantorid could be an old role OID that
has been recycled for a new role, and relying on that for anything
security-critical seems like a mistake, but that behavior is also
longstanding. But omitting the grant altogether seems like an
overreaction. I understand that we need to do that when the *member*
is invalid, of course; in that case, there's no alternative. But
that's not the case for the grantor.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> I recently learned of a case where this commit caused role grants to
> be erroneously emitted from the output of pg_dumpall. In the case in
> question, a v16 pg_dumpall was used against an older server. Hence,
> dump_grantors was false, and any generated GRANT commands would not
> have included in the grantor anyway. Nevertheless, this logic caused
> those grants to be skipped altogether:
> ...
> If you do this on v15 and then run v15's pg_dumpall, it will dump
> "GRANT foo to bar", with no GRANTOR clause due to the PQgetisnull()
> gating that logic. v16's pg_dumpall will dump nothing and emit a
> warning instead. Arguably, pre-v16 pg_dumpall shouldn't EVER be
> dumping the grantor since the grantorid could be an old role OID that
> has been recycled for a new role, and relying on that for anything
> security-critical seems like a mistake, but that behavior is also
> longstanding. But omitting the grant altogether seems like an
> overreaction. I understand that we need to do that when the *member*
> is invalid, of course; in that case, there's no alternative. But
> that's not the case for the grantor.
Hmm. As per the commit message,
Pre-v16 branches already coped with dangling grantor OIDs by simply
omitting the GRANTED BY clause. I left that behavior as-is, although
it's somewhat inconsistent with the behavior of later branches.
So what you're saying is that I should have made the later branches
do that also. I guess it's arguably better than dropping the grant
altogether ... but the end result will be that the grant is now
granted by the superuser running the restore, which doesn't seem
very good either.
regards, tom lane
On Wed, Feb 25, 2026 at 10:36 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So what you're saying is that I should have made the later branches
> do that also. I guess it's arguably better than dropping the grant
> altogether ... but the end result will be that the grant is now
> granted by the superuser running the restore, which doesn't seem
> very good either.
It does not, but it seems better than having a v16 pg_dumpall and v15
pg_dumpall produce non-logically-equivalent dumps of the same
database. My instinct is that this test:
if (PQgetisnull(res, i, i_grantor))
Should instead look like this:
if (PQgetisnull(res, i, i_grantor) &&
dump_grantors)
In v16+, if the grantor is not valid, that's unexpected and something
has gone wrong, perhaps due to insufficient locking, or maybe due to
catalog corruption. The warning is a fair response to a
seemingly-corrupted catalog state. But in v15-, this is just business
as usual; there's no particular expectation that the grantor must be a
valid role OID, and IMHO the best thing to do is give the same result
that a pre-v16 pg_dumpall would have produced.
I'm not actually completely confident that I have a fully correct
analysis of this problem. But I do think that it's hard to argue that
dumping the same database with different pg_dumpall versions should
produce logically different results.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> In v16+, if the grantor is not valid, that's unexpected and something
> has gone wrong, perhaps due to insufficient locking, or maybe due to
> catalog corruption. The warning is a fair response to a
> seemingly-corrupted catalog state. But in v15-, this is just business
> as usual; there's no particular expectation that the grantor must be a
> valid role OID, and IMHO the best thing to do is give the same result
> that a pre-v16 pg_dumpall would have produced.
I don't entirely agree that it's "business as usual": somebody screwed
up to get the database into that state. I'm not here to apportion
blame for that between the user and the database, but I do think it's
appropriate for pg_dumpall to complain that it's facing invalid data.
If you're good with pg_dumpall producing a warning and then emitting
the GRANT with no grantor clause, I will go make that happen.
regards, tom lane
On Wed, Feb 25, 2026 at 11:50 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I don't entirely agree that it's "business as usual": somebody screwed > up to get the database into that state. I'm not here to apportion > blame for that between the user and the database, but I do think it's > appropriate for pg_dumpall to complain that it's facing invalid data. > > If you're good with pg_dumpall producing a warning and then emitting > the GRANT with no grantor clause, I will go make that happen. Well, I guess I don't really see why there should be a warning. I mean, if you're not here to apportion blame between the user and database, then allow me to step in: in v15-, it's entirely the database's fault. It does absolutely zilch to keep that field valid. As I showed in the earlier example, all the user has to do is create a role, do something, and drop the role, and that field is no longer valid. That's about the simplest scenario imaginable, and I don't think any user would expect it to produce a corrupted database, and I think the status quo ante prior to this commit was that we did not consider that to be a corrupted database. If that feels like the wrong answer to you, I suggest that the reason is that you're not used to PostgreSQL code being as horrifically bad as this code was prior to v16. I don't know of anywhere else in the database where we store an OID and do absolutely nothing to ensure its validity. Sure, we've found some holes over the years where we didn't do enough, especially around concurrency, and we've probably all seen examples of TOAST corruption where a value can't be de-TOASTed due to some inconsistency. But those are examples of where we had had some guards in place and they weren't as thorough as they needed to be, or where the user voided the warranty by doing something stupid, or where the storage manhandled the data. But before 6566133c5f52771198aca07ed18f84519fac1be7, we didn't just have inadequate machinery in this area: we had none. I remember being absolutely gobsmacked at this discovery that it had been this way for years and apparently nobody was too fussed about it. It seemed to me then and it seems to me now that such handling was way below our usual quality standards, even for older code when we weren't as strict as we are today. But that's nevertheless how it was. Now, if you go and do as you propose here, and adjust the code so that the grant is dumped but a warning is produced, my fear is that someone upgrading from v15- to v16+ will see that warning and think that there is a problem with their database that needs fixing. But, except for the fact that they should really upgrade to a newer and better version, that's not really true. They're just running a version where no care was put into making that field valid, and so sometimes it isn't. I suppose that is OK in the end: when such a user files a support ticket with $EMPLOYER or any other company, somebody can simply tell that user that the warning requires no corrective action and can be safely ignored. But of course, they'll have to know that this is the case, and not everyone will have read this discussion. Moreover, we'll emit essentially the same warning for the member case, where the warning does point to a problem that someone might want to think about correcting, and exactly the same warning against a v16+ database where it indicates that something has actually gone wrong. I don't know why we would want to do that instead of what I proposed, but if you're dead-set on it, I can live with it: it will at least mean people can upgrade without having perfectly good role grants disappear into the ether. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Feb 25, 2026 at 11:50 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If you're good with pg_dumpall producing a warning and then emitting
>> the GRANT with no grantor clause, I will go make that happen.
> Well, I guess I don't really see why there should be a warning.
Because the result of the restore will not match how things were
in the source database? True, we do not have any way to make them
match, but that doesn't mean that pg_dumpall has fulfilled all
expectations.
> Now, if you go and do as you propose here, and adjust the code so that
> the grant is dumped but a warning is produced, my fear is that someone
> upgrading from v15- to v16+ will see that warning and think that there
> is a problem with their database that needs fixing.
On the other hand, if we produce no warning and yet the restored DB
is unlike the original, that could also be cause for concern.
> Moreover, we'll emit essentially the same warning for the member case,
> where the warning does point to a problem that someone might want to
> think about correcting, and exactly the same warning against a v16+
> database where it indicates that something has actually gone wrong.
That's a fair point, but maybe it could be addressed by phrasing the
message differently for the different cases.
regards, tom lane
On Wed, Feb 25, 2026 at 12:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Because the result of the restore will not match how things were > in the source database? True, we do not have any way to make them > match, but that doesn't mean that pg_dumpall has fulfilled all > expectations. That's a fair point. On the flip side, the grantor in v15- is basically a comment. From v16, it behaves in the way we'd normally expect: a role grant depends on the grantor in the same way that a table grant depends on the grantor in the table ACL, just with a different catalog representation. In v15-, it has no functional impact. > > Moreover, we'll emit essentially the same warning for the member case, > > where the warning does point to a problem that someone might want to > > think about correcting, and exactly the same warning against a v16+ > > database where it indicates that something has actually gone wrong. > > That's a fair point, but maybe it could be addressed by phrasing the > message differently for the different cases. I like that idea. I think the biggest danger here for users is actually that we have to convert the pre-v16 "it's a comment" catalog state into a v16+ "it's a real thing that works like the rest of the authorization system" catalog state, and there's no way to do that perfectly because we're starting from a busted catalog representation. It seems reasonable to somehow let people know that whatever we're doing is an approximation. If we can make the "invalid grantor from pre-v16" sound like "don't panic but we have to patch some stuff up so you can upgrade" and the "invalid grantor for v16+" and "invalid member" cases sound like "uh oh, it seems like something got corrupted" then I'm entirely happy with that state of affairs (assuming we also resume dumping the GRANT, of course). Thanks, -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Feb 25, 2026 at 12:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Moreover, we'll emit essentially the same warning for the member case,
>>> where the warning does point to a problem that someone might want to
>>> think about correcting, and exactly the same warning against a v16+
>>> database where it indicates that something has actually gone wrong.
>> That's a fair point, but maybe it could be addressed by phrasing the
>> message differently for the different cases.
> I like that idea.
OK, so we need to pick wordings. Right now we use this wording for
all three cases:
/* translator: %s represents a numeric role OID */
pg_log_warning("found orphaned pg_auth_members entry for role %s",
PQgetvalue(res, start, i_roleid));
I don't really love that wording for any of these cases, because
(a) "orphaned" isn't a word we use much, and (b) conflating the
role, member, and grantor doesn't seem helpful. How about
something like
missing role:
"ignoring role grant for missing role with OID nnn"
missing member:
"ignoring role grant to missing role with OID nnn"
missing grantor, source version >= 16:
"role grant of R1 to R2 was granted by missing role with OID nnn"
with detail
"We'll dump the GRANT without a GRANTED BY clause, but this shouldn't happen."
missing grantor, source version < 16:
"role grant of R1 to R2 was granted by missing role with OID nnn"
with detail
"This state isn't unusual. We'll dump the GRANT without a GRANTED BY clause."
(We have pg_log_warning_detail back to v16, so it's okay to rely on
a detail message.) I feel like this could use more word-smithing,
but it's covering more or less the right ground IMO.
regards, tom lane
On Wed, Feb 25, 2026 at 1:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > missing grantor, source version >= 16: > > "role grant of R1 to R2 was granted by missing role with OID nnn" > with detail > "We'll dump the GRANT without a GRANTED BY clause, but this shouldn't happen." > > missing grantor, source version < 16: > > "role grant of R1 to R2 was granted by missing role with OID nnn" > with detail > "This state isn't unusual. We'll dump the GRANT without a GRANTED BY clause." > > (We have pg_log_warning_detail back to v16, so it's okay to rely on > a detail message.) I feel like this could use more word-smithing, > but it's covering more or less the right ground IMO. Yeah, I agree both that it's directionally correct and that more word-smithing might help. I feel like your main message is better than your detail, but I'm not sure exactly what would be an improvement. Sometimes I find it helpful to play with the message level and the primary message wording as a way of indicating the gravity, e.g. info: dumping grant of role \"%s\" to role \"%s\" without GRANTED BY, because role with OID %u no longer exists vs. warning: grant of role \"%s\" to \"%s\" has invalid grantor OID %u detail: this grant will be dumped without GRANTED BY -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> Sometimes I find it helpful to play with the message level and the
> primary message wording as a way of indicating the gravity, e.g.
> info: dumping grant of role \"%s\" to role \"%s\" without GRANTED BY,
> because role with OID %u no longer exists
> vs.
> warning: grant of role \"%s\" to \"%s\" has invalid grantor OID %u
> detail: this grant will be dumped without GRANTED BY
I'd be content to use those wordings (except I think our style
guide wants detail to be punctuated like a sentence). I'll wait
a day or so to see if anyone has a better idea, though.
regards, tom lane
I wrote:
> I'd be content to use those wordings (except I think our style
> guide wants detail to be punctuated like a sentence). I'll wait
> a day or so to see if anyone has a better idea, though.
When I looked at the code more closely, I realized that it already
doesn't dump grantors when dumping from pre-v16. I'm inclined to
think that is an overreaction to the possible unreliability of the
data (and from your comment upthread you might agree). But given
the lack of complaints about it, let's leave that as-is for now.
The immediate problem is that the code is allowing a null grantor to
suppress the GRANT altogether, *even if it's not going to use the
grantor*. Clearly that's silly. But if we're not going to use
the grantor, let's skip trying to fetch it, which means we also
don't need the info-level variant of the message.
So I end with the attached draft patch.
regards, tom lane
#text/x-diff; name="v1-clean-up-dumpRoleMembership.patch" [v1-clean-up-dumpRoleMembership.patch]
/home/tgl/pgsql/v1-clean-up-dumpRoleMembership.patch
I wrote:
> So I end with the attached draft patch.
Sigh, this time with it really attached.
regards, tom lane
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index 65f8e3a41f1..3062636a2ce 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -1292,7 +1292,7 @@ dumpRoleMembership(PGconn *conn)
* that no longer exist. If we find such cases, print a warning and skip
* the entry.
*/
- dump_grantors = (PQserverVersion(conn) >= 160000);
+ dump_grantors = (server_version >= 160000);
/*
* Previous versions of PostgreSQL also did not have grant-level options.
@@ -1357,7 +1357,7 @@ dumpRoleMembership(PGconn *conn)
if (PQgetisnull(res, start, i_role))
{
/* translator: %s represents a numeric role OID */
- pg_log_warning("found orphaned pg_auth_members entry for role %s",
+ pg_log_warning("ignoring role grant for missing role with OID %s",
PQgetvalue(res, start, i_roleid));
break;
}
@@ -1374,6 +1374,11 @@ dumpRoleMembership(PGconn *conn)
remaining = end - start;
done = pg_malloc0_array(bool, remaining);
+
+ /*
+ * We use a hashtable to track the member names that have been granted
+ * admin option. Usually a hashtable is overkill, but sometimes not.
+ */
ht = rolename_create(remaining, NULL);
/*
@@ -1401,50 +1406,56 @@ dumpRoleMembership(PGconn *conn)
for (i = start; i < end; ++i)
{
char *member;
- char *admin_option;
char *grantorid;
- char *grantor;
+ char *grantor = NULL;
+ bool dump_grantor = dump_grantors;
char *set_option = "true";
+ char *admin_option;
bool found;
/* If we already did this grant, don't do it again. */
if (done[i - start])
continue;
- /* Complain about, then ignore, entries with orphaned OIDs. */
+ /* Complain about, then ignore, entries for unknown members. */
if (PQgetisnull(res, i, i_member))
{
/* translator: %s represents a numeric role OID */
- pg_log_warning("found orphaned pg_auth_members entry for role %s",
+ pg_log_warning("ignoring role grant to missing role with OID %s",
PQgetvalue(res, i, i_memberid));
done[i - start] = true;
--remaining;
continue;
}
- if (PQgetisnull(res, i, i_grantor))
+ member = PQgetvalue(res, i, i_member);
+
+ /* If the grantor is unknown, complain and dump without it. */
+ grantorid = PQgetvalue(res, i, i_grantorid);
+ if (dump_grantor)
{
- /* translator: %s represents a numeric role OID */
- pg_log_warning("found orphaned pg_auth_members entry for role %s",
- PQgetvalue(res, i, i_grantorid));
- done[i - start] = true;
- --remaining;
- continue;
+ if (PQgetisnull(res, i, i_grantor))
+ {
+ /* translator: %s represents a numeric role OID */
+ pg_log_warning("grant of role \"%s\" to \"%s\" has invalid grantor OID %s",
+ role, member, grantorid);
+ pg_log_warning_detail("This grant will be dumped without GRANTED BY.");
+ dump_grantor = false;
+ }
+ else
+ grantor = PQgetvalue(res, i, i_grantor);
}
- member = PQgetvalue(res, i, i_member);
- grantor = PQgetvalue(res, i, i_grantor);
- grantorid = PQgetvalue(res, i, i_grantorid);
admin_option = PQgetvalue(res, i, i_admin_option);
if (dump_grant_options)
set_option = PQgetvalue(res, i, i_set_option);
/*
- * If we're not dumping grantors or if the grantor is the
+ * If we're not dumping the grantor or if the grantor is the
* bootstrap superuser, it's fine to dump this now. Otherwise,
* it's got to be someone who has already been granted ADMIN
* OPTION.
*/
- if (dump_grantors &&
+ if (dump_grantor &&
atooid(grantorid) != BOOTSTRAP_SUPERUSERID &&
rolename_lookup(ht, grantor) == NULL)
continue;
@@ -1486,7 +1497,7 @@ dumpRoleMembership(PGconn *conn)
}
if (optbuf->data[0] != '\0')
appendPQExpBuffer(querybuf, " WITH %s", optbuf->data);
- if (dump_grantors)
+ if (dump_grantor)
appendPQExpBuffer(querybuf, " GRANTED BY %s", fmtId(grantor));
appendPQExpBuffer(querybuf, ";\n");
On Fri, Feb 27, 2026 at 4:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: > > So I end with the attached draft patch. > > Sigh, this time with it really attached. I suggest that having both a variable called dump_grantor and one called dump_grantors is a little bit subtle, but other than that this looks good on a quick read-through. Regarding this point: > I'm inclined to > think that is an overreaction to the possible unreliability of the > data (and from your comment upthread you might agree). I think this is my code, so I certainly believed I had the right idea at the time, but we could revisit that. One thing to keep in mind is that in v15-, regardless of the notional grantor, in effect all grants are independent of the existence of any other user. In v16+, they form a tree structure, with grants depending on their grantors. So, when upgrading from v15- to v16+, we have to end up with a valid tree structure, but there's absolutely no reason to think that we already have one. If we don't, it must be better to discard the grantor than to fail the dump-and-restore altogether. Note that it's perfectly possible not to have a tree structure, e.g. because we have circular grants, even if all the roles involved still exist and have existed continuously. I believe my thought process at the time was there wasn't really any reason to imagine that whatever we were upgrading from v15- was intended as a tree structure, and therefore it made logical sense to treat it as though all of those grants directly emanated from the superuser, i.e. the tree is just 1 level deep. Now, that does mean losing the grantor information, but we're arguably preserving the semantics, since any it reproduces the v15- state where any of those roles are free to be dropped (assuming no other blockers). Anyway, again, we can rethink that, but if we do it without repairing any structural "tree defects," we'll end up with a dump that we can't reload on v16+. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Feb 27, 2026 at 4:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Sigh, this time with it really attached.
> I suggest that having both a variable called dump_grantor and one
> called dump_grantors is a little bit subtle, but other than that this
> looks good on a quick read-through.
Fair ... do you have a suggestion for less confusing names?
I considered naming the new variable "dump_this_grantor", but thought
it was longer without being more helpful ... but maybe you disagree.
> Regarding this point:
>> I'm inclined to
>> think that is an overreaction to the possible unreliability of the
>> data (and from your comment upthread you might agree).
> I think this is my code, so I certainly believed I had the right idea
> at the time, but we could revisit that. One thing to keep in mind is
> that in v15-, regardless of the notional grantor, in effect all grants
> are independent of the existence of any other user. In v16+, they form
> a tree structure, with grants depending on their grantors. So, when
> upgrading from v15- to v16+, we have to end up with a valid tree
> structure, but there's absolutely no reason to think that we already
> have one.
Yeah, that is certainly a hazard we'd have to worry about. As I said,
I'm content to leave it as-is for now.
regards, tom lane
On Fri, Feb 27, 2026 at 4:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > I suggest that having both a variable called dump_grantor and one > > called dump_grantors is a little bit subtle, but other than that this > > looks good on a quick read-through. > > Fair ... do you have a suggestion for less confusing names? > I considered naming the new variable "dump_this_grantor", but thought > it was longer without being more helpful ... but maybe you disagree. Personally, I'd find the extra verbosity helpful, but I don't care that much if you see it otherwise. > > I think this is my code, so I certainly believed I had the right idea > > at the time, but we could revisit that. One thing to keep in mind is > > that in v15-, regardless of the notional grantor, in effect all grants > > are independent of the existence of any other user. In v16+, they form > > a tree structure, with grants depending on their grantors. So, when > > upgrading from v15- to v16+, we have to end up with a valid tree > > structure, but there's absolutely no reason to think that we already > > have one. > > Yeah, that is certainly a hazard we'd have to worry about. As I said, > I'm content to leave it as-is for now. Yeah, sure. I was just mentioning it in case you were planning to pursue that in a separate thread. -- Robert Haas EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Feb 27, 2026 at 4:52 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Fair ... do you have a suggestion for less confusing names?
>> I considered naming the new variable "dump_this_grantor", but thought
>> it was longer without being more helpful ... but maybe you disagree.
> Personally, I'd find the extra verbosity helpful, but I don't care
> that much if you see it otherwise.
If you think that's better, I'm happy to make it so. An author is
seldom the best judge of readability of his code.
Pushed like that.
regards, tom lane
On Mon, Mar 2, 2026 at 11:17 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Pushed like that. Thanks! -- Robert Haas EDB: http://www.enterprisedb.com