Обсуждение: pg_upgrade check for invalid role-specific default config
Hi all, While troubleshooting a failed upgrade from v11 -> v12 I realised I had encountered a bug previously reported on the pgsql-bugs mailing list: #14242 Role with a setconfig "role" setting to a nonexistent role causes pg_upgrade to fail https://www.postgresql.org/message-id/20160711223641.1426.86096%40wrigleys.postgresql.org To quote the previous report: > It is possible to modify the "role" setting in setconfig in the > pg_db_role_setting table such that it points to a nonexistent role. When > this is the case, restoring the output of pg_dumpall will fail due to the > missing role. > Steps to reproduce: > 1. As superuser, execute "create role foo with login password 'test'" > 2. As foo, execute "alter role foo set role = 'foo'" > 3. As superuser, execute "alter role foo rename to bar" > a. At this point, the setconfig entry in pg_db_role_setting for > 'bar' will contain '{role=foo}', which no longer exists > 4. Execute pg_upgrade with the recommended steps in > https://www.postgresql.org/docs/current/static/pgupgrade.html > During pg_upgrade (more specifically, during the restore of the output from > pg_dumpall), the "ALTER ROLE "bar" SET "role" TO 'foo'" command generated > will fail with "ERROR: role "foo" does not exist". > This issue was identified by Jordan Lange and Nathan Bossart. The steps in the original report reproduce the problem on all currently supported pg versions. I appreciate that the invalid role-specific default settings are ultimately self-inflicted by the user, but as a third-party performing the upgrade this caught me by surprise. Since it is possible to write a query to identify these cases, would there be appetite for me to submit a patch to add a check for this to pg_upgrade? First time mailing list user here so many apologies for any missteps I have made in this message. Best regards, Charlie Hornsby
On Mon, Apr 12, 2021 at 01:28:19PM +0000, Charlie Hornsby wrote: > Hi all, > > While troubleshooting a failed upgrade from v11 -> v12 I realised I had > encountered a bug previously reported on the pgsql-bugs mailing list: > > #14242 Role with a setconfig "role" setting to a nonexistent role causes > pg_upgrade to fail > > https://www.postgresql.org/message-id/20160711223641.1426.86096%40wrigleys.postgresql.org ... > Since it is possible to write a query to identify these cases, would there > be appetite for me to submit a patch to add a check for this to > pg_upgrade? > > First time mailing list user here so many apologies for any missteps I have > made in this message. Yes, I think a patch would be good, but the fix might be for pg_dump instead, which pg_upgrade uses. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Bruce Momjian <bruce@momjian.us> writes: > On Mon, Apr 12, 2021 at 01:28:19PM +0000, Charlie Hornsby wrote: >> While troubleshooting a failed upgrade from v11 -> v12 I realised I had >> encountered a bug previously reported on the pgsql-bugs mailing list: >> #14242 Role with a setconfig "role" setting to a nonexistent role causes >> pg_upgrade to fail >> https://www.postgresql.org/message-id/20160711223641.1426.86096%40wrigleys.postgresql.org >> Since it is possible to write a query to identify these cases, would there >> be appetite for me to submit a patch to add a check for this to >> pg_upgrade? > Yes, I think a patch would be good, but the fix might be for pg_dump > instead, which pg_upgrade uses. I'm not sure I buy the premise that "it is possible to write a query to identify these cases". It seems to me that the general problem is that ALTER ROLE/DATABASE SET values might have become incorrect since they were installed and would thus fail when reloaded in dump/restore. We're not going to be able to prevent that in the general case, and it's not obvious to me what special case might be worth going after. I do find it interesting that we now have two reports of somebody doing "ALTER ROLE SET role = something". In the older thread, I was skeptical that that had any real use-case, so I wonder if Charlie has a rationale for having done that. regards, tom lane
I wrote: > I'm not sure I buy the premise that "it is possible to write a query > to identify these cases". It seems to me that the general problem is > that ALTER ROLE/DATABASE SET values might have become incorrect since > they were installed and would thus fail when reloaded in dump/restore. > We're not going to be able to prevent that in the general case, and > it's not obvious to me what special case might be worth going after. Actually, after thinking about that a bit more, this is a whole lot like the issues we have with reloading function bodies and function SET clauses. The solution we've adopted for that is to allow dumps to turn off validation via the check_function_bodies GUC. Maybe there should be a GUC to disable validation of ALTER ROLE/DATABASE SET values. If you fat-finger a setting, you might not be able to log in, but you couldn't log in in the old database either. Another answer is that maybe the processing of the "role" case in particular is just broken. Compare the behavior here: regression=# create role joe; CREATE ROLE regression=# alter role joe set role = 'notthere'; ERROR: role "notthere" does not exist regression=# alter role joe set default_text_search_config to 'notthere'; NOTICE: text search configuration "notthere" does not exist ALTER ROLE # \drds List of settings Role | Database | Settings ------+------------+------------------------------------- joe | | default_text_search_config=notthere despite the fact that a direct SET fails: regression=# set default_text_search_config to 'notthere'; ERROR: invalid value for parameter "default_text_search_config": "notthere" It's intentional that we don't throw a hard error for default_text_search_config, because that would create a problematic ordering dependency for pg_dump: the desired config might just not have been reloaded yet. Maybe the right answer here is that the processing of "set role" in particular failed to get that memo. regards, tom lane
I wrote: > Another answer is that maybe the processing of the "role" case > in particular is just broken. After digging around a bit more, I think that that is indeed the right answer. Most of the GUC check functions that have database-state-dependent behavior are programmed to behave specially when checking a proposed ALTER USER/DATABASE setting; but check_role and check_session_authorization did not get that memo. I also noted that check_temp_buffers would throw an error for no very good reason. There don't look to be any other troublesome cases. So I ended up with the attached. It feels a bit unsatisfactory to have these various check functions know about this explicitly. However, I'm not sure that there's a good way to centralize it. Only the check function knows whether the check it's making is immutable or dependent on DB state --- and in the former case, not throwing an error wouldn't be an improvement. Anyway, I'm inclined to think that we should not only apply this but back-patch it. Two complaints is enough to suggest we have an issue. Plus, I think there is a dump/reload ordering problem here: as things stand, "alter user joe set role = bob" would work or not work depending on the order the roles are created in and/or the order privileges are granted in. regards, tom lane diff --git a/src/backend/commands/variable.c b/src/backend/commands/variable.c index c5cf08b423..0c85679420 100644 --- a/src/backend/commands/variable.c +++ b/src/backend/commands/variable.c @@ -767,6 +767,17 @@ check_session_authorization(char **newval, void **extra, GucSource source) roleTup = SearchSysCache1(AUTHNAME, PointerGetDatum(*newval)); if (!HeapTupleIsValid(roleTup)) { + /* + * When source == PGC_S_TEST, we don't throw a hard error for a + * nonexistent user name, only a NOTICE. See comments in guc.h. + */ + if (source == PGC_S_TEST) + { + ereport(NOTICE, + (errcode(ERRCODE_UNDEFINED_OBJECT), + errmsg("role \"%s\" does not exist", *newval))); + return true; + } GUC_check_errmsg("role \"%s\" does not exist", *newval); return false; } @@ -837,10 +848,23 @@ check_role(char **newval, void **extra, GucSource source) return false; } + /* + * When source == PGC_S_TEST, we don't throw a hard error for a + * nonexistent user name or insufficient privileges, only a NOTICE. + * See comments in guc.h. + */ + /* Look up the username */ roleTup = SearchSysCache1(AUTHNAME, PointerGetDatum(*newval)); if (!HeapTupleIsValid(roleTup)) { + if (source == PGC_S_TEST) + { + ereport(NOTICE, + (errcode(ERRCODE_UNDEFINED_OBJECT), + errmsg("role \"%s\" does not exist", *newval))); + return true; + } GUC_check_errmsg("role \"%s\" does not exist", *newval); return false; } @@ -859,6 +883,14 @@ check_role(char **newval, void **extra, GucSource source) if (!InitializingParallelWorker && !is_member_of_role(GetSessionUserId(), roleid)) { + if (source == PGC_S_TEST) + { + ereport(NOTICE, + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), + errmsg("permission will be denied to set role \"%s\"", + *newval))); + return true; + } GUC_check_errcode(ERRCODE_INSUFFICIENT_PRIVILEGE); GUC_check_errmsg("permission denied to set role \"%s\"", *newval); diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index d0a51b507d..2ffefdf943 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -11777,8 +11777,9 @@ check_temp_buffers(int *newval, void **extra, GucSource source) { /* * Once local buffers have been initialized, it's too late to change this. + * However, if this is only a test call, allow it. */ - if (NLocBuffer && NLocBuffer != *newval) + if (source != PGC_S_TEST && NLocBuffer && NLocBuffer != *newval) { GUC_check_errdetail("\"temp_buffers\" cannot be changed after any temporary tables have been accessed in the session."); return false;
Tom wrote: > I do find it interesting that we now have two reports of somebody > doing "ALTER ROLE SET role = something". In the older thread, > I was skeptical that that had any real use-case, so I wonder if > Charlie has a rationale for having done that. Unfortunately I haven't heard back from the original developer who set up this role configuration, but if I do then I will share their intentions. In any case the invalid configuration had been removed from every other role except one (certainly by mistake) which lead to me rediscovering this issue. I tested the above patch with the invalid data locally and it avoids the restore error that we ran into previously. Also it requires no intervention to progress with pg_upgrade unlike my initial idea of adding an check, so it is definitely simpler from a user perspective. Thank you for taking a deep look into this and finding a better solution. Best regards, Charlie Hornsby
Charlie Hornsby <charlie.hornsby@hotmail.co.uk> writes: > I tested the above patch with the invalid data locally and it avoids > the restore error that we ran into previously. Also it requires no > intervention to progress with pg_upgrade unlike my initial idea of > adding an check, so it is definitely simpler from a user perspective. Thanks for testing! I've pushed this, so it will be in the May minor releases. regards, tom lane