Обсуждение: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

Поиск
Список
Период
Сортировка

Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi hackers,

Please find attached a patch proposal to $SUBJECT.

This patch allows the role provided in BackgroundWorkerInitializeConnection()
and BackgroundWorkerInitializeConnectionByOid() to lack login authorization.

In InitPostgres(), in case of a background worker, authentication is not performed
(PerformAuthentication() is not called), so having the role used to connect to the database
lacking login authorization seems to make sense.

With this new flag in place, one could give "high" privileges to the role used to initialize
the background workers connections without any risk of seeing this role being used by a
"normal user" to login.

The attached patch:

- adds the new flag
- adds documentation
- adds testing

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Thu, Sep 28, 2023 at 02:37:02PM +0200, Drouvot, Bertrand wrote:
> This patch allows the role provided in BackgroundWorkerInitializeConnection()
> and BackgroundWorkerInitializeConnectionByOid() to lack login authorization.

Interesting.  Yes, there would be use cases for that, I suppose.

> +             uint32 flags,
>               char *out_dbname)
>  {

This may be more adapted with a bits32 for the flags.

> +# Ask the background workers to connect with this role with the flag in place.
> +$node->append_conf(
> +    'postgresql.conf', q{
> +worker_spi.role = 'nologrole'
> +worker_spi.bypass_login_check = true
> +});
> +$node->restart;
> +
> +# An error message should not be issued.
> +ok( !$node->log_contains(
> +        "role \"nologrole\" is not permitted to log in", $log_start),
> +    "nologrole allowed to connect if BGWORKER_BYPASS_ROLELOGINCHECK is set");
> +
>  done_testing();

It would be cheaper to use a dynamic background worker for such tests.
Something that I've been tempted to do in this module is to extend the
amount of data that's given to bgw_main_arg when launching a worker
with worker_spi_launch().  How about extending the SQL function so as
it is possible to give in input a role name (or a regrole), a database
name (or a database OID) and a text[] for the flags?  This would
require a bit more refactoring, but this would be benefitial to show
or one can pass down a full structure from the registration to the
main() routine.  On top of that, it would make the addition of the new
GUCs worker_spi.bypass_login_check and worker_spi.role unnecessary.

> +# return the size of logfile of $node in bytes
> +sub get_log_size
> +{
> +    my ($node) = @_;
> +
> +    return (stat $node->logfile)[7];
> +}

Just use -s here.  See other tests that want to check the contents of
the logs from an offset.

> - * Allow bypassing datallowconn restrictions when connecting to database
> + * Allow bypassing datallowconn restrictions and login check when connecting
> + * to database
>   */
> -#define BGWORKER_BYPASS_ALLOWCONN 1
> +#define BGWORKER_BYPASS_ALLOWCONN 0x0001
> +#define BGWORKER_BYPASS_ROLELOGINCHECK 0x0002

The structure of the patch is inconsistent.  These flags are in
bgworker.h, but they are used also by InitPostgres().  Perhaps a
second boolean flag would be OK rather than a second set of flags for
InitPostgres() mapping with the bgworker set.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi,

On 9/29/23 8:19 AM, Michael Paquier wrote:
> On Thu, Sep 28, 2023 at 02:37:02PM +0200, Drouvot, Bertrand wrote:
>> This patch allows the role provided in BackgroundWorkerInitializeConnection()
>> and BackgroundWorkerInitializeConnectionByOid() to lack login authorization.
> 
> Interesting.  Yes, there would be use cases for that, I suppose.

Thanks for looking at it!

> 
>> +             uint32 flags,
>>                char *out_dbname)
>>   {
> 
> This may be more adapted with a bits32 for the flags.

Done that way in v2 attached.

> 
>> +# Ask the background workers to connect with this role with the flag in place.
>> +$node->append_conf(
>> +    'postgresql.conf', q{
>> +worker_spi.role = 'nologrole'
>> +worker_spi.bypass_login_check = true
>> +});
>> +$node->restart;
>> +
>> +# An error message should not be issued.
>> +ok( !$node->log_contains(
>> +        "role \"nologrole\" is not permitted to log in", $log_start),
>> +    "nologrole allowed to connect if BGWORKER_BYPASS_ROLELOGINCHECK is set");
>> +
>>   done_testing();
> 
> It would be cheaper to use a dynamic background worker for such tests.
> Something that I've been tempted to do in this module is to extend the
> amount of data that's given to bgw_main_arg when launching a worker
> with worker_spi_launch().  How about extending the SQL function so as
> it is possible to give in input a role name (or a regrole), a database
> name (or a database OID) and a text[] for the flags?  This would
> require a bit more refactoring, but this would be benefitial to show
> or one can pass down a full structure from the registration to the
> main() routine.  On top of that, it would make the addition of the new
> GUCs worker_spi.bypass_login_check and worker_spi.role unnecessary.
> 

I think that would make sense to have more flexibility in the worker_spi
module. I think that could be done in a dedicated patch though. I think it makes
more sense to have the current patch "focusing" on this new flag (while adding a test
about it without too much refactoring). What about doing the worker_spi module
re-factoring as a follow up of this one?

>> +# return the size of logfile of $node in bytes
>> +sub get_log_size
>> +{
>> +    my ($node) = @_;
>> +
>> +    return (stat $node->logfile)[7];
>> +}
> 
> Just use -s here.

Done in v2 attached.

> See other tests that want to check the contents of
> the logs from an offset.
> 

Oh right, worth to modify 019_replslot_limit.pl, 002_corrupted.pl and
001_pg_controldata.pl in a separate patch for consistency? (they are using
"(stat $node->logfile)[7]" or "(stat($pg_control))[7]").

>> - * Allow bypassing datallowconn restrictions when connecting to database
>> + * Allow bypassing datallowconn restrictions and login check when connecting
>> + * to database
>>    */
>> -#define BGWORKER_BYPASS_ALLOWCONN 1
>> +#define BGWORKER_BYPASS_ALLOWCONN 0x0001
>> +#define BGWORKER_BYPASS_ROLELOGINCHECK 0x0002
> 
> The structure of the patch is inconsistent.  These flags are in
> bgworker.h, but they are used also by InitPostgres().  Perhaps a
> second boolean flag would be OK rather than a second set of flags for
> InitPostgres() mapping with the bgworker set.

I did not want initially to add an extra parameter to InitPostgres() but
I agree it would make more sense. So done that way in v2.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Mon, Oct 02, 2023 at 10:01:04AM +0200, Drouvot, Bertrand wrote:
> I think that would make sense to have more flexibility in the worker_spi
> module. I think that could be done in a dedicated patch though. I
> think it makes more sense to have the current patch "focusing" on
> this new flag (while adding a test about it without too much
> refactoring). What about doing the worker_spi module  re-factoring
> as a follow up of this one?

I would do that first, as that's what I usually do, but I see also
your point that this is not mandatory.  If you want, I could give it a
shot tomorrow to see where it leads.

> Oh right, worth to modify 019_replslot_limit.pl, 002_corrupted.pl and
> 001_pg_controldata.pl in a separate patch for consistency? (they are using
> "(stat $node->logfile)[7]" or "(stat($pg_control))[7]").

Indeed, that's strange.  Let's remove the dependency to stat here.
The other solution is slightly more elegant IMO, as we don't rely on
the position of the result from stat().
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi,

On 10/2/23 10:17 AM, Michael Paquier wrote:
> On Mon, Oct 02, 2023 at 10:01:04AM +0200, Drouvot, Bertrand wrote:
>> I think that would make sense to have more flexibility in the worker_spi
>> module. I think that could be done in a dedicated patch though. I
>> think it makes more sense to have the current patch "focusing" on
>> this new flag (while adding a test about it without too much
>> refactoring). What about doing the worker_spi module  re-factoring
>> as a follow up of this one?
> 
> I would do that first, as that's what I usually do,

The reason I was thinking not doing that first is that there is no real use
case in the current worker_spi module test.

> but I see also
> your point that this is not mandatory.  If you want, I could give it a
> shot tomorrow to see where it leads.

Oh yeah that would be great (and maybe you already see a use case in the
current test). Thanks!

>> Oh right, worth to modify 019_replslot_limit.pl, 002_corrupted.pl and
>> 001_pg_controldata.pl in a separate patch for consistency? (they are using
>> "(stat $node->logfile)[7]" or "(stat($pg_control))[7]").
> 
> Indeed, that's strange.  Let's remove the dependency to stat here.
> The other solution is slightly more elegant IMO, as we don't rely on
> the position of the result from stat().

Agree, I will propose a new patch for this.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Bharath Rupireddy
Дата:
On Mon, Oct 2, 2023 at 4:58 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> On 9/29/23 8:19 AM, Michael Paquier wrote:
> > On Thu, Sep 28, 2023 at 02:37:02PM +0200, Drouvot, Bertrand wrote:
> >> This patch allows the role provided in BackgroundWorkerInitializeConnection()
> >> and BackgroundWorkerInitializeConnectionByOid() to lack login authorization.
> >
> > Interesting.  Yes, there would be use cases for that, I suppose.

Correct. It allows the roles that don't have LOGIN capabilities to
start and use bg workers.

> > This may be more adapted with a bits32 for the flags.
>
> Done that way in v2 attached.

While I like the idea of the flag to skip login checks for bg workers,
I don't quite like the APIs being changes InitializeSessionUserId and
InitPostgres (adding a new input parameter),
BackgroundWorkerInitializeConnection and
BackgroundWorkerInitializeConnectionByOid (changing of input parameter
type) given that all of these functions are available for external
modules and will break things for sure.

What if BGWORKER_BYPASS_ROLELOGINCHECK be part of bgw_flags? With
this, none of the API needs to be changed, so no compatibility
problems as such for external modules and the InitializeSessionUserId
can just do something like [1]. We might be tempted to add
BGWORKER_BYPASS_ALLOWCONN also to bgw_flags, but I'd prefer not to do
it for the same compatibility reasons.

Thoughts?

[1]
diff --git a/src/backend/utils/init/miscinit.c
b/src/backend/utils/init/miscinit.c
index 1e671c560c..27dcf052ab 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -786,10 +786,17 @@ InitializeSessionUserId(const char *rolename, Oid roleid)
         */
        if (IsUnderPostmaster)
        {
+               bool    skip_check = false;
+
+               /* If asked, skip the role login check for background
workers. */
+               if (IsBackgroundWorker &&
+                       (MyBgworkerEntry->bgw_flags &
BGWORKER_BYPASS_ROLELOGINCHECK) != 0)
+               skip_check = true;
+
                /*
                 * Is role allowed to login at all?
                 */
-               if (!rform->rolcanlogin)
+               if (!skip_check && !rform->rolcanlogin)
                        ereport(FATAL,

(errcode(ERRCODE_INVALID_AUTHORIZATION_SPECIFICATION),
                                         errmsg("role \"%s\" is not
permitted to log in",

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi,

On 10/3/23 11:21 AM, Bharath Rupireddy wrote:
> On Mon, Oct 2, 2023 at 4:58 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>>
>> On 9/29/23 8:19 AM, Michael Paquier wrote:
>>> On Thu, Sep 28, 2023 at 02:37:02PM +0200, Drouvot, Bertrand wrote:
>>>> This patch allows the role provided in BackgroundWorkerInitializeConnection()
>>>> and BackgroundWorkerInitializeConnectionByOid() to lack login authorization.
>>>
>>> Interesting.  Yes, there would be use cases for that, I suppose.
> 
> Correct. It allows the roles that don't have LOGIN capabilities to
> start and use bg workers.
> 
>>> This may be more adapted with a bits32 for the flags.
>>
>> Done that way in v2 attached.
> 
> While I like the idea of the flag to skip login checks for bg workers,
> I don't quite like the APIs being changes InitializeSessionUserId and
> InitPostgres (adding a new input parameter),
> BackgroundWorkerInitializeConnection and
> BackgroundWorkerInitializeConnectionByOid (changing of input parameter
> type) given that all of these functions are available for external
> modules and will break things for sure.
> 
> What if BGWORKER_BYPASS_ROLELOGINCHECK be part of bgw_flags? With
> this, none of the API needs to be changed, so no compatibility
> problems as such for external modules and the InitializeSessionUserId
> can just do something like [1]. We might be tempted to add
> BGWORKER_BYPASS_ALLOWCONN also to bgw_flags, but I'd prefer not to do
> it for the same compatibility reasons.
> 
> Thoughts?
> 

Thanks for looking at it!

I did some research and BGWORKER_BYPASS_ALLOWCONN has been added in eed1ce72e1 and
at that time the bgw_flags did already exist.

In this the related thread [1], Tom mentioned:

"
We change exported APIs in new major versions all the time.  As
long as it's just a question of an added parameter, people can deal
with it.
"

And I agree with that.

Now, I understand your point but it looks to me that bgw_flags is more
about the capabilities (Access to shared memory with BGWORKER_SHMEM_ACCESS
or ability to establish database connection with BGWORKER_BACKEND_DATABASE_CONNECTION),

While with BGWORKER_BYPASS_ROLELOGINCHECK (and BGWORKER_BYPASS_ALLOWCONN) it's more related to
the BGW behavior once the capability is in place.

So, I think I'm fine with the current proposal and don't see the need to move
BGWORKER_BYPASS_ROLELOGINCHECK in bgw_flags.

[1]: https://www.postgresql.org/message-id/22769.1519323861%40sss.pgh.pa.us

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Bharath Rupireddy
Дата:
On Tue, Oct 3, 2023 at 5:45 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> > While I like the idea of the flag to skip login checks for bg workers,
> > I don't quite like the APIs being changes InitializeSessionUserId and
> > InitPostgres (adding a new input parameter),
> > BackgroundWorkerInitializeConnection and
> > BackgroundWorkerInitializeConnectionByOid (changing of input parameter
> > type) given that all of these functions are available for external
> > modules and will break things for sure.
> >
> > What if BGWORKER_BYPASS_ROLELOGINCHECK be part of bgw_flags? With
> > this, none of the API needs to be changed, so no compatibility
> > problems as such for external modules and the InitializeSessionUserId
> > can just do something like [1]. We might be tempted to add
> > BGWORKER_BYPASS_ALLOWCONN also to bgw_flags, but I'd prefer not to do
> > it for the same compatibility reasons.
> >
> > Thoughts?
> >
>
> Thanks for looking at it!
>
> I did some research and BGWORKER_BYPASS_ALLOWCONN has been added in eed1ce72e1 and
> at that time the bgw_flags did already exist.
>
> In this the related thread [1], Tom mentioned:
>
> "
> We change exported APIs in new major versions all the time.  As
> long as it's just a question of an added parameter, people can deal
> with it.
> "

It doesn't have to be always/all the time. If the case here is okay to
change the bgw and other core functions API, I honestly feel that we
must move BGWORKER_BYPASS_ALLOWCONN to bgw_flags.

> Now, I understand your point but it looks to me that bgw_flags is more
> about the capabilities (Access to shared memory with BGWORKER_SHMEM_ACCESS
> or ability to establish database connection with BGWORKER_BACKEND_DATABASE_CONNECTION),
>
> While with BGWORKER_BYPASS_ROLELOGINCHECK (and BGWORKER_BYPASS_ALLOWCONN) it's more related to
> the BGW behavior once the capability is in place.

I look at the new flag as a capability of the bgw to connect with a
role without login access. IMV, all are the same.

> So, I think I'm fine with the current proposal and don't see the need to move
> BGWORKER_BYPASS_ROLELOGINCHECK in bgw_flags.
>
> [1]: https://www.postgresql.org/message-id/22769.1519323861%40sss.pgh.pa.us

I prefer to have it as bgw_flag, however, let's hear from others.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Tue, Oct 03, 2023 at 07:02:11PM +0530, Bharath Rupireddy wrote:
> On Tue, Oct 3, 2023 at 5:45 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>> I did some research and BGWORKER_BYPASS_ALLOWCONN has been added in eed1ce72e1 and
>> at that time the bgw_flags did already exist.
>>
>> In this the related thread [1], Tom mentioned:
>>
>> "
>> We change exported APIs in new major versions all the time.  As
>> long as it's just a question of an added parameter, people can deal
>> with it.
>> "
>
> It doesn't have to be always/all the time. If the case here is okay to
> change the bgw and other core functions API, I honestly feel that we
> must move BGWORKER_BYPASS_ALLOWCONN to bgw_flags.

I don't agree with this point.  BackgroundWorkerInitializeConnection()
and its other friend are designed to be called at the beginning of the
main routine of a background worker, where bgw_flags is not accessible.
There is much more happening before a bgworker initializes its
connection, like signal handling and definitions of other states that
depend on the GUCs loaded for the bgworker.

>> Now, I understand your point but it looks to me that bgw_flags is more
>> about the capabilities (Access to shared memory with BGWORKER_SHMEM_ACCESS
>> or ability to establish database connection with BGWORKER_BACKEND_DATABASE_CONNECTION),
>>
>> While with BGWORKER_BYPASS_ROLELOGINCHECK (and BGWORKER_BYPASS_ALLOWCONN) it's more related to
>> the BGW behavior once the capability is in place.
>
> I look at the new flag as a capability of the bgw to connect with a
> role without login access. IMV, all are the same.

Bertrand is arguing that the current code with its current split is
OK, because both are different concepts:
- bgw_flags is used by the postmaster to control how to launch the
bgworkers.
- The BGWORKER_* flags are used by the bgworkers themselves, once
things are set up by the postmaster based on bgw_flags.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Mon, Oct 02, 2023 at 10:53:22AM +0200, Drouvot, Bertrand wrote:
> On 10/2/23 10:17 AM, Michael Paquier wrote:
>> On Mon, Oct 02, 2023 at 10:01:04AM +0200, Drouvot, Bertrand wrote:
>>> I think that would make sense to have more flexibility in the worker_spi
>>> module. I think that could be done in a dedicated patch though. I
>>> think it makes more sense to have the current patch "focusing" on
>>> this new flag (while adding a test about it without too much
>>> refactoring). What about doing the worker_spi module  re-factoring
>>> as a follow up of this one?
>>
>> I would do that first, as that's what I usually do,
>
> The reason I was thinking not doing that first is that there is no real use
> case in the current worker_spi module test.

As a template, improving and extending it seems worth it to me as long
as it can also improve tests.

> > but I see also
> > your point that this is not mandatory.  If you want, I could give it a
> > shot tomorrow to see where it leads.
>
> Oh yeah that would be great (and maybe you already see a use case in the
> current test). Thanks!

Yes, while it shows a bit more what one can achieve with bgw_extra, it
also helps in improving the test coverage with starting dynamic
workers across defined databases and roles through a SQL function.

It took me a bit longer than I expected, but here is what I finish
with:
- 0001 extends worker_spi to be able to pass down database and role
IDs for the workers to start, through MyBgworkerEntry->bgw_extra.
Perhaps the two new arguments of worker_spi_launch() should use
InvalidOid as default, actually, to fall back to the database and
roles defined by the GUCs in these cases.  That would be two lines to
change in worker_spi--1.0.sql.
- 0002 is your patch, on top of which I have added handling for the
flags in the launch() function with a text[].  The tests get much
simpler, and don't need restarts.

By the way, I am pretty sure that we are going to need a wait phase
after the startup of the worker in the case where "nologrole" is not
allowed to log in even with the original patch: the worker may not
have started at the point where we check the logs.  That's true as
well when involving worker_spi_launch().

What do you think?
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi,

On 10/4/23 8:20 AM, Michael Paquier wrote:
> On Mon, Oct 02, 2023 at 10:53:22AM +0200, Drouvot, Bertrand wrote:
>> On 10/2/23 10:17 AM, Michael Paquier wrote:
>>> On Mon, Oct 02, 2023 at 10:01:04AM +0200, Drouvot, Bertrand wrote:
>>>> I think that would make sense to have more flexibility in the worker_spi
>>>> module. I think that could be done in a dedicated patch though. I
>>>> think it makes more sense to have the current patch "focusing" on
>>>> this new flag (while adding a test about it without too much
>>>> refactoring). What about doing the worker_spi module  re-factoring
>>>> as a follow up of this one?
>>>
>>> I would do that first, as that's what I usually do,
>>
>> The reason I was thinking not doing that first is that there is no real use
>> case in the current worker_spi module test.
> 
> As a template, improving and extending it seems worth it to me as long
> as it can also improve tests.
> 
>>> but I see also
>>> your point that this is not mandatory.  If you want, I could give it a
>>> shot tomorrow to see where it leads.
>>
>> Oh yeah that would be great (and maybe you already see a use case in the
>> current test). Thanks!
> 
> Yes, while it shows a bit more what one can achieve with bgw_extra, it
> also helps in improving the test coverage with starting dynamic
> workers across defined databases and roles through a SQL function.
> 

Yeah right.

> It took me a bit longer than I expected, 

Thanks for having looked at it!

> but here is what I finish
> with:
> - 0001 extends worker_spi to be able to pass down database and role
> IDs for the workers to start, through MyBgworkerEntry->bgw_extra.
> Perhaps the two new arguments of worker_spi_launch() should use
> InvalidOid as default, actually, to fall back to the database and
> roles defined by the GUCs in these cases. 

I'm fine with the way it's currently done in 0001 and that sounds
more logical to me. I mean we don't "really" want InvalidOid but to fall
back to the GUCs.

Just a remark here:

+       if (!OidIsValid(roleoid))
+       {
+               /*
+                * worker_spi_role is NULL by default, so just pass down an invalid
+                * OID to let the main() routine do its connection work.
+                */
+               if (worker_spi_role)
+                       roleoid = get_role_oid(worker_spi_role, false);
+               else
+                       roleoid = InvalidOid;

the

+               else
+                       roleoid = InvalidOid;

I think it is not needed as we're already in "!OidIsValid(roleoid)".

> - 0002 is your patch, on top of which I have added handling for the
> flags in the launch() function with a text[].  The tests get much
> simpler, and don't need restarts.
> 

Yeah, agree that's better.

> By the way, I am pretty sure that we are going to need a wait phase
> after the startup of the worker in the case where "nologrole" is not
> allowed to log in even with the original patch: the worker may not
> have started at the point where we check the logs.

I agree and it's now failing on my side.
I added this "wait" in v4-0002 attach and it's now working fine.

Please note there is more changes than adding this wait in 001_worker_spi.pl (as compare
to v3-0002) as I ran pgperltidy on it.
FWIW, the new "wait" is just the part related to "nb_errors".

> What do you think?

Except the Nit that I mentioned in 0001, that looks all good to me (with the
new wait in 001_worker_spi.pl).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Wed, Oct 04, 2023 at 12:54:24PM +0200, Drouvot, Bertrand wrote:
> Except the Nit that I mentioned in 0001, that looks all good to me (with the
> new wait in 001_worker_spi.pl).

Thanks, I've applied the refactoring, including in it the stuff to be
able to control the flags used when launching a dynamic worker.  0001
also needed some indenting.  You'll notice that the diffs in
worker_spi are minimal now.  worker_spi_main is no more, as an effect
of..  Cough..  c8e318b1b.

+# An error message should be issued.
+my $nb_errors = 0;
+for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
+{
+    if ($node->log_contains(
+            "role \"nologrole\" is not permitted to log in", $log_start))
+    {
+        $nb_errors = 1;
+        last;
+    }
+    usleep(100_000);
+}

This can be switched to use $node->wait_for_log, making the test
simpler.  No need for Time::HiRes, either.

-extern void BackgroundWorkerInitializeConnection(const char *dbname,
const char *username, uint32 flags);
+extern void BackgroundWorkerInitializeConnection(const char *dbname,
const char *username, bits32 flags);
[...]
-BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid,
uint32 flags)
+BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid,
bits32 flags)

That's changing signatures just for the sake of breaking them.  I
would leave that alone, I guess..

@@ -719,6 +719,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
              const char *username, Oid useroid,
              bool load_session_libraries,
              bool override_allow_connections,
+             bool bypass_login_check,
              char *out_dbname)

I was not paying much attention here, but load_session_libraries gives
a good argument in favor of switching all these booleans to a single
bits32 argument as that would make three of them now, with a different
set of flags than the bgworker ones.  This can be refactored on its
own.

-#define BGWORKER_BYPASS_ALLOWCONN 1
+#define BGWORKER_BYPASS_ALLOWCONN 0x0001

Can be a change of its own as well.

While looking at the refactoring at worker_spi.  I've noticed that it
would be simple and cheap to have some coverage for BYPASS_ALLOWCONN,
something we've never done since this has been introduced.  Let me
suggest 0001 to add some coverage.

0002 is your own patch, with the tests simplified a bit.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi,

On 10/5/23 7:10 AM, Michael Paquier wrote:
> On Wed, Oct 04, 2023 at 12:54:24PM +0200, Drouvot, Bertrand wrote:
>> Except the Nit that I mentioned in 0001, that looks all good to me (with the
>> new wait in 001_worker_spi.pl).
> 
> Thanks, I've applied the refactoring, including in it the stuff to be
> able to control the flags used when launching a dynamic worker.  0001
> also needed some indenting.  You'll notice that the diffs in
> worker_spi are minimal now.  worker_spi_main is no more, as an effect
> of..  Cough..  c8e318b1b.

Thanks!

> +# An error message should be issued.
> +my $nb_errors = 0;
> +for (my $i = 0; $i < 10 * $PostgreSQL::Test::Utils::timeout_default; $i++)
> +{
> +    if ($node->log_contains(
> +            "role \"nologrole\" is not permitted to log in", $log_start))
> +    {
> +        $nb_errors = 1;
> +        last;
> +    }
> +    usleep(100_000);
> +}
> 
> This can be switched to use $node->wait_for_log, making the test
> simpler.  No need for Time::HiRes, either.
> 

Oh, thanks, did not know about $node->wait_for_log, good to know!

> -extern void BackgroundWorkerInitializeConnection(const char *dbname,
> const char *username, uint32 flags);
> +extern void BackgroundWorkerInitializeConnection(const char *dbname,
> const char *username, bits32 flags);
> [...]
> -BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid,
> uint32 flags)
> +BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid,
> bits32 flags)
> 
> That's changing signatures just for the sake of breaking them.  I
> would leave that alone, I guess..

Ok, switched back to uint32 in v6-0002 attached (v6-0001 is your v5-0001
unchanged).

> 
> @@ -719,6 +719,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
>                const char *username, Oid useroid,
>                bool load_session_libraries,
>                bool override_allow_connections,
> +             bool bypass_login_check,
>                char *out_dbname)
> 
> I was not paying much attention here, but load_session_libraries gives
> a good argument in favor of switching all these booleans to a single
> bits32 argument as that would make three of them now, with a different
> set of flags than the bgworker ones.  This can be refactored on its
> own.

Yeah good point, will work on it once the current one is committed.

> 
> -#define BGWORKER_BYPASS_ALLOWCONN 1
> +#define BGWORKER_BYPASS_ALLOWCONN 0x0001
> 
> Can be a change of its own as well.

Yeah, but I think it's simple enough to just keep this change here
(and I don't think it's "really" needed without introducing 0x0002)

> 
> While looking at the refactoring at worker_spi.  I've noticed that it
> would be simple and cheap to have some coverage for BYPASS_ALLOWCONN,
> something we've never done since this has been introduced.  Let me
> suggest 0001 to add some coverage.

Good idea! I looked at 0001 and it looks ok to me.

> 
> 0002 is your own patch, with the tests simplified a bit.

Thanks, LGTM.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Bharath Rupireddy
Дата:
On Thu, Oct 5, 2023 at 12:22 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> >
> > @@ -719,6 +719,7 @@ InitPostgres(const char *in_dbname, Oid dboid,
> >                        const char *username, Oid useroid,
> >                        bool load_session_libraries,
> >                        bool override_allow_connections,
> > +                      bool bypass_login_check,
> >                        char *out_dbname)
> >
> > I was not paying much attention here, but load_session_libraries gives
> > a good argument in favor of switching all these booleans to a single
> > bits32 argument as that would make three of them now, with a different
> > set of flags than the bgworker ones.  This can be refactored on its
> > own.
>
> Yeah good point, will work on it once the current one is committed.
>
> >
> > -#define BGWORKER_BYPASS_ALLOWCONN 1
> > +#define BGWORKER_BYPASS_ALLOWCONN 0x0001
> >
> > Can be a change of its own as well.
>
> Yeah, but I think it's simple enough to just keep this change here
> (and I don't think it's "really" needed without introducing 0x0002)

I think changing BGWORKER_BYPASS_ALLOWCONN to 0x0001 and having bits32
for InitPostgres inputs load_session_libraries and
override_allow_connections can be 0001 in this patch series so that it
can go first, no? This avoids the new code being in the old format and
changing things right after it commits.

v6-0001 LGTM.

A comment on v6-0002:
1.
+  CREATE ROLE nologrole with nologin;
+  ALTER ROLE nologrole with superuser;
+]);
We don't need superuser privileges here, do we? Or do we need it for
the worker_spi to access pg_catalog and stuff in worker_spi_main? If
not, can we remove it to showcase non-superusers requesting bg
workers?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi,

On 10/5/23 2:21 PM, Bharath Rupireddy wrote:
> On Thu, Oct 5, 2023 at 12:22 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>>
> A comment on v6-0002:
> 1.
> +  CREATE ROLE nologrole with nologin;
> +  ALTER ROLE nologrole with superuser;
> +]);
> We don't need superuser privileges here, do we? Or do we need it for
> the worker_spi to access pg_catalog and stuff in worker_spi_main? If
> not, can we remove it to showcase non-superusers requesting bg
> workers?

superuser is not needed here.
I removed it but had to change it in v7 attached to:

+  CREATE ROLE nologrole with nologin;
+  GRANT CREATE ON DATABASE mydb TO nologrole;

To avoid things like:

"
2023-10-05 15:59:39.189 UTC [2830732] LOG:  worker_spi dynamic worker 13 initialized with schema13.counted
2023-10-05 15:59:39.191 UTC [2830732] ERROR:  permission denied for database mydb
2023-10-05 15:59:39.191 UTC [2830732] CONTEXT:  SQL statement "CREATE SCHEMA "schema13" CREATE TABLE "counted"
"

Regards,
  
-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Bharath Rupireddy
Дата:
On Thu, Oct 5, 2023 at 9:32 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> +  CREATE ROLE nologrole with nologin;
> +  GRANT CREATE ON DATABASE mydb TO nologrole;

A few nit-picks:

1. s/with/WITH
2. s/nologin/NOLOGIN
3. +   is specified as <varname>flags</varname> it is possible to
bypass the login check to connect to databases.
How about "it is possible to bypass the login check for the role used
to connect to databases."?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Nathan Bossart
Дата:
On Thu, Oct 05, 2023 at 05:51:31PM +0530, Bharath Rupireddy wrote:
> I think changing BGWORKER_BYPASS_ALLOWCONN to 0x0001 and having bits32
> for InitPostgres inputs load_session_libraries and
> override_allow_connections can be 0001 in this patch series so that it
> can go first, no? This avoids the new code being in the old format and
> changing things right after it commits.

+1

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Thu, Oct 05, 2023 at 05:51:31PM +0530, Bharath Rupireddy wrote:
> I think changing BGWORKER_BYPASS_ALLOWCONN to 0x0001 and having bits32
> for InitPostgres inputs load_session_libraries and
> override_allow_connections can be 0001 in this patch series so that it
> can go first, no? This avoids the new code being in the old format and
> changing things right after it commits.

I am not completely sure what you mean here.  We've never supported
load_session_libraries for background workers, and I'm not quite sure
that there is a case for it.  FWIW, my idea around that would be to
have two separate sets of flags: one set for the bgworkers and one set
for PostgresInit() as an effect of load_session_libraries which looks
like a fuzzy concept for bgworkers.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Fri, Oct 06, 2023 at 09:09:10AM +0900, Michael Paquier wrote:
> I am not completely sure what you mean here.  We've never supported
> load_session_libraries for background workers, and I'm not quite sure
> that there is a case for it.  FWIW, my idea around that would be to
> have two separate sets of flags: one set for the bgworkers and one set
> for PostgresInit() as an effect of load_session_libraries which looks
> like a fuzzy concept for bgworkers.

I have applied v1 a few hours ago as of 991bb0f9653c.  Then, the
buildfarm has quickly complained that a bgworkers able to start with
BYPASS_ALLOWCONN while the database has its access restricted could
spawn workers that themselves try to connect to the database
restricted, causing the test to fail.  I have applied fd4d93d269c0 as
a quick way to avoid the spawn of workers in this case, and the
buildfarm has turned back to green.

Now, there's been a second type failure on serinus even after all
that:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2023-10-06%2001%3A04%3A05

The step running a `make check` on worker_spi in the run has worked:

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=serinus&dt=2023-10-06%2001%3A04%3A05&stg=module-worker_spi-check

But the follow-up step doing an installcheck with worker_spi has not.
And this looks like a crash:

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=serinus&dt=2023-10-06%2001%3A04%3A05&stg=testmodules-install-check-C

The logs reported by this step are not really helpful, as they contain
only information about the sql/ tests in the modules.  It is the first
time that this stuff is tested, so this could be a race condition
that's been around for some time but we've never seen it until now, or
it could be an issue in the test I fail to see.

Andres, are there logs for this TAP test on serinus?  Or perhaps there
is a core file that could be looked at?  The other animals are not
showing anything for the moment.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi,

On 10/5/23 6:23 PM, Bharath Rupireddy wrote:
> On Thu, Oct 5, 2023 at 9:32 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>>
>> +  CREATE ROLE nologrole with nologin;
>> +  GRANT CREATE ON DATABASE mydb TO nologrole;
> 
> A few nit-picks:
> 
> 1. s/with/WITH
> 2. s/nologin/NOLOGIN

done in v8 attached.

> 3. +   is specified as <varname>flags</varname> it is possible to
> bypass the login check to connect to databases.
> How about "it is possible to bypass the login check for the role used
> to connect to databases."?
> 

"for the role used" sounds implicit to me but I don't have a strong opinion
about it so re-worded as per your proposal in v8.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Fri, Oct 06, 2023 at 03:29:28PM +0900, Michael Paquier wrote:
> Andres, are there logs for this TAP test on serinus?  Or perhaps there
> is a core file that could be looked at?  The other animals are not
> showing anything for the moment.

serinus has reported back once again, and just returned with a green
state, twice in a row:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2023-10-06%2007%3A42%3A53
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2023-10-06%2007%3A28%3A02

Well, it looks OK.  Still that's itching a bit.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> On Fri, Oct 06, 2023 at 03:29:28PM +0900, Michael Paquier wrote:
>> Andres, are there logs for this TAP test on serinus?  Or perhaps there
>> is a core file that could be looked at?  The other animals are not
>> showing anything for the moment.

> Well, it looks OK.  Still that's itching a bit.

There have been intermittent failures on various buildfarm machines
since this went in.  After seeing one on my own animal mamba [1],
I tried to reproduce it manually on that machine, and it does
indeed fail about one time in two.  The buildfarm script is not
managing to capture the relevant log files, but what I see in a
manual run is that 001_worker_spi.pl logs this:

...
# Postmaster PID for node "mynode" is 21897
[01:19:53.931](2.663s) ok 5 - bgworkers all launched
[01:19:54.711](0.780s) ok 6 - dynamic bgworkers all launched
error running SQL: 'psql:<stdin>:1: ERROR:  could not start background process
HINT:  More details may be available in the server log.'
while running 'psql -XAtq -d port=56393 host=/tmp/PETPK0Stwi dbname='postgres' -f - -v ON_ERROR_STOP=1' with sql
'SELECTworker_spi_launch(12, 16394, 16395);' at
/home/tgl/pgsql/src/test/modules/worker_spi/../../../../src/test/perl/PostgreSQL/Test/Cluster.pmline 2009. 
# Postmaster PID for node "mynode" is 21897
### Stopping node "mynode" using mode immediate
# Running: pg_ctl -D /home/tgl/pgsql/src/test/modules/worker_spi/tmp_check/t_001_worker_spi_mynode_data/pgdata -m
immediatestop 
waiting for server to shut down.... done
server stopped
# No postmaster PID for node "mynode"
[01:19:55.032](0.321s) # Tests were run but no plan was declared and done_testing() was not seen.
[01:19:55.035](0.002s) # Looks like your test exited with 29 just after 6.

and in the postmaster log

2023-10-08 01:19:54.265 EDT [5820] LOG:  worker_spi dynamic worker 10 initialized with schema10.counted
2023-10-08 01:19:54.378 EDT [27776] 001_worker_spi.pl LOG:  statement: SELECT worker_spi_launch(11, 5, 16395);
2023-10-08 01:19:54.476 EDT [18120] 001_worker_spi.pl LOG:  statement: SELECT datname, usename, wait_event FROM
pg_stat_activity
                    WHERE backend_type = 'worker_spi dynamic' AND
                    pid IN (5820, 428) ORDER BY datname;
2023-10-08 01:19:54.548 EDT [428] LOG:  worker_spi dynamic worker 11 initialized with schema11.counted
2023-10-08 01:19:54.680 EDT [152] 001_worker_spi.pl LOG:  statement: SELECT datname, usename, wait_event FROM
pg_stat_activity
                    WHERE backend_type = 'worker_spi dynamic' AND
                    pid IN (5820, 428) ORDER BY datname;
2023-10-08 01:19:54.779 EDT [1675] 001_worker_spi.pl LOG:  statement: ALTER DATABASE mydb ALLOW_CONNECTIONS false;
2023-10-08 01:19:54.854 EDT [26562] 001_worker_spi.pl LOG:  statement: SELECT worker_spi_launch(12, 16394, 16395);
2023-10-08 01:19:54.878 EDT [23636] FATAL:  database "mydb" is not currently accepting connections
2023-10-08 01:19:54.888 EDT [21897] LOG:  background worker "worker_spi dynamic" (PID 23636) exited with exit code 1
2023-10-08 01:19:54.888 EDT [26562] 001_worker_spi.pl ERROR:  could not start background process
2023-10-08 01:19:54.888 EDT [26562] 001_worker_spi.pl HINT:  More details may be available in the server log.
2023-10-08 01:19:54.888 EDT [26562] 001_worker_spi.pl STATEMENT:  SELECT worker_spi_launch(12, 16394, 16395);
2023-10-08 01:19:54.912 EDT [21897] LOG:  received immediate shutdown request
2023-10-08 01:19:55.014 EDT [21897] LOG:  database system is shut down

What it looks like to me is that there is a code path by which "could
not start background process" is reported as a failure of the SELECT
worker_spi_launch() query itself.  The test script is not expecting
that, because it executes that query with

# bgworker cannot be launched with connection restriction.
my $worker3_pid = $node->safe_psql('postgres',
   qq[SELECT worker_spi_launch(12, $mydb_id, $myrole_id);]);
$node->wait_for_log(
   qr/database "mydb" is not currently accepting connections/, $log_offset);

so safe_psql bails out and we get no further.

Since this only seems to happen on slow machines, I'd call it a timing
problem or race condition.  Unless you want to argue that the race
should not happen, probably the fix is to make the test script cope
with this worker_spi_launch() call failing.  As long as we see the
expected result from wait_for_log, we can be pretty sure the right
thing happened.

            regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2023-10-08%2001%3A00%3A22



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Sun, Oct 08, 2023 at 05:48:55PM -0400, Tom Lane wrote:
> There have been intermittent failures on various buildfarm machines
> since this went in.  After seeing one on my own animal mamba [1],
> I tried to reproduce it manually on that machine, and it does
> indeed fail about one time in two.  The buildfarm script is not
> managing to capture the relevant log files, but what I see in a
> manual run is that 001_worker_spi.pl logs this:

Thanks for the logs, I've noticed the failure but could not make any
sense of it based on the lack of information provided from the
buildfarm.  Serinus has complained once, for instance.

> Since this only seems to happen on slow machines, I'd call it a timing
> problem or race condition.  Unless you want to argue that the race
> should not happen, probably the fix is to make the test script cope
> with this worker_spi_launch() call failing.  As long as we see the
> expected result from wait_for_log, we can be pretty sure the right
> thing happened.

The trick to reproduce the failure is to slow down worker_spi_launch()
before WaitForBackgroundWorkerStartup() with a worker already
registered so as the worker has the time to start and exit because of
the ALLOW_CONNECTIONS restriction.  (SendPostmasterSignal() in
RegisterDynamicBackgroundWorker() interrupts a hardcoded sleep, so
I've just used an on-disk flag.)

Another thing is that we cannot rely on the PID returned by launch()
as it could fail, so $worker3_pid needs to disappear.  If we do that,
I'd rather just switch to a specific database for the tests with
ALLOWCONN rather than reuse "mydb" that could have other workers.  The
attached fixes the issue for me.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> Another thing is that we cannot rely on the PID returned by launch()
> as it could fail, so $worker3_pid needs to disappear.  If we do that,
> I'd rather just switch to a specific database for the tests with
> ALLOWCONN rather than reuse "mydb" that could have other workers.

Right.

> The
> attached fixes the issue for me.

Hmm.  This passed a few dozen test cycles on mamba's host,
but it seems to me there's still a race condition here:

$result = $node->safe_psql('postgres',
    "SELECT count(*) FROM pg_stat_activity WHERE datname = 'noconndb';");
is($result, '0', 'dynamic bgworker without BYPASS_ALLOWCONN not started');

There will be a window where the worker has logged "database
"noconndb" is not currently accepting connections" but hasn't yet
exited, so that conceivably this query could see a positive count.

We could just drop this test, reasoning that the appearance of
the error message is sufficient evidence that the right thing
happened.  (If the failed worker is still around, it won't break
the remaining tests AFAICS.)  Or we could convert this to a
poll_query_until() loop.

            regards, tom lane



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Mon, Oct 09, 2023 at 12:20:18PM -0400, Tom Lane wrote:
> There will be a window where the worker has logged "database
> "noconndb" is not currently accepting connections" but hasn't yet
> exited, so that conceivably this query could see a positive count.

I don't think that's possible here.  The check on datallowconn is done
before a backend calls pgstat_bestart() which would make its backend
entry reported to pg_stat_activity.  So there is no window where a
backend would be in pg_stat_activity if this check fails.

> We could just drop this test, reasoning that the appearance of
> the error message is sufficient evidence that the right thing
> happened.  (If the failed worker is still around, it won't break
> the remaining tests AFAICS.)  Or we could convert this to a
> poll_query_until() loop.

Saying that, I'm OK with just dropping this query, as it could also be
possible that one decides that calling pgstat_bestart() before the
datallowconn check is a good idea for a reason or another.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> On Mon, Oct 09, 2023 at 12:20:18PM -0400, Tom Lane wrote:
>> There will be a window where the worker has logged "database
>> "noconndb" is not currently accepting connections" but hasn't yet
>> exited, so that conceivably this query could see a positive count.

> I don't think that's possible here.  The check on datallowconn is done
> before a backend calls pgstat_bestart() which would make its backend
> entry reported to pg_stat_activity.  So there is no window where a
> backend would be in pg_stat_activity if this check fails.

Ah, right.  I complained after seeing that we set MyProc->databaseId
before doing CheckMyDatabase, but you're right that it doesn't
matter for pg_stat_activity until pgstat_bestart. 

> Saying that, I'm OK with just dropping this query, as it could also be
> possible that one decides that calling pgstat_bestart() before the
> datallowconn check is a good idea for a reason or another.

Not sure if that's a likely change or not.  However, if we're in
agreement that this test step isn't buying much, let's just drop
it and save the test cycles.

            regards, tom lane



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Mon, Oct 09, 2023 at 11:11:58PM -0400, Tom Lane wrote:
> Michael Paquier <michael@paquier.xyz> writes:
>> Saying that, I'm OK with just dropping this query, as it could also be
>> possible that one decides that calling pgstat_bestart() before the
>> datallowconn check is a good idea for a reason or another.
>
> Not sure if that's a likely change or not.  However, if we're in
> agreement that this test step isn't buying much, let's just drop
> it and save the test cycles.

No problem here.  f483b2090 has removed the query entirely, relying
now only on a wait_for_log() when the worker startup fails.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi,

On 10/6/23 8:48 AM, Drouvot, Bertrand wrote:
> Hi,
> 
> On 10/5/23 6:23 PM, Bharath Rupireddy wrote:
>> On Thu, Oct 5, 2023 at 9:32 PM Drouvot, Bertrand
>> <bertranddrouvot.pg@gmail.com> wrote:
>>>
>>> +  CREATE ROLE nologrole with nologin;
>>> +  GRANT CREATE ON DATABASE mydb TO nologrole;
>>
>> A few nit-picks:
>>
>> 1. s/with/WITH
>> 2. s/nologin/NOLOGIN
> 
> done in v8 attached.
> 
>> 3. +   is specified as <varname>flags</varname> it is possible to
>> bypass the login check to connect to databases.
>> How about "it is possible to bypass the login check for the role used
>> to connect to databases."?
>>
> 
> "for the role used" sounds implicit to me but I don't have a strong opinion
> about it so re-worded as per your proposal in v8.
> 

Please find attached v9 (v8 rebase due to f483b2090).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Tue, Oct 10, 2023 at 06:57:05AM +0200, Drouvot, Bertrand wrote:
> Please find attached v9 (v8 rebase due to f483b2090).

I was looking at v8 just before you sent this v9, and still got
annoyed by the extra boolean argument added to InitPostgres().  So
please let me propose to bite the bullet and refactor that, as of the
0001 attached that means less diff footprints in all the callers of
InitPostgres() (I am not wedded to the flag names).

It looks like 0002 had the same issues as f483b209: the worker that
could not be started because of the login restriction could be
detected as stopped by worker_spi_launch(), causing the script to fail
hard.

0002 is basically your v9, able to work with the refactoring from
0001.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi,

On 10/10/23 7:58 AM, Michael Paquier wrote:
> On Tue, Oct 10, 2023 at 06:57:05AM +0200, Drouvot, Bertrand wrote:
>> Please find attached v9 (v8 rebase due to f483b2090).
> 
> I was looking at v8 just before you sent this v9, and still got
> annoyed by the extra boolean argument added to InitPostgres(). 

Arf, I did not look at it as I had in mind to look at it once
this one is in.

> So
> please let me propose to bite the bullet and refactor that, as of the
> 0001 attached that means less diff footprints in all the callers of
> InitPostgres() (I am not wedded to the flag names).

Thanks for having looked at it!

+       bits32          init_flags = 0; /* never honor session_preload_libraries */

Also a few word about datallowconn in the comment? (as the flag deals with both).

> 
> It looks like 0002 had the same issues as f483b209: the worker that
> could not be started because of the login restriction could be
> detected as stopped by worker_spi_launch(), causing the script to fail
> hard.
> 
> 0002 is basically your v9, able to work with the refactoring from
> 0001.

Thanks!

  #define INIT_PG_OVERRIDE_ALLOW_CONNS   0x0002
+#define INIT_PG_BYPASS_ROLE_LOGIN              0x0004

Any reason why INIT_PG_BYPASS_ROLE_LOGIN is not 0x0003?

Except that it does look good to me.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:

On 10/10/23 9:12 AM, Drouvot, Bertrand wrote:
> Hi,
> Any reason why INIT_PG_BYPASS_ROLE_LOGIN is not 0x0003?
> 

Please forget about it ;-)

Regards,
-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Tue, Oct 10, 2023 at 09:18:04AM +0200, Drouvot, Bertrand wrote:
> Please forget about it ;-)

That's called an ENOCOFFEE :D
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
On 10/10/23 9:21 AM, Michael Paquier wrote:
> On Tue, Oct 10, 2023 at 09:18:04AM +0200, Drouvot, Bertrand wrote:
>> Please forget about it ;-)
> 
> That's called an ENOCOFFEE :D

Exactly, good one! ;-)

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Tue, Oct 10, 2023 at 09:12:49AM +0200, Drouvot, Bertrand wrote:
> On 10/10/23 7:58 AM, Michael Paquier wrote:
>> I was looking at v8 just before you sent this v9, and still got
>> annoyed by the extra boolean argument added to InitPostgres().
>
> Arf, I did not look at it as I had in mind to look at it once
> this one is in.

No problem.  I'm OK to do it.

>> So
>> please let me propose to bite the bullet and refactor that, as of the
>> 0001 attached that means less diff footprints in all the callers of
>> InitPostgres() (I am not wedded to the flag names).
>
> Thanks for having looked at it!
>
> +       bits32          init_flags = 0; /* never honor session_preload_libraries */
>
> Also a few word about datallowconn in the comment? (as the flag deals with both).

I am not sure that this is necessary in the code paths of
BackgroundWorkerInitializeConnectionByOid() and
BackgroundWorkerInitializeConnection() as datallowconn is handled a
few lines down.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Wed, Oct 11, 2023 at 08:26:42AM +0900, Michael Paquier wrote:
> On Tue, Oct 10, 2023 at 09:12:49AM +0200, Drouvot, Bertrand wrote:
> > On 10/10/23 7:58 AM, Michael Paquier wrote:
> >> I was looking at v8 just before you sent this v9, and still got
> >> annoyed by the extra boolean argument added to InitPostgres().
> >
> > Arf, I did not look at it as I had in mind to look at it once
> > this one is in.
>
> No problem.  I'm OK to do it.

Applied 0001 for now.

> I am not sure that this is necessary in the code paths of
> BackgroundWorkerInitializeConnectionByOid() and
> BackgroundWorkerInitializeConnection() as datallowconn is handled a
> few lines down.

 /* flags for InitPostgres() */
 #define INIT_PG_LOAD_SESSION_LIBS      0x0001
 #define INIT_PG_OVERRIDE_ALLOW_CONNS   0x0002
+#define INIT_PG_BYPASS_ROLE_LOGIN      0x0004

In 0002, I am not sure that this is the best name for this new flag.
There is consistency with the bgworker part, for sure, but shouldn't
we name that OVERRIDE_ROLE_LOGIN instead in miscadmin.h?
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi,

On 10/11/23 5:40 AM, Michael Paquier wrote:
> On Wed, Oct 11, 2023 at 08:26:42AM +0900, Michael Paquier wrote:
> 
>   /* flags for InitPostgres() */
>   #define INIT_PG_LOAD_SESSION_LIBS      0x0001
>   #define INIT_PG_OVERRIDE_ALLOW_CONNS   0x0002
> +#define INIT_PG_BYPASS_ROLE_LOGIN      0x0004
> 
> In 0002, I am not sure that this is the best name for this new flag.
> There is consistency with the bgworker part, for sure, but shouldn't
> we name that OVERRIDE_ROLE_LOGIN instead in miscadmin.h?

Yeah, agree. Changed in v12 attached.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Wed, Oct 11, 2023 at 08:48:04AM +0200, Drouvot, Bertrand wrote:
> Yeah, agree. Changed in v12 attached.

I have tweaked a few comments, and applied that.  Thanks.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
"Drouvot, Bertrand"
Дата:
Hi,

On 10/12/23 2:26 AM, Michael Paquier wrote:
> On Wed, Oct 11, 2023 at 08:48:04AM +0200, Drouvot, Bertrand wrote:
>> Yeah, agree. Changed in v12 attached.
> 
> I have tweaked a few comments, and applied that.  Thanks.

Oh and you also closed the CF entry, thanks!

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Thu, Oct 12, 2023 at 07:42:28AM +0200, Drouvot, Bertrand wrote:
> On 10/12/23 2:26 AM, Michael Paquier wrote:
>> I have tweaked a few comments, and applied that.  Thanks.
>
> Oh and you also closed the CF entry, thanks!

The buildfarm has provided some feedback, and the new tests have been
unstable on mamba:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2023-10-15%2005%3A04%3A21

In more details:
# poll_query_until timed out executing this query:
# SELECT datname, usename, wait_event FROM pg_stat_activity
#             WHERE backend_type = 'worker_spi dynamic' AND
#             pid = ;
# expecting this output:
# mydb|nologrole|WorkerSpiMain
# last actual query output:
#
# with stderr:
# ERROR:  syntax error at or near ";"
# LINE 3:             pid = ;

So this looks like a hard failure in starting the worker that should
bypass the role login check.  The logs don't offer much information,
but I think I know what's going on here: at this stage of the tests,
the number of workers created is 7, very close to the limit of
max_worker_processes, at 8 by default.  So one parallel worker spawned
by any of the other bgworkers would be enough to prevent the last one
to start, and mamba has been slow enough in the startup of the static
workers to show that this could be possible.

I think that we should just bump up max_worker_processes, like in the
attached.
--
Michael

Вложения

Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Tom Lane
Дата:
Michael Paquier <michael@paquier.xyz> writes:
> The buildfarm has provided some feedback, and the new tests have been
> unstable on mamba:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2023-10-15%2005%3A04%3A21

Yeah.  FWIW, I tried to reproduce this on mamba's host, but did not
see it again in 100 or so tries.

> In more details:
> # poll_query_until timed out executing this query:
> # SELECT datname, usename, wait_event FROM pg_stat_activity
> #             WHERE backend_type = 'worker_spi dynamic' AND
> #             pid = ;
> # with stderr:
> # ERROR:  syntax error at or near ";"
> # LINE 3:             pid = ;

> So this looks like a hard failure in starting the worker that should
> bypass the role login check.  The logs don't offer much information,
> but I think I know what's going on here: at this stage of the tests,
> the number of workers created is 7, very close to the limit of
> max_worker_processes, at 8 by default.  So one parallel worker spawned
> by any of the other bgworkers would be enough to prevent the last one
> to start, and mamba has been slow enough in the startup of the static
> workers to show that this could be possible.

I agree that that probably is the root cause, and we should fix it
by bumping up max_worker_processes in this test.

But this failure is annoying in another way.  Evidently,
worker_spi_launch returned NULL, which must have been from here:

    if (!RegisterDynamicBackgroundWorker(&worker, &handle))
        PG_RETURN_NULL();

Why in the world is that "return NULL", and not "ereport(ERROR)"
like all the other failure conditions in that function?

If there's actually a defensible reason for the C code to act
like that, then all the call sites need to have checks for
a null result.

(cc'ing Robert, as this coding originated at 090d0f205.)

            regards, tom lane



Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag

От
Michael Paquier
Дата:
On Sun, Oct 15, 2023 at 10:47:22PM -0400, Tom Lane wrote:
> I agree that that probably is the root cause, and we should fix it
> by bumping up max_worker_processes in this test.

Thanks.  I've fixed this one now.  Let's see if mamba is OK after
that.

> If there's actually a defensible reason for the C code to act
> like that, then all the call sites need to have checks for
> a null result.

We're just talking about a test module and an ERROR in the same
fashion as autoprewarm makes things more predictible for the TAP
script, IMO.
--
Michael

Вложения