Re: [PATCH] Exponential backoff for auth_delay
От | Michael Banck |
---|---|
Тема | Re: [PATCH] Exponential backoff for auth_delay |
Дата | |
Msg-id | 65e62440.170a0220.f6a7c.6f7f@mx.google.com обсуждение исходный текст |
Ответ на | Re: [PATCH] Exponential backoff for auth_delay (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Ответы |
Re: [PATCH] Exponential backoff for auth_delay
(Robert Haas <robertmhaas@gmail.com>)
Re: [PATCH] Exponential backoff for auth_delay (Nathan Bossart <nathandbossart@gmail.com>) |
Список | pgsql-hackers |
Hi, On Wed, Feb 21, 2024 at 10:26:11PM +0100, Tomas Vondra wrote: > Thanks for the patch. I took a closer look at v3, so let me share some > review comments. Please push back if you happen to disagree with some of > it, some of this is going to be more a matter of personal preference. Thanks! As my patch was based on a previous patch, some of decisions were carry-overs I am not overly attached to. > 1) I think it's a bit weird to have two options specifying amount of > time, but one is in seconds and one in milliseconds. Won't that be > unnecessarily confusing? Could we do both in milliseconds? Alright, I changed that. See below for a discussion about the GUCs in general. > 2) The C code defines the GUC as auth_delay.exponential_backoff, but the > SGML docs say <varname>auth_delay.exp_backoff</varname>. Right, an oversight from the last version where the GUC name got changed but I forgot to change the documentation, fixed. > 3) Do we actually need the exponential_backoff GUC? Wouldn't it be > sufficient to have the maximum value, and if it's -1 treat that as no > backoff? That is a good question, I guess that makes sense. The next question then is: what should the default for (now) auth_delay.max_milliseconds be in this case, -1? Or do we say that as the default for auth_delay.milliseconds is 0 anyway, why would somebody not want exponential backoff when they switch it on and keep it at the current 10s/10000ms)? I have not changed that for now, pending further input. > 4) I think the SGML docs should more clearly explain that the delay is > initialized to auth_delay.milliseconds, and reset to this value after > successful authentication. The wording kinda implies that, but it's not > quite clear I think. Ok, I added some text to that end. I also added a not that auth_delay.max_milliseconds will mean that the delay doubling never stops. > 4) I've been really confused by the change from > > if (status != STATUS_OK) > to > if (status == STATUS_ERROR) > > in auth_delay_checks, until I realized those two codes are not the only > ones, and we intentionally ignore STATUS_EOF. I think it'd be good to > mention that in a comment, it's not quite obvious (I only realized it > because the e-mail mentioned it). Yeah I agree, I tried to explain that now. > 5) I kinda like the custom that functions in a contrib module start with > a clear common prefix, say auth_delay_ in this case. Matter of personal > preference, ofc. Ok, I changed the functions to have an auth_delay_ prefix throughout.. > 6) I'm a bit skeptical about some acr_array details. Firstly, why would > 50 entries be enough? Seems like a pretty low and arbitrary number ... > Also, what happens if someone attempts to authenticate, fails to do so, > and then disconnects and never tries again? Or just changes IP? Seems > like the entry will remain in the array forever, no? Yeah, that is how v3 of this patch worked. I have changed that now, see below. > Seems like a way to cause a "limited" DoS - do auth failure from 50 > different hosts, to fill the table, and now everyone has to wait the > maximum amount of time (if they fail to authenticate). Right, though the problem would only exist on authentication failures, so it is really rather limited. > I think it'd be good to consider: > > - Making the acr_array a hash table, and larger than 50 entries (I > wonder if that should be dynamic / customizable by GUC?). I would say a GUC should be overkill for this as this would mostly be an implementation detail. More generally, I think now that entries are expired (see below), this should be less of a concern, so I have not changed this to a hash table for now but doubled MAX_CONN_RECORDS to 100 entries. > - Make sure the entries are eventually expired, based on time (for > example after 2*max_delay?). I went with 5*max_milliseconds - the AuthConnRecord struct now has a last_failed_auth timestamp member; if we increase the delay for a host, we check if any other host expired in the meantime and remove it if so. > - It would be a good idea to log something when we get into the "full > table" and start delaying everyone by max_delay_seconds. (How would > anyone know that's what's happening, seems rather confusing.) Right, I added a log line for that. However, I made it LOG instead of WARNING as I don't think the client would ever see it, would he? Attached is v4 with the above changes. Cheers, Michael
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Robert HaasДата:
Сообщение: Re: incremental backup mishandles XLOG_DBASE_CREATE_FILE_COPY