Re: Issue with the PRNG used by Postgres
From | Tom Lane |
---|---|
Subject | Re: Issue with the PRNG used by Postgres |
Date | |
Msg-id | 4085126.1712769141@sss.pgh.pa.us |
In response to | Re: Issue with the PRNG used by Postgres (Parag Paul <parag.paul@gmail.com>) |
List | pgsql-hackers |
Parag Paul <parag.paul@gmail.com> writes:
> Yes, the probability of this happening is astronomical, but in production
> with 128 core servers with 7000 max_connections, with petabyte scale data,
> this did repro 2 times in the last month. We had to move to a local
> approach to manager our ratelimiting counters.
> This is not reproducible very easily. I feel that we should at least shield
> ourselves with the following change, so that we at least increase the delay
> by 1000us every time. We will follow a linear back off, but better than no
> backoff.

I still say you are proposing to band-aid the wrong thing. Moreover:

* the proposed patch will cause the first few cur_delay values to grow
much faster than before, with direct performance impact to everyone,
whether they are on 128-core servers or not;

* if we are in a regime where xoroshiro repeatedly returns zero
across multiple backends, your patch doesn't improve the situation
AFAICS, because the backends will still choose the same series
of cur_delay values and thus continue to exhibit thundering-herd
behavior. Indeed, as coded I think the patch makes it *more* likely
that the same series of cur_delay values would be chosen by multiple
backends.

regards, tom lane
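
The second objection can be illustrated with a small model. This is only a sketch under stated assumptions: the constants and the growth rule are modelled loosely on the spin-delay backoff under discussion, and `next_delay_patched` is a hypothetical reading of "at least increase the delay by 1000us every time", since the patch itself is not quoted in this message. The names `next_delay`, `next_delay_patched` and `delay_series` are made up for illustration.

```python
MIN_DELAY_USEC = 1000        # assumed minimum sleep, in microseconds
MAX_DELAY_USEC = 1_000_000   # assumed cap before wrapping back to the minimum


def next_delay(cur_delay, r):
    """Model of the existing rule: grow by a random fraction r in [0, 1)."""
    cur_delay += int(cur_delay * r + 0.5)
    if cur_delay > MAX_DELAY_USEC:
        cur_delay = MIN_DELAY_USEC
    return cur_delay


def next_delay_patched(cur_delay, r):
    """Hypothetical patch: the increment is never smaller than 1000 us."""
    cur_delay += max(int(cur_delay * r + 0.5), MIN_DELAY_USEC)
    if cur_delay > MAX_DELAY_USEC:
        cur_delay = MIN_DELAY_USEC
    return cur_delay


def delay_series(step, randoms):
    """Successive cur_delay values one backend sees for the given PRNG outputs."""
    cur_delay = MIN_DELAY_USEC
    series = []
    for r in randoms:
        cur_delay = step(cur_delay, r)
        series.append(cur_delay)
    return series


if __name__ == "__main__":
    # Degenerate case: the PRNG returns zero in every backend.
    print(delay_series(next_delay, [0.0, 0.0, 0.0]))
    print(delay_series(next_delay_patched, [0.0, 0.0, 0.0]))

    # Healthy PRNG, two backends drawing different small values.
    for step in (next_delay, next_delay_patched):
        a = delay_series(step, [0.1, 0.2])
        b = delay_series(step, [0.3, 0.4])
        print(step.__name__, a, b, "identical" if a == b else "different")
```

In this model, when every backend draws zero, each rule gives all backends one shared series, so they still wake up together. With the assumed floor, two backends that draw different small values also end up on the same series, because the floor swallows the random increment during the first few steps.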