Re: Slow standby snapshot

From:        Michail Nikolaev
Subject:     Re: Slow standby snapshot
Msg-id:      CANtu0oiPoSdQsjRd6Red5WMHi1E83d2+-bM9J6dtWR3c5Tap9g@mail.gmail.com
In reply to: Re: Slow standby snapshot (Simon Riggs <simon.riggs@enterprisedb.com>)
List:        pgsql-hackers
Hello everyone.

> However ... I tried to reproduce the original complaint, and
> failed entirely. I do see KnownAssignedXidsGetAndSetXmin
> eating a bit of time in the standby backends, but it's under 1%
> and doesn't seem to be rising over time. Perhaps we've already
> applied some optimization that ameliorates the problem? But
> I tested v13 as well as HEAD, and got the same results.

> Hmm. I wonder if my inability to detect a problem is because the startup
> process does keep ahead of the workload on my machine, while it fails
> to do so on the OP's machine. I've only got a 16-CPU machine at hand,
> which probably limits the ability of the primary to saturate the standby's
> startup process.

Yes, the optimization by Andres Freund made things much better, but the
impact is still noticeable. I was also using a 16-CPU machine - but two of
them (primary and standby). Here are the scripts I used for the
benchmark [1] - maybe they could help.

> Nowadays we've *got* those primitives. Can we get rid of
> known_assigned_xids_lck, and if so would it make a meaningful
> difference in this scenario?

I already tried that, but was unable to find any real benefit from it. WIP
patch attached; the idea is sketched in the first snippet below. Hm, I see
I had sent it to the list before, but it is absent from the archives...
Just a quote from it:

> First potential positive effect I could see is
> (TransactionIdIsInProgress -> KnownAssignedXidsSearch) locking, but
> it seems like it is not on the standby hot path.
> Second one - locking for KnownAssignedXidsGetAndSetXmin (building a
> snapshot). But I was unable to measure any impact. It wasn't visible
> separately in the (3) test.
> Maybe someone knows a scenario causing known_assigned_xids_lck or
> TransactionIdIsInProgress to become a bottleneck on a standby?

That last question is still open :)

> I think it might be a bigger effect than one might immediately think. Because
> the spinlock will typically be on the same cacheline as head/tail, and because
> every spinlock acquisition requires the cacheline to be modified (and thus
> owned exclusively) by the current core, uses of head/tail will very commonly
> be cache misses even in workloads without a lot of KAX activity.

I tried to find a way to achieve any noticeable impact from this in
practice, but without success.

>> But yeah, it does feel like the proposed
>> approach is only going to be optimal over a small range of conditions.

> In particular, it doesn't adapt at all to workloads that don't replay all that
> much, but do compute a lot of snapshots.

The approach in [2] was tuned so that nobody but the startup process does
any additional work: the startup process maintains offsets that let
backends skip gaps while building a snapshot (see the second snippet
below).
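To show what I mean by the barrier-based variant - a minimal sketch, not
the actual WIP patch (the structure and names here are invented and much
simplified compared to the real bookkeeping in procarray.c): the startup
process, as the only writer, publishes a new head with pg_write_barrier(),
and readers pair it with pg_read_barrier(), so known_assigned_xids_lck is
not taken on either path.

#include "postgres.h"
#include "port/atomics.h"

/* Hypothetical, simplified shared state - not the real procarray.c layout. */
typedef struct KaxShared
{
	volatile int	head;		/* next free slot, advanced only by startup */
	volatile int	tail;		/* oldest potentially-valid slot */
	TransactionId	xids[FLEXIBLE_ARRAY_MEMBER];
} KaxShared;

/* Startup process: append one xid without taking known_assigned_xids_lck. */
static void
kax_append(KaxShared *kax, TransactionId xid)
{
	int			h = kax->head;

	kax->xids[h] = xid;
	pg_write_barrier();			/* entry must be visible before the new head */
	kax->head = h + 1;
}

/* Backend: read a consistent [tail, head) range to scan. */
static void
kax_get_bounds(KaxShared *kax, int *tail, int *head)
{
	*head = kax->head;
	pg_read_barrier();			/* entries below *head are visible after this */
	*tail = kax->tail;
}

The single-writer property of the startup process is what makes this safe:
only it ever advances head, so no read-modify-write cycle is needed.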
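And a sketch of the offsets idea from [2] (again simplified; names like
next_offset are invented for illustration, the real patch differs): the
startup process maintains, for each run of invalidated slots, the distance
to the next valid one, so a snapshot scan pays one jump per gap run instead
of visiting every invalid entry.

#include "postgres.h"
#include "port/atomics.h"

/* Hypothetical layout: validity flags plus "how far to the next valid slot". */
typedef struct KaxOffsets
{
	volatile int	head;
	volatile int	tail;
	bool		   *valid;			/* valid[i]: slot i holds a live xid */
	int			   *next_offset;	/* for invalid slots: jump distance, >= 1,
									 * maintained by the startup process */
	TransactionId  *xids;
} KaxOffsets;

/*
 * Backend side of snapshot building: copy out live xids, skipping each run
 * of invalidated entries in a single jump. All maintenance of next_offset[]
 * happens in the startup process when it prunes entries, so this reader
 * does no extra work compared to a plain scan.
 */
static int
kax_collect_xids(KaxOffsets *kax, TransactionId *xarray)
{
	int			count = 0;
	int			head = kax->head;
	int			i;

	pg_read_barrier();			/* pairs with the startup's write barrier */

	for (i = kax->tail; i < head;)
	{
		if (kax->valid[i])
			xarray[count++] = kax->xids[i++];
		else
			i += kax->next_offset[i];	/* skip the whole gap run */
	}
	return count;
}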
[1]: https://gist.github.com/michail-nikolaev/e1dfc70bdd7cfd1b902523dbb3db2f28
[2]: https://www.postgresql.org/message-id/flat/CANtu0ogzo4MsR7My9%2BNhu3to5%3Dy7G9zSzUbxfWYOn9W5FfHjTA%40mail.gmail.com#341a3c3b033f69b260120b3173a66382

--
Michail Nikolaev