Re: Improving connection scalability: GetSnapshotData()
From | Andres Freund |
---|---|
Subject | Re: Improving connection scalability: GetSnapshotData() |
Date | |
Msg-id | 20200816190012.nqzmtiaju6ndckb2@alap3.anarazel.de |
In response to | Re: Improving connection scalability: GetSnapshotData() (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On 2020-08-16 14:30:24 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > 690 successful runs later, it didn't trigger for me :(. Seems pretty
> > clear that there's another variable than pure chance, otherwise it seems
> > like that number of runs should have hit the issue, given the number of
> > bf hits vs bf runs.
>
> It seems entirely likely that there's a timing component in this, for
> instance autovacuum coming along at just the right time. It's not too
> surprising that some machines would be more prone to show that than
> others. (Note peripatus is FreeBSD, which we've already learned has
> significantly different kernel scheduler behavior than Linux.)

Yea. Interestingly there was a reproduction on linux since the initial
reports you'd dug up:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=butterflyfish&dt=2020-08-15%2019%3A54%3A53
but that's likely a virtualized environment, so I guess the host
scheduler behaviour could play a similar role.

I'll run a few iterations with rr's chaos mode too, which tries to
randomize scheduling decisions...

I noticed that it's quite hard to actually hit the hot tuple path I
mentioned earlier on my machine. Would probably be good to have a test
hitting it more reliably. But I'm not immediately seeing how we could
force the necessary serialization.

Greetings,

Andres Freund