Re: Improving connection scalability: GetSnapshotData()

From	Andres Freund
Subject	Re: Improving connection scalability: GetSnapshotData()
Date	2020-08-16 19:00:12
Msg-id	20200816190012.nqzmtiaju6ndckb2@alap3.anarazel.de
In reply to	Re: Improving connection scalability: GetSnapshotData() (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

On 2020-08-16 14:30:24 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > 690 successful runs later, it didn't trigger for me :(. Seems pretty
> > clear that there's another variable than pure chance, otherwise it seems
> > like that number of runs should have hit the issue, given the number of
> > bf hits vs bf runs.
> 
> It seems entirely likely that there's a timing component in this, for
> instance autovacuum coming along at just the right time.  It's not too
> surprising that some machines would be more prone to show that than
> others.  (Note peripatus is FreeBSD, which we've already learned has
> significantly different kernel scheduler behavior than Linux.)

Yea. Interestingly, there has been a reproduction on Linux since the initial
reports you'd dug up:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=butterflyfish&dt=2020-08-15%2019%3A54%3A53

but that's likely a virtualized environment, so I guess the host
scheduler's behaviour could play a similar role.

I'll run a few iterations with rr's chaos mode too, which tries to
randomize scheduling decisions...

I noticed that it's quite hard to actually hit the hot tuple path I
mentioned earlier on my machine. It would probably be good to have a test
hitting it more reliably. But I'm not immediately seeing how we could
force the necessary serialization.

Greetings,

Andres Freund
