Re: Recent 027_streaming_regress.pl hangs

From	Andres Freund
Subject	Re: Recent 027_streaming_regress.pl hangs
Date	26 March 2024, 03:56:07
Msg-id	20240326035607.grqoyrxjvpyhnkrf@awork3.anarazel.de
In response to	Re: Recent 027_streaming_regress.pl hangs (Andres Freund <andres@anarazel.de>)
Responses	Re: Recent 027_streaming_regress.pl hangs
List	pgsql-hackers

Hi,

On 2024-03-20 17:41:45 -0700, Andres Freund wrote:
> On 2024-03-14 16:56:39 -0400, Tom Lane wrote:
> > Also, this is probably not
> > helping anything:
> >
> >                    'extra_config' => {
> >                                                       ...
> >                                                       'fsync = on'
>
> At some point we had practically no test coverage of fsync, so I made my
> animals use fsync. I think we still have little coverage.  I probably could
> reduce the number of animals using it though.

I think there must be some actual regression involved. The frequency of
failures on HEAD vs failures on 16 - both of which run the tests concurrently
via meson - is just vastly different.  I'd expect the absolute number of
failures in 027_stream_regress.pl to differ between branches due to fewer runs
on 16, but there's no explanation for the difference in percentage of
failures. My menagerie had only a single recoveryCheck failure on !HEAD in the
last 30 days, but in the vicinity of 100 on HEAD:
https://buildfarm.postgresql.org/cgi-bin/show_failures.pl?max_days=30&stage=recoveryCheck&filter=Submit
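
To make the count-versus-rate argument above concrete, here is a minimal sketch. The failure counts (about 100 on HEAD, 1 elsewhere) are the approximate figures quoted above; the run counts are hypothetical placeholders, since the message does not give them, and `failure_rate` is an illustrative helper, not part of any buildfarm tooling.

```python
def failure_rate(failures: int, runs: int) -> float:
    """Return the fraction of runs that failed."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    return failures / runs


# Failure counts: approximate figures from the message.
# Run counts: HYPOTHETICAL, chosen only so that HEAD has 4x as many runs.
head_runs, rel16_runs = 2000, 500
head_rate = failure_rate(100, head_runs)
rel16_rate = failure_rate(1, rel16_runs)

# If only the number of runs differed, the rates would be about equal and
# this ratio would be near 1. A ratio far above 1 means run volume alone
# does not account for the gap in failure counts.
rate_ratio = head_rate / rel16_rate
print(f"HEAD rate: {head_rate:.4f}, 16 rate: {rel16_rate:.4f}, "
      f"ratio: {rate_ratio:.1f}")
```

Under these assumed run counts, a 4x difference in runs would explain a 4x difference in failures, not the roughly 100x difference observed.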

If anything the load when testing back branch changes is higher, because
commonly back-branch builds are happening on all branches, so I don't think
that can be the explanation either.

From what I can tell the pattern changed on 2024-02-16 19:39:02 - there was a
rash of recoveryCheck failures in the days before that too, but not
027_stream_regress.pl in that way.

It certainly seems suspicious that one commit before the first observed failure
is
2024-02-16 11:09:11 -0800 [73f0a132660] Pass correct count to WALRead().

Of course the failure rate is low enough that it could have been a day or two
before that, too.

Greetings,

Andres Freund
