Re: Non-reproducible AIO failure
From | Thomas Munro |
---|---|
Subject | Re: Non-reproducible AIO failure |
Date | |
Msg-id | CA+hUKG+kCOZbsiL7Qc=_1Ahd=JdAkrq0VnStrUvLEnky-H7yUA@mail.gmail.com |
In reply to | Re: Non-reproducible AIO failure (Alexander Lakhin <exclusion@gmail.com>) |
Responses | Re: Non-reproducible AIO failure |
List | pgsql-hackers |
On Sun, May 25, 2025 at 9:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
> Hello Thomas,
> 24.05.2025 14:42, Thomas Munro wrote:
> > On Sat, May 24, 2025 at 3:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> So it seems that "very low-probability issue in our Mac AIO code" is
> >> the most probable description.
> > There isn't any macOS-specific AIO code so my first guess would be
> > that it might be due to aarch64 weak memory reordering (though Andres
> > speculated that it should all be one backend, huh), if it's not just
> > a timing luck thing. Alexander, were the other OSes you tried all on
> > x86?
>
> As I wrote off-list before, I had tried x86_64 only, but since then I
> tried to reproduce the issue on an aarch64 server with Ubuntu 24.04,
> running 10, then 40 instances of t/027_stream_regress.pl in parallel. I've
> also multiplied "test: brin ..." line x10. But the issue is still not
> reproduced (in 8+ hours).

Hmm. And I see now that this really is all in one backend. Could it
be some variation of the interrupt processing stuff from acad9093?

> However, I've managed to get an AIO-related assertion failure on macOS 14.5
...
> TRAP: failed Assert("ioh->op == PGAIO_OP_INVALID"), File: "aio_io.c", Line: 161, PID: 32355

Can you get a core and print *ioh in the debugger?