Re: lockup in parallel hash join on dikkop (freebsd 14.0-current)
From | Thomas Munro |
---|---|
Subject | Re: lockup in parallel hash join on dikkop (freebsd 14.0-current) |
Date | |
Msg-id | CA+hUKG+YkAnOLrKKcy-FLjoVUV3r=L+c28gzMSL58Cv9jC4nvg@mail.gmail.com |
In reply to | Re: lockup in parallel hash join on dikkop (freebsd 14.0-current) (Thomas Munro <thomas.munro@gmail.com>) |
Responses | Re: lockup in parallel hash join on dikkop (freebsd 14.0-current) |
List | pgsql-hackers |
After 1000 make check loops and 1000 make -C src/test/modules/test_shm_mq check loops on the same FBSD 13.1 machine as elver, which has failed like this once before, I haven't been able to reproduce this on REL_12_STABLE. I'm not really sure how to chase this, but if you see this situation again, I'd be interested to see the output of fstat -p PID (shows bytes in pipes) and procstat -j PID (shows pending signals) for all PIDs involved. Please capture that before connecting a debugger or doing anything else that might make the syscall return with EINTR, because after that we know it continues happily: it then sees latch->is_set the next time around the loop. If poll() is not returning when there are bytes ready to read from the self-pipe, which fstat can show, I think that'd indicate a kernel bug. The same goes if procstat -j shows signals pending but somehow it's still blocked in the syscall. Otherwise, it might indicate a compiler or postgres bug, but I don't have any particular theories.
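To make the reasoning about poll(), the self-pipe and latch->is_set concrete, here is a minimal sketch of the general self-pipe wakeup pattern in Python. It is not PostgreSQL's latch code: the Latch class, its method names and the SIGALRM-based demo are all hypothetical and only illustrate the shape of the wait loop. Note also that Python retries an interrupted poll() by itself after running signal handlers, so the EINTR step is implicit here.

```python
import os
import select
import signal


class Latch:
    """Toy latch (hypothetical): a flag plus a self-pipe that wakes poll()."""

    def __init__(self):
        self.is_set = False
        self.read_fd, self.write_fd = os.pipe()
        # Non-blocking on both ends so set() and drain() can never hang.
        os.set_blocking(self.read_fd, False)
        os.set_blocking(self.write_fd, False)

    def set(self):
        """Set the flag, then make the pipe readable to wake a sleeper."""
        self.is_set = True
        try:
            os.write(self.write_fd, b"\0")
        except BlockingIOError:
            pass  # pipe is full, so a wakeup is already pending

    def reset(self):
        self.is_set = False

    def drain(self):
        """Discard any wakeup bytes sitting in the pipe."""
        try:
            while os.read(self.read_fd, 4096):
                pass
        except BlockingIOError:
            pass  # pipe is empty

    def wait(self, timeout_ms):
        """Return True once the latch is set, False if poll() times out."""
        poller = select.poll()
        poller.register(self.read_fd, select.POLLIN)
        while not self.is_set:  # the flag is rechecked every time around
            if not poller.poll(timeout_ms):
                return False  # nothing readable within the timeout
            self.drain()
        return True


def demo():
    """Wake a waiting latch from a signal handler (main thread only)."""
    latch = Latch()
    signal.signal(signal.SIGALRM, lambda signum, frame: latch.set())
    signal.setitimer(signal.ITIMER_REAL, 0.1)
    print("woken:", latch.wait(5000))


if __name__ == "__main__":
    demo()
```

In this model a healthy wait can only stay asleep while the pipe is empty. A sleeper that stays in poll() while fstat reports bytes in the pipe is the case that would point at the kernel rather than at the loop.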