Discussion: AIX and EAGAIN on open()
Hello,
a customer running PG on AIX [1] is occasionally seeing "Resource
temporarily unavailable" (EAGAIN) returned by open() calls:
[1] We have PostgreSQL 11.13 on powerpc-ibm-aix7.2.5.0, compiled by /opt/IBM/xlc/13.1.0/bin/xlc, 64-bit
2022-05-19 03:28:13 CEST:127.0.0.1(63265):x@x:[64029168]: ERROR: could not open file "base/16401/935915821_fsm": Resource temporarily unavailable
2022-05-19 03:28:13 CEST:127.0.0.1(63265):x@x:[64029168]: CONTEXT: SQL statement "INSERT INTO s[...]"
PL/pgSQL function s...() line 12 at SQL statement
2022-05-19 03:28:13 CEST:127.0.0.1(63265):x@x:[64029168]: STATEMENT: PREPARE ... AS insert into ...
2022-04-16 01:45:31 CEST:127.0.0.1(58946):x@x:[20906970]: ERROR: could not access status of transaction 0
2022-04-16 01:45:31 CEST:127.0.0.1(58946):x@x:[20906970]: DETAIL: Could not open file "pg_subtrans/6158": Resource temporarily unavailable.
2022-04-16 01:45:31 CEST:127.0.0.1(58946):x@x:[20906970]: STATEMENT: PREPARE ... AS update ...
2020-12-01 09:24:30 CET:127.0.0.1(59898):x@x:[6227520]: ERROR: could not access status of transaction 0
2020-12-01 09:24:30 CET:127.0.0.1(59898):x@x:[6227520]: DETAIL: Could not open file "pg_subtrans/AC9E": Resource temporarily unavailable.
2020-12-01 09:24:30 CET:127.0.0.1(59898):x@x:[6227520]: STATEMENT: PREPARE ... AS DELETE FROM ....
open() should not return EAGAIN as per POSIX [2],
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html#tag_16_357_05
and the AIX documentation says it would only return EAGAIN if O_TRUNC
is used [3], but as far as I can tell, PG does not use that flag.
[3]
https://www.ibm.com/docs/en/aix/7.2?topic=o-open-openat-openx-openxat-open64-open64at-open64x-open64xat-creat-creat64-subroutine
IBM's reply to the issue back in December 2020 was this:
  The man page / infocenter document is not intended as an exhaustive
  list of all possible error codes returned and their circumstances.
  "Resource temporarily unavailable" may also be returned for
  O_NSHARE, O_RSHARE with O_NONBLOCK.
Afaict, PG does not use these flags either.
We also ruled out that the system is using any anti-virus or similar
tooling that would intercept IO traffic.
Does anything of that ring a bell for someone? Is that an AIX bug, a
PG bug, or something else?
Christoph
--
Senior Consultant, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, VAT ID: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Geoff Richardson, Peter Lilley
Our handling of personal data is subject to the following
provisions: https://www.credativ.de/datenschutz
On Mon, Jun 20, 2022 at 9:53 PM Christoph Berg <christoph.berg@credativ.de> wrote:
> IBM's reply to the issue back in December 2020 was this:
>
>   The man page / infocenter document is not intended as an exhaustive
>   list of all possible error codes returned and their circumstances.
>   "Resource temporarily unavailable" may also be returned for
>   O_NSHARE, O_RSHARE with O_NONBLOCK.
>
> Afaict, PG does not use these flags either.
>
> We also ruled out that the system is using any anti-virus or similar
> tooling that would intercept IO traffic.
>
> Does anything of that ring a bell for someone? Is that an AIX bug, a
> PG bug, or something else?

No clue here. Anything unusual about the file system (NFS etc)? Can
you truss/strace the system calls, to sanity check the flags arriving
into open(), and see if there's any unexpected other activity around
open() calls that might be coming from something you're linked
against?
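To make the "sanity check the flags" step concrete, here is a minimal sketch of a helper that turns a numeric open() flags value (as it might appear in a trace) into symbolic names and reports whether any of the flags named in this thread as possible EAGAIN triggers is set. The function names describe_open_flags and has_eagain_suspect_flag are hypothetical and are not part of PostgreSQL. O_NSHARE and O_RSHARE are the AIX flags from IBM's reply and are only compiled in where the system headers define them.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Append "name" to buf, separated by '|' when buf already has content. */
static void
append_name(char *buf, size_t buflen, const char *name)
{
    size_t      used = strlen(buf);

    if (used > 0 && used + 1 < buflen)
    {
        buf[used++] = '|';
        buf[used] = '\0';
    }
    if (used < buflen)
        snprintf(buf + used, buflen - used, "%s", name);
}

/* Render open() flags as symbolic names, e.g. "O_RDWR|O_CREAT". */
void
describe_open_flags(int flags, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';

    /* The access mode is a small field, not an independent bit. */
    switch (flags & O_ACCMODE)
    {
        case O_RDONLY: append_name(buf, buflen, "O_RDONLY"); break;
        case O_WRONLY: append_name(buf, buflen, "O_WRONLY"); break;
        case O_RDWR:   append_name(buf, buflen, "O_RDWR"); break;
        default:       append_name(buf, buflen, "O_ACCMODE?"); break;
    }
    if (flags & O_CREAT)    append_name(buf, buflen, "O_CREAT");
    if (flags & O_EXCL)     append_name(buf, buflen, "O_EXCL");
    if (flags & O_TRUNC)    append_name(buf, buflen, "O_TRUNC");
    if (flags & O_APPEND)   append_name(buf, buflen, "O_APPEND");
    if (flags & O_NONBLOCK) append_name(buf, buflen, "O_NONBLOCK");
#ifdef O_NSHARE             /* AIX only */
    if (flags & O_NSHARE)   append_name(buf, buflen, "O_NSHARE");
#endif
#ifdef O_RSHARE             /* AIX only */
    if (flags & O_RSHARE)   append_name(buf, buflen, "O_RSHARE");
#endif
}

/* Return 1 if a flag named in this thread as an EAGAIN trigger is set. */
int
has_eagain_suspect_flag(int flags)
{
    int         suspects = O_TRUNC | O_NONBLOCK;

#ifdef O_NSHARE
    suspects |= O_NSHARE;
#endif
#ifdef O_RSHARE
    suspects |= O_RSHARE;
#endif
    return (flags & suspects) != 0;
}
```

If a traced open() on one of the failing files showed none of the suspect flags, that would support the reading that the EAGAIN is not explained by the documented cases.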
Re: Thomas Munro
> > Does anything of that ring a bell for someone? Is that an AIX bug, a
> > PG bug, or something else?
>
> No clue here. Anything unusual about the file system (NFS etc)? Can
> you truss/strace the system calls, to sanity check the flags arriving
> into open(), and see if there's any unexpected other activity around
> open() calls that might be coming from something you're linked
> against?

Hi,

it's local storage, 16Gb SAN, Unity 500 storage, all data is on SSD
disks, and file system is JFS2 (mount options are rw,log=INLINE).

Good point about the flags, but we don't have access to the servers,
so not sure if it will be possible to retrieve strace information.
I'll try asking.

Thanks,
Christoph
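If tracing on the customer's servers turns out not to be possible, one alternative might be a small standalone probe that repeatedly calls open() on a file with plain flags and counts how often EAGAIN comes back. The sketch below is only an illustration under that assumption: probe_open and OpenProbeResult are hypothetical names, the flags passed in are up to the caller, and a loop like this would not necessarily reproduce whatever concurrent activity triggers the error in production.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Counters collected by probe_open(); a hypothetical helper type. */
typedef struct OpenProbeResult
{
    int         attempts;       /* number of open() calls made */
    int         failures;       /* calls that returned -1 */
    int         eagain;         /* failures where errno was EAGAIN */
    int         last_errno;     /* errno of the most recent failure, or 0 */
} OpenProbeResult;

/*
 * Open and immediately close "path" the given number of times.
 * The flags must not include O_CREAT, since no mode argument is passed.
 */
OpenProbeResult
probe_open(const char *path, int flags, int attempts)
{
    OpenProbeResult result = {0, 0, 0, 0};

    assert(path != NULL && (flags & O_CREAT) == 0);

    for (int i = 0; i < attempts; i++)
    {
        int         fd = open(path, flags);

        result.attempts++;
        if (fd < 0)
        {
            /* Save errno right away, before any other call can change it. */
            result.last_errno = errno;
            result.failures++;
            if (result.last_errno == EAGAIN)
                result.eagain++;
            continue;
        }
        close(fd);
    }
    return result;
}
```

Running something like probe_open("pg_subtrans/6158", O_RDWR, 100000) from the data directory, ideally on a copy of the file rather than a live cluster, and finding a nonzero eagain count would point toward the operating system or file system rather than PostgreSQL.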