BUG #17392: archiver process exited with exit code 2 was unexpectedly cause for immediate shutdown request
From | PG Bug reporting form |
---|---|
Subject | BUG #17392: archiver process exited with exit code 2 was unexpectedly cause for immediate shutdown request |
Date | |
Msg-id | 17392-ae1e272049dfec87@postgresql.org |
List | pgsql-bugs |
The following bug has been logged on the website:

Bug reference:      17392
Logged by:          Alexander Ulaev
Email address:      alexander.ulaev@rtlabs.ru
PostgreSQL version: Unsupported/Unknown
Operating system:   CentOS Linux release 7.9.2009 (Core)
Description:

We have several shards, each a Patroni cluster running PG 9.6 on VMs. Problems on the SAN side caused our KVM VMs to stall for approximately 1 minute because disk I/O was unavailable. Most of the DB shards, which have relatively low load, saw commits take 40-60 seconds but survived. However, the masters of two shards with high application load (TPS 2-3x the average among the shard DBs) shut down unexpectedly with the same errors:

    2022-02-01 16:12:24 MSK [16959] LOG: received immediate shutdown request

and

    2022-02-01 16:12:25 MSK [16959] LOG: archiver process (PID 117615) exited with exit code 2

(I suppose the timestamp on this LOG record is incorrect and that this record is really what lies behind the "shutdown" record.)

These appeared among a huge number of "terminating connection because of crash of another server process" messages for user processes, like these:

    2022-02-01 16:12:24 MSK [151045] 127.0.0.1 PostgreSQL JDBC Driver queue2@queue2 HINT: In a moment you should be able to reconnect to the database and repeat your command.
    2022-02-01 16:12:24 MSK [152240] 127.0.0.1 PostgreSQL JDBC Driver queue2@queue2 WARNING: terminating connection because of crash of another server process
    2022-02-01 16:12:24 MSK [152240] 127.0.0.1 PostgreSQL JDBC Driver queue2@queue2 DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

I can't find anywhere what this exit code 2 stands for. As far as I know, when the archiver process terminates abnormally it should be restarted by the postmaster, without the entire instance terminating abnormally or all of the postgres processes halting.

I found in the source code that all functions relating to the archiver are in pgarch.c, whose initial author is Simon Riggs simon@2ndquadrant.com, but I couldn't find any information there related to "exit code 2".

Later, when the instance was starting up and replaying WAL since the last checkpoint, an "invalid record length" message appeared:

    2022-02-01 16:13:17 MSK [153401] LOG: invalid record length at 1CEC/C3BEBB50: wanted 24, got 0
    2022-02-01 16:13:17 MSK [153401] LOG: consistent recovery state reached at 1CEC/C3BEBB50
    2022-02-01 16:13:17 MSK [153397] LOG: database system is ready to accept read only connections

The instance nevertheless started, and Patroni returned it to the master role, because the sync replica had also shut down, logging "invalid record length" while applying WAL:

    2022-02-01 16:12:25 MSK [16563] 127.0.0.1 [unknown] patroni@postgres LOG: connection authorized: user=patroni database=postgres
    WARNING: terminating connection because of crash of another server process
    DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
    HINT: In a moment you should be able to reconnect to the database and repeat your command.
    2022-02-01 16:12:25 MSK [89015] FATAL: could not receive data from WAL stream: server closed the connection unexpectedly
        This probably means the server terminated abnormally before or while processing the request.
    2022-02-01 16:12:25 MSK [89009] LOG: invalid record length at 1CEC/C3BEBA90: wanted 24, got 0
    2022-02-01 16:12:25 MSK [16564] FATAL: could not connect to the primary server: FATAL: the database system is shutting down
    2022-02-01 16:12:26 MSK [89006] LOG: received fast shutdown request
    2022-02-01 16:12:26 MSK [89006] LOG: aborting any active transactions
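
Since the report questions the ordering of the postmaster's own records, one way to inspect it is to group log lines by PID and list them in file order. The following is only a minimal sketch, assuming the line prefix has the shape seen in the quoted lines (timestamp, time zone, PID in brackets, optional client fields, then the severity). The names `LINE_RE`, `parse_line`, `records_by_pid` and `SAMPLE` are hypothetical and not part of PostgreSQL or Patroni.

```python
import re
from collections import defaultdict

# Assumed prefix shape, inferred from the lines quoted in the report:
#   "2022-02-01 16:12:24 MSK [16959] <optional client fields> LOG: message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<tz>\S+) "
    r"\[(?P<pid>\d+)\] (?P<extra>.*?)"
    r"(?P<level>LOG|WARNING|ERROR|FATAL|PANIC|DETAIL|HINT): (?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict for a prefixed log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "ts": m.group("ts"),
        "pid": int(m.group("pid")),
        "level": m.group("level"),
        "msg": m.group("msg"),
    }


def records_by_pid(lines):
    """Group parsed records by PID, preserving the order they appear in."""
    grouped = defaultdict(list)
    for line in lines:
        rec = parse_line(line)
        if rec is not None:
            grouped[rec["pid"]].append(rec)
    return grouped


# Three lines copied from the report; 16959 is the PID that logged
# both the shutdown request and the archiver exit.
SAMPLE = [
    "2022-02-01 16:12:24 MSK [16959] LOG: received immediate shutdown request",
    "2022-02-01 16:12:25 MSK [16959] LOG: archiver process (PID 117615) exited with exit code 2",
    "2022-02-01 16:12:24 MSK [152240] 127.0.0.1 PostgreSQL JDBC Driver queue2@queue2 "
    "WARNING: terminating connection because of crash of another server process",
]

if __name__ == "__main__":
    for rec in records_by_pid(SAMPLE)[16959]:
        print(rec["ts"], rec["level"], rec["msg"])
```

Running this over the full log file rather than `SAMPLE` would list every record written by PID 16959 in the order it was written, which can then be compared against the timestamps. Continuation lines without a prefix are skipped by this sketch.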