Обсуждение: Why is my Postgre server went in recovery mode all in sudden

Поиск
Список
Период
Сортировка

Why is my Postgre server went in recovery mode all in sudden

От
nikhil raj
Дата:
Hi Team,

I dont have any idea why did my Postgres server crash and it says

timestamp=2018-05-07 00:34:11.209 EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 10:08:19 EDT,SQL_state=00000LOG:  worker process: parallel worker for PID 2864 (PID 4476) exited with exit code 0
timestamp=2018-05-07 00:34:11.209 EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 10:08:19 EDT,SQL_state=00000LOG:  terminating any other active server processes
timestamp=2018-05-07 00:34:11.209 EDT,user=postgres,db=Ozalo,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:32:46 EDT,SQL_state=00000LOG:  duration: 0.331 ms  bind <unnamed>: SELECT "repository"."c_token" AS "token", "repository"."c_path" AS "path"

     FROM

    t_e20so1_repository AS "repository" INNER JOIN t_e20so1_document_bigint AS "documentbigint" ON  "repository"."c_repositoryid" = "documentbigint"."c_value" 

      WHERE  "documentbigint"."c_documentid" = 201989 

    AND  "documentbigint"."c_fieldid" = 'b035afc8-439f-4f2c-a9ae-e054052511fd'
timestamp=2018-05-07 00:34:11.210 EDT,user=postgres,db= Ozalo,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:32:46 EDT,SQL_state=00000LOG:  duration: 0.061 ms  execute <unnamed>: SELECT "repository"."c_token" AS "token", "repository"."c_path" AS "path"
 FROM
 t_zs01_sys_dm AS "repository" INNER JOIN t_e20so1_document_bigint AS "documentbigint" ON  "repository"."c_repositoryid" = "documentbigint"."c_value" 
      WHERE  "documentbigint"."c_documentid" = 201989 
    AND  "documentbigint"."c_fieldid" = 'b035afc8-439f-4f2c-a9ae-e054052511fd'


after Some time i was receiving this error


timestamp=2018-05-07 00:34:11.218 EDT,user=postgres,db= Ozalo ,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:34:10 EDT,SQL_state=57P02WARNING:  terminating connection because of crash of another server process
timestamp=2018-05-07 00:34:11.218 EDT,user=postgres,db= Ozalo ,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:34:10 EDT,SQL_state=57P02DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
timestamp=2018-05-07 00:34:11.218 EDT,user=postgres,db= Ozalo ,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:34:10 EDT,SQL_state=57P02HINT:  In a moment you should be able to reconnect to the database and repeat your command.

timestamp=2018-05-07 00:34:11.357 EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 10:08:20 EDT,SQL_state=57P02DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
timestamp=2018-05-07 00:34:11.357 EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 10:08:20 EDT,SQL_state=57P02HINT:  In a moment you should be able to reconnect to the database and repeat your command.
timestamp=2018-05-07 00:34:11.379 EDT,user=postgres,db= Ozalo,app=[unknown],,transaction-ID=0,session_start_timestamp=2018-05-07 00:34:11 EDT,SQL_state=57P03FATAL:  the database system is in recovery mode
timestamp=2018-05-07 00:34:11.381 EDT,user=postgres,db= Ozalo,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:34:11 EDT,SQL_state=57P03FATAL:  the database system is in recovery mode



what is the reason it corrupted  share memory ?

what is meant by The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. ?
 how much of share memory if its consume  it will crash

 Please can any one help me in this
or else what is the reason of crash of DB server

Current using 10.3

 Current Config

max_connections = 5000

shared_buffers = 7680MB

effective_cache_size = 23040MB
maintenance_work_mem = 1920MB
min_wal_size = 1GB
max_wal_size = 2GB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
max_worker_processes = 16
max_parallel_workers_per_gather = 8
max_parallel_workers = 16
work_mem = 196kB


Thanks


Re: Why is my Postgre server went in recovery mode all in sudden

От
Adrian Klaver
Дата:
On 05/09/2018 11:31 AM, nikhil raj wrote:
> Hi Team,
> 
> I dont have any idea why did my Postgres server crash and it says
> 
> timestamp=2018-05-07 00:34:11.209 
> EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 
> 10:08:19 EDT,SQL_state=00000LOG:  worker process: parallel worker for 
> PID 2864 (PID 4476) exited with exit code 0
> timestamp=2018-05-07 00:34:11.209 
> EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 
> 10:08:19 EDT,SQL_state=00000LOG:  terminating any other active server 
> processes
> timestamp=2018-05-07 00:34:11.209 
> EDT,user=postgres,db=Ozalo,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 
> 00:32:46 EDT,SQL_state=00000LOG:  duration: 0.331 ms  bind <unnamed>: 
> SELECT "repository"."c_token" AS "token", "repository"."c_path" AS "path"
> 
>       FROM
> 
>      t_e20so1_repository AS "repository" INNER JOIN 
> t_e20so1_document_bigint AS "documentbigint" ON  
> "repository"."c_repositoryid" = "documentbigint"."c_value"
> 
>        WHERE  "documentbigint"."c_documentid" = 201989
> 
>      AND  "documentbigint"."c_fieldid" = 
> 'b035afc8-439f-4f2c-a9ae-e054052511fd'
> timestamp=2018-05-07 00:34:11.210 EDT,user=postgres,db= 
> Ozalo,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 
> 00:32:46 EDT,SQL_state=00000LOG:  duration: 0.061 ms  execute <unnamed>: 
> SELECT "repository"."c_token" AS "token", "repository"."c_path" AS "path"
>   FROM
>   t_zs01_sys_dm AS "repository" INNER JOIN t_e20so1_document_bigint AS 
> "documentbigint" ON  "repository"."c_repositoryid" = 
> "documentbigint"."c_value"
>        WHERE  "documentbigint"."c_documentid" = 201989
>      AND  "documentbigint"."c_fieldid" = 
> 'b035afc8-439f-4f2c-a9ae-e054052511fd'
> 
> 
> after Some time i was receiving this error
> 
> 
> timestamp=2018-05-07 00:34:11.218 EDT,user=postgres,db= Ozalo 
> ,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 
> 00:34:10 EDT,SQL_state=57P02WARNING:  terminating connection because of 
> crash of another server process
> timestamp=2018-05-07 00:34:11.218 EDT,user=postgres,db= Ozalo 
> ,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 
> 00:34:10 EDT,SQL_state=57P02DETAIL:  The postmaster has commanded this 
> server process to roll back the current transaction and exit, because 
> another server process exited abnormally and possibly corrupted shared 
> memory.
> timestamp=2018-05-07 00:34:11.218 EDT,user=postgres,db= Ozalo 
> ,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 
> 00:34:10 EDT,SQL_state=57P02HINT:  In a moment you should be able to 
> reconnect to the database and repeat your command.
> 
> timestamp=2018-05-07 00:34:11.357 
> EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 
> 10:08:20 EDT,SQL_state=57P02DETAIL:  The postmaster has commanded this 
> server process to roll back the current transaction and exit, because 
> another server process exited abnormally and possibly corrupted shared 
> memory.
> timestamp=2018-05-07 00:34:11.357 
> EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 
> 10:08:20 EDT,SQL_state=57P02HINT:  In a moment you should be able to 
> reconnect to the database and repeat your command.
> timestamp=2018-05-07 00:34:11.379 EDT,user=postgres,db= 
> Ozalo,app=[unknown],,transaction-ID=0,session_start_timestamp=2018-05-07 
> 00:34:11 EDT,SQL_state=57P03FATAL:  the database system is in recovery mode
> timestamp=2018-05-07 00:34:11.381 EDT,user=postgres,db= 
> Ozalo,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 
> 00:34:11 EDT,SQL_state=57P03FATAL:  the database system is in recovery mode
> 
> 
> 
> what is the reason it corrupted share memory ?

At this point I don't know. More information is required:

1) What OS and version?

2) System memory size is ?

3) Where was Postgres installed from?

4) What is session_start_timestamp tracking? In other words what does it 
match up to here:

https://www.postgresql.org/docs/10/static/runtime-config-logging.html

log_line_prefix

5) What is the process that started 2018-04-26 10:08:19?

6) The query in the log started at 2018-05-07 00:32:46, what is it doing?


> 
> what is meant by The postmaster has commanded this server process to 
> roll back the current transaction and exit, because another server 
> process exited abnormally and possibly corrupted shared memory. ?
>   how much of share memory if its consume  it will crash
> 
>   Please can any one help me in this
> or else what is the reason of crash of DB server
> 
> Current using 10.3
> |
>   Current Config
> 
> max_connections = 5000||
> shared_buffers = 7680MB ||||
> effective_cache_size = 23040MB |||
> 
> |maintenance_work_mem = 1920MB min_wal_size = 1GB max_wal_size = 2GB 
> checkpoint_completion_target = 0.7 wal_buffers = 16MB 
> default_statistics_target = 100 random_page_cost = 1.1 
> effective_io_concurrency = 200 max_worker_processes = 16 
> max_parallel_workers_per_gather = 8 max_parallel_workers = 16 work_mem = 
> 196kB|
> 
> 
> 
> Thanks
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Why is my Postgre server went in recovery mode all in sudden

От
nikhil raj
Дата:
1) What OS and version?

ans: windows 2012R2

2) System memory size is ?

ans: 32GB

3) What is session_start_timestamp tracking? In other words what does it match up to here:
ans: This is the format to
timestamp=%m,user=%u,db=%d,app=%a,client=%h,transaction-ID=%x,session_start_timestamp=%s,SQL_state=%e
  The time when the session  is session_started _start_timestamp

4) What is the process that started 2018-04-26 10:08:19?

ans:  Its an backed process of Postgre started  parallel worker

5) The query in the log started at 2018-05-07 00:32:46, what is it doing?
ans : query running from an agent if any processing is going on the front end some of the query will run

On Thu, May 10, 2018 at 5:10 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 05/09/2018 11:31 AM, nikhil raj wrote:
Hi Team,

I dont have any idea why did my Postgres server crash and it says

timestamp=2018-05-07 00:34:11.209 EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 10:08:19 EDT,SQL_state=00000LOG:  worker process: parallel worker for PID 2864 (PID 4476) exited with exit code 0
timestamp=2018-05-07 00:34:11.209 EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 10:08:19 EDT,SQL_state=00000LOG:  terminating any other active server processes
timestamp=2018-05-07 00:34:11.209 EDT,user=postgres,db=Ozalo,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:32:46 EDT,SQL_state=00000LOG:  duration: 0.331 ms  bind <unnamed>: SELECT "repository"."c_token" AS "token", "repository"."c_path" AS "path"

      FROM

     t_e20so1_repository AS "repository" INNER JOIN t_e20so1_document_bigint AS "documentbigint" ON  "repository"."c_repositoryid" = "documentbigint"."c_value"

       WHERE  "documentbigint"."c_documentid" = 201989

     AND  "documentbigint"."c_fieldid" = 'b035afc8-439f-4f2c-a9ae-e054052511fd'
timestamp=2018-05-07 00:34:11.210 EDT,user=postgres,db= Ozalo,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:32:46 EDT,SQL_state=00000LOG:  duration: 0.061 ms  execute <unnamed>: SELECT "repository"."c_token" AS "token", "repository"."c_path" AS "path"
  FROM
  t_zs01_sys_dm AS "repository" INNER JOIN t_e20so1_document_bigint AS "documentbigint" ON  "repository"."c_repositoryid" = "documentbigint"."c_value"
       WHERE  "documentbigint"."c_documentid" = 201989
     AND  "documentbigint"."c_fieldid" = 'b035afc8-439f-4f2c-a9ae-e054052511fd'


after Some time i was receiving this error


timestamp=2018-05-07 00:34:11.218 EDT,user=postgres,db= Ozalo ,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:34:10 EDT,SQL_state=57P02WARNING:  terminating connection because of crash of another server process
timestamp=2018-05-07 00:34:11.218 EDT,user=postgres,db= Ozalo ,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:34:10 EDT,SQL_state=57P02DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
timestamp=2018-05-07 00:34:11.218 EDT,user=postgres,db= Ozalo ,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:34:10 EDT,SQL_state=57P02HINT:  In a moment you should be able to reconnect to the database and repeat your command.

timestamp=2018-05-07 00:34:11.357 EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 10:08:20 EDT,SQL_state=57P02DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
timestamp=2018-05-07 00:34:11.357 EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 10:08:20 EDT,SQL_state=57P02HINT:  In a moment you should be able to reconnect to the database and repeat your command.
timestamp=2018-05-07 00:34:11.379 EDT,user=postgres,db= Ozalo,app=[unknown],,transaction-ID=0,session_start_timestamp=2018-05-07 00:34:11 EDT,SQL_state=57P03FATAL:  the database system is in recovery mode
timestamp=2018-05-07 00:34:11.381 EDT,user=postgres,db= Ozalo,app=[unknown],transaction-ID=0,session_start_timestamp=2018-05-07 00:34:11 EDT,SQL_state=57P03FATAL:  the database system is in recovery mode



what is the reason it corrupted share memory ?

At this point I don't know. More information is required:

1) What OS and version?

2) System memory size is ?

3) Where was Postgres installed from?

4) What is session_start_timestamp tracking? In other words what does it match up to here:

https://www.postgresql.org/docs/10/static/runtime-config-logging.html

log_line_prefix

5) What is the process that started 2018-04-26 10:08:19?

6) The query in the log started at 2018-05-07 00:32:46, what is it doing?




what is meant by The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. ?
  how much of share memory if its consume  it will crash

  Please can any one help me in this
or else what is the reason of crash of DB server

Current using 10.3
|
  Current Config

max_connections = 5000||
shared_buffers = 7680MB ||||
effective_cache_size = 23040MB |||

|maintenance_work_mem = 1920MB min_wal_size = 1GB max_wal_size = 2GB checkpoint_completion_target = 0.7 wal_buffers = 16MB default_statistics_target = 100 random_page_cost = 1.1 effective_io_concurrency = 200 max_worker_processes = 16 max_parallel_workers_per_gather = 8 max_parallel_workers = 16 work_mem = 196kB|



Thanks




--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Why is my Postgre server went in recovery mode all in sudden

От
Adrian Klaver
Дата:
On 05/10/2018 04:42 AM, nikhil raj wrote:
> 1) What OS and version?
> 
> ans: windows 2012R2
> 
> 2) System memory size is ?
> 
> ans: 32GB
> 
> 3) What is session_start_timestamp tracking? In other words what does it 
> match up to here:
> ans: This is the format to
> timestamp=%m,user=%u,db=%d,app=%a,client=%h,transaction-ID=%x,session_start_timestamp=%s,SQL_state=%e
>    The time when the session  is session_started _start_timestamp
> 
> 4) What is the process that started 2018-04-26 10:08:19?
> 
> ans:  Its an backed process of Postgre started parallel worker
> 
> 5) The query in the log started at 2018-05-07 00:32:46, what is it doing?
> ans : query running from an agent if any processing is going on the 
> front end some of the query will run

So there is process that opens a connection, leaves it open and then 
periodically runs queries?

Has the database shutdown happened again?




-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Why is my Postgre server went in recovery mode all in sudden

От
nikhil raj
Дата:
Hi Adrian Klaver,

No its like it opens  connection and once the query gets execute it goes to idle connection and again idle connection time out i kept it for 2 mins .

It never happen again.

need few answers
why will share buffer crashes ?

Thanks,



On Thu, May 10, 2018 at 7:28 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 05/10/2018 04:42 AM, nikhil raj wrote:
1) What OS and version?

ans: windows 2012R2

2) System memory size is ?

ans: 32GB

3) What is session_start_timestamp tracking? In other words what does it match up to here:
ans: This is the format to
timestamp=%m,user=%u,db=%d,app=%a,client=%h,transaction-ID=%x,session_start_timestamp=%s,SQL_state=%e
   The time when the session  is session_started _start_timestamp

4) What is the process that started 2018-04-26 10:08:19?

ans:  Its an backed process of Postgre started parallel worker

5) The query in the log started at 2018-05-07 00:32:46, what is it doing?
ans : query running from an agent if any processing is going on the front end some of the query will run

So there is process that opens a connection, leaves it open and then periodically runs queries?

Has the database shutdown happened again?




--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Why is my Postgre server went in recovery mode all in sudden

От
Adrian Klaver
Дата:
On 05/11/2018 03:30 AM, nikhil raj wrote:
> Hi Adrian Klaver,
> 
> No its like it opens  connection and once the query gets execute it goes 
> to idle connection and again idle connection time out i kept it for 2 mins .
> 
> It never happen again.
> 
> need few answers
> why will share buffer crashes ?

 From the error message it was maybe shared memory:

"The postmaster has commanded this server process to roll back the 
current transaction and exit, because another server process exited 
abnormally and possibly corrupted shared memory."

I don't have an answer as it is not clear to me what process triggered 
the exit. My suspicion is it was this:

"timestamp=2018-05-07 00:34:11.209 
EDT,user=,db=,app=,client=,transaction-ID=0,session_start_timestamp=2018-04-26 
10:08:19 EDT,SQL_state=00000LOG:  worker process: parallel worker for 
PID 2864 (PID 4476) exited with exit code 0"

To me it looks like a parallel query process that ran for 11 days.
Is that possible and do you know what the query was doing?

> 
> Thanks,
> 
> 
> 



-- 
Adrian Klaver
adrian.klaver@aklaver.com