Обсуждение: hot standby in Postgresql 12
Preparing hot standby in PostgreSQL 12
I have created a standby.signal file and added the parameter hot_standby='on' in postgresql.conf file but i am still not able to query the standby database
Database is not getting open for reading.
postgres@ip-172-31-31-150:/u01/db10/pg_wal> psql -p 5490
psql: error: FATAL: the database system is starting up
psql: error: FATAL: the database system is starting up
Best Regards,
Ankush Chawla
Ankush Chawla
On Monday, January 18, 2021, Ankush Chawla <ankushchawla03@gmail.com> wrote:
Preparing hot standby in PostgreSQL 12I have created a standby.signal file and added the parameter hot_standby='on' in postgresql.conf file but i am still not able to query the standby databaseDatabase is not getting open for reading.postgres@ip-172-31-31-150:/u01/db10/pg_wal> psql -p 5490
psql: error: FATAL: the database system is starting up
The database is starting up, this is expected. You need to consult the server logs for relevant information pertaining to the startup/restore process.
David J.
log files ;
2021-01-18 07:31:00.946 UTC [21072]LOG: database system was interrupted; last known up at 2021-01-18 07:30:07 UTC
cp: cannot stat '/u01/archive/00000004.history': No such file or directory
2021-01-18 07:31:00.963 UTC [21072]LOG: entering standby mode
2021-01-18 07:31:00.966 UTC [21072]LOG: restored log file "00000003.history" from archive
2021-01-18 07:31:00.981 UTC [21072]LOG: restored log file "000000030000000000000034" from archive
2021-01-18 07:31:01.029 UTC [21072]LOG: redo starts at 0/34000028
2021-01-18 07:31:01.030 UTC [21072]LOG: consistent recovery state reached at 0/34000138
cp: cannot stat '/u01/archive/000000030000000000000035': No such file or directory
2021-01-18 07:31:01.043 UTC [21079]LOG: started streaming WAL from primary at 0/35000000 on timeline 3
2021-01-18 07:31:10.242 UTC postgres postgres [21082]FATAL: the database system is starting up
2021-01-18 07:53:42.915 UTC postgres postgres [21160]FATAL: the database system is starting up
2021-01-18 07:31:00.946 UTC [21072]LOG: database system was interrupted; last known up at 2021-01-18 07:30:07 UTC
cp: cannot stat '/u01/archive/00000004.history': No such file or directory
2021-01-18 07:31:00.963 UTC [21072]LOG: entering standby mode
2021-01-18 07:31:00.966 UTC [21072]LOG: restored log file "00000003.history" from archive
2021-01-18 07:31:00.981 UTC [21072]LOG: restored log file "000000030000000000000034" from archive
2021-01-18 07:31:01.029 UTC [21072]LOG: redo starts at 0/34000028
2021-01-18 07:31:01.030 UTC [21072]LOG: consistent recovery state reached at 0/34000138
cp: cannot stat '/u01/archive/000000030000000000000035': No such file or directory
2021-01-18 07:31:01.043 UTC [21079]LOG: started streaming WAL from primary at 0/35000000 on timeline 3
2021-01-18 07:31:10.242 UTC postgres postgres [21082]FATAL: the database system is starting up
2021-01-18 07:53:42.915 UTC postgres postgres [21160]FATAL: the database system is starting up
On Tue, Jan 19, 2021 at 9:01 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Monday, January 18, 2021, Ankush Chawla <ankushchawla03@gmail.com> wrote:Preparing hot standby in PostgreSQL 12I have created a standby.signal file and added the parameter hot_standby='on' in postgresql.conf file but i am still not able to query the standby databaseDatabase is not getting open for reading.postgres@ip-172-31-31-150:/u01/db10/pg_wal> psql -p 5490
psql: error: FATAL: the database system is starting upThe database is starting up, this is expected. You need to consult the server logs for relevant information pertaining to the startup/restore process.David J.
Best Regards,
Ankush Chawla
Ankush Chawla
On Monday, January 18, 2021, Ankush Chawla <ankushchawla03@gmail.com> wrote:
log files ;
2021-01-18 07:31:00.946 UTC [21072]LOG: database system was interrupted; last known up at 2021-01-18 07:30:07 UTC
cp: cannot stat '/u01/archive/00000004.history': No such file or directory
2021-01-18 07:31:00.963 UTC [21072]LOG: entering standby mode
2021-01-18 07:31:00.966 UTC [21072]LOG: restored log file "00000003.history" from archive
2021-01-18 07:31:00.981 UTC [21072]LOG: restored log file "000000030000000000000034" from archive
2021-01-18 07:31:01.029 UTC [21072]LOG: redo starts at 0/34000028
2021-01-18 07:31:01.030 UTC [21072]LOG: consistent recovery state reached at 0/34000138
cp: cannot stat '/u01/archive/000000030000000000000035': No such file or directory
2021-01-18 07:31:01.043 UTC [21079]LOG: started streaming WAL from primary at 0/35000000 on timeline 3
And now the standby is waiting for the next wal file to be archived by the primary. You should probably set archive_timeout to a non-zero value since the primary doesn’t seem very busy (though you also still need to do at least one write, empty wal files don’t get rotated out and archived.)
David J.
Ok, thanks.
But what is the logic behind that ... does the database only comes up after a switch
On Tue, Jan 19, 2021 at 9:18 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Monday, January 18, 2021, Ankush Chawla <ankushchawla03@gmail.com> wrote:log files ;
2021-01-18 07:31:00.946 UTC [21072]LOG: database system was interrupted; last known up at 2021-01-18 07:30:07 UTC
cp: cannot stat '/u01/archive/00000004.history': No such file or directory
2021-01-18 07:31:00.963 UTC [21072]LOG: entering standby mode
2021-01-18 07:31:00.966 UTC [21072]LOG: restored log file "00000003.history" from archive
2021-01-18 07:31:00.981 UTC [21072]LOG: restored log file "000000030000000000000034" from archive
2021-01-18 07:31:01.029 UTC [21072]LOG: redo starts at 0/34000028
2021-01-18 07:31:01.030 UTC [21072]LOG: consistent recovery state reached at 0/34000138
cp: cannot stat '/u01/archive/000000030000000000000035': No such file or directory
2021-01-18 07:31:01.043 UTC [21079]LOG: started streaming WAL from primary at 0/35000000 on timeline 3And now the standby is waiting for the next wal file to be archived by the primary. You should probably set archive_timeout to a non-zero value since the primary doesn’t seem very busy (though you also still need to do at least one write, empty wal files don’t get rotated out and archived.)David J.
Best Regards,
Ankush Chawla
Ankush Chawla
On Monday, January 18, 2021, Ankush Chawla <ankushchawla03@gmail.com> wrote:
Ok, thanks.But what is the logic behind that ... does the database only comes up after a switch
Don’t know...just reading the documentation and it says (I think) that the next segment is required before the server will accept connections.
David J.
Hi David, > On 19. Jan, 2021, at 04:48, David G. Johnston <david.g.johnston@gmail.com> wrote: > > And now the standby is waiting for the next wal file to be archived by the primary. You should probably set archive_timeoutto a non-zero value since the primary doesn’t seem very busy (though you also still need to do at least onewrite, empty wal files don’t get rotated out and archived.) you should be able to get around your problem by doing the following on the primary database: checkpoint; select pg_switch_wal(); The pg_switch_wal() alone will not trigger WAL file generation if no transaction has been going on. That's why I recommendforcing a checkpoint before that (same with Oracle by the way). As soon as a WAL is generated, it should get shipped to the standby database. Then it should open. Cheers, Paul
Still the same :
checkpoint;
select pg_switch_wal();
select pg_switch_wal();
loading few tables and did a switch as well , still the same, not able to connect to standby database
psql: error: FATAL: the database system is starting up
On Tue, Jan 19, 2021 at 11:18 AM Paul Förster <paul.foerster@gmail.com> wrote:
Hi David,
> On 19. Jan, 2021, at 04:48, David G. Johnston <david.g.johnston@gmail.com> wrote:
>
> And now the standby is waiting for the next wal file to be archived by the primary. You should probably set archive_timeout to a non-zero value since the primary doesn’t seem very busy (though you also still need to do at least one write, empty wal files don’t get rotated out and archived.)
you should be able to get around your problem by doing the following on the primary database:
checkpoint;
select pg_switch_wal();
The pg_switch_wal() alone will not trigger WAL file generation if no transaction has been going on. That's why I recommend forcing a checkpoint before that (same with Oracle by the way).
As soon as a WAL is generated, it should get shipped to the standby database. Then it should open.
Cheers,
Paul
Best Regards,
Ankush Chawla
Ankush Chawla
On Monday, January 18, 2021, Ankush Chawla <ankushchawla03@gmail.com> wrote:
Still the same :checkpoint;
select pg_switch_wal();loading few tables and did a switch as well , still the same, not able to connect to standby databasepsql: error: FATAL: the database system is starting up
And again, server log file?
David J.
Hey,
maybe your needed wal files were already archived and are not available anymore?
In this case you have to set wal_keep_size to a proper value to keep the needed wal files on your primary.
Best,
Dom
On 19.01.21 07:26, David G. Johnston wrote:
On Monday, January 18, 2021, Ankush Chawla <ankushchawla03@gmail.com> wrote:Still the same :checkpoint;
select pg_switch_wal();loading few tables and did a switch as well , still the same, not able to connect to standby databasepsql: error: FATAL: the database system is starting upAnd again, server log file?David J.
-- Dominik Lehnen Dev-Operations t. +49 (0)30 288 768 855 e. dominik.lehnen@gameduell.de GameDuell Bringing people together to have a good time with games. GameDuell GmbH, Taubenstraße 24-25, 10117 Berlin, Germany We are looking for people to join our team! http://inside.gameduell.com HAUPTSITZ DER GESELLSCHAFT / HEADQUARTERS: BERLIN, DEUTSCHLAND / GERMANY GESCHÄFTSFÜHRER / MANAGING DIRECTORS: KAI BOLIK & BORIS WASMUTH HANDELSREGISTER / COMMERCIAL REGISTER: AG BERLIN-CHARLOTTENBURG, HRB NR: 90891 UMSATZSTEUER-IDENTIFIKATIONSNUMMER / VAT ID: DE 813829498 (US Office: GameDuell Inc., 1156 Clement St., San Francisco, CA 94118, USA)
Looks like you have mentioned "hot_standby= on" on both places i.e recovery.conf and postgresql.conf. keep it at only one place and check.
On Tue, 19 Jan 2021, 8:51 am Ankush Chawla, <ankushchawla03@gmail.com> wrote:
Preparing hot standby in PostgreSQL 12I have created a standby.signal file and added the parameter hot_standby='on' in postgresql.conf file but i am still not able to query the standby databaseDatabase is not getting open for reading.postgres@ip-172-31-31-150:/u01/db10/pg_wal> psql -p 5490
psql: error: FATAL: the database system is starting up--Best Regards,
Ankush Chawla
it is there only in postgresql.conf .
I am usign postgresql 12 so not using recovery.conf but using standby.signal
On Tue, Jan 19, 2021 at 2:49 PM Ganesh Korde <ganeshakorde@gmail.com> wrote:
Looks like you have mentioned "hot_standby= on" on both places i.e recovery.conf and postgresql.conf. keep it at only one place and check.On Tue, 19 Jan 2021, 8:51 am Ankush Chawla, <ankushchawla03@gmail.com> wrote:Preparing hot standby in PostgreSQL 12I have created a standby.signal file and added the parameter hot_standby='on' in postgresql.conf file but i am still not able to query the standby databaseDatabase is not getting open for reading.postgres@ip-172-31-31-150:/u01/db10/pg_wal> psql -p 5490
psql: error: FATAL: the database system is starting up--Best Regards,
Ankush Chawla
Best Regards,
Ankush Chawla
Ankush Chawla
this was steps followed at standby site
1. Create file standby.signal
2. Edit postgresql.conf In Standby
hot_standby=on (off by default will not allow queries to run only for high availability)
primary_conninfo = 'user=postgres password=<masterdbpassword> host=172.31.37.228 port=5333'
primary_slot_name = 'standby'
recovery_target_timeline = 'latest'
3. Save and start the Standby
Pg_Ctl start –D <>-l logfile
On Tue, Jan 19, 2021 at 2:49 PM Ganesh Korde <ganeshakorde@gmail.com> wrote:
Looks like you have mentioned "hot_standby= on" on both places i.e recovery.conf and postgresql.conf. keep it at only one place and check.On Tue, 19 Jan 2021, 8:51 am Ankush Chawla, <ankushchawla03@gmail.com> wrote:Preparing hot standby in PostgreSQL 12I have created a standby.signal file and added the parameter hot_standby='on' in postgresql.conf file but i am still not able to query the standby databaseDatabase is not getting open for reading.postgres@ip-172-31-31-150:/u01/db10/pg_wal> psql -p 5490
psql: error: FATAL: the database system is starting up--Best Regards,
Ankush Chawla
Best Regards,
Ankush Chawla
Ankush Chawla
On Mon, 2021-01-18 at 20:48 -0700, David G. Johnston wrote: > On Monday, January 18, 2021, Ankush Chawla <ankushchawla03@gmail.com> wrote: > > log files ; > > > > > > 2021-01-18 07:31:00.946 UTC [21072]LOG: database system was interrupted; last known up at 2021-01-18 07:30:07 UTC > > cp: cannot stat '/u01/archive/00000004.history': No such file or directory > > 2021-01-18 07:31:00.963 UTC [21072]LOG: entering standby mode > > 2021-01-18 07:31:00.966 UTC [21072]LOG: restored log file "00000003.history" from archive > > 2021-01-18 07:31:00.981 UTC [21072]LOG: restored log file "000000030000000000000034" from archive > > 2021-01-18 07:31:01.029 UTC [21072]LOG: redo starts at 0/34000028 > > 2021-01-18 07:31:01.030 UTC [21072]LOG: consistent recovery state reached at 0/34000138 > > cp: cannot stat '/u01/archive/000000030000000000000035': No such file or directory > > 2021-01-18 07:31:01.043 UTC [21079]LOG: started streaming WAL from primary at 0/35000000 on timeline 3 > > > > And now the standby is waiting for the next wal file to be archived by the primary. > You should probably set archive_timeout to a non-zero value since the primary doesn’t > seem very busy (though you also still need to do at least one write, empty wal files > don’t get rotated out and archived.) If the standby is *streaming* WAL, it is *not* waiting for a WAL segment to be archived. WAL is streamed to the standby right away. What the standby needs to become available for connections are two events in the WAL stream: - the WAL entry that ends the base backup (BACKUP_END) - a WAL entry with the running transactions on the primary (RUNNING_XACTS) Since we see the message "consistent recovery state reached", the first condition is satisfied. So the standby is waiting for a RUNNING_XACTS. These get written every couple of seconds by the primary server if all is well. So it would be interesting to know what the primary is doing. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Tuesday, January 19, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > cp: cannot stat '/u01/archive/000000030000000000000035': No such file or directory
> > 2021-01-18 07:31:01.043 UTC [21079]LOG: started streaming WAL from primary at 0/35000000 on timeline 3
> >
>
> And now the standby is waiting for the next wal file to be archived by the primary.
> You should probably set archive_timeout to a non-zero value since the primary doesn’t
> seem very busy (though you also still need to do at least one write, empty wal files
> don’t get rotated out and archived.)
If the standby is *streaming* WAL, it is *not* waiting for a WAL segment to be archived.
WAL is streamed to the standby right away.
I took the error from the cp command to mean that this was in fact a warm standby regardless of any claims by the poster. I’ll readily admit I am not that familiar with this stuff though.
David J.
As David highlighted,
>> I took the error from the cp command to mean that this was in fact a warm standby regardless of any claims by the poster. I’ll readily admit I am not that familiar with this stuff though.
Seems your archive command is not working properly. Set it first and then try again.
Also, please share your postgresql.auto.conf ( remove confidential information like - password , username , IP etc. , if any)
Regards
Rajni
On Wed, Jan 20, 2021 at 2:11 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Tuesday, January 19, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > cp: cannot stat '/u01/archive/000000030000000000000035': No such file or directory
> > 2021-01-18 07:31:01.043 UTC [21079]LOG: started streaming WAL from primary at 0/35000000 on timeline 3
> >
>
> And now the standby is waiting for the next wal file to be archived by the primary.
> You should probably set archive_timeout to a non-zero value since the primary doesn’t
> seem very busy (though you also still need to do at least one write, empty wal files
> don’t get rotated out and archived.)
If the standby is *streaming* WAL, it is *not* waiting for a WAL segment to be archived.
WAL is streamed to the standby right away.I took the error from the cp command to mean that this was in fact a warm standby regardless of any claims by the poster. I’ll readily admit I am not that familiar with this stuff though.David J.
On Tue, Jan 19, 2021 at 2:56 PM Rajni Baliyan <saan654@gmail.com> wrote:
As David highlighted,>> I took the error from the cp command to mean that this was in fact a warm standby regardless of any claims by the poster. I’ll readily admit I am not that familiar with this stuff though.Seems your archive command is not working properly. Set it first and then try again.
Don't think so - as Laurenz noted the last line reads:
"2021-01-18 07:31:01.043 UTC [21079]LOG: started streaming WAL from primary at 0/35000000 on timeline 3"
Which indicates that streaming is supposedly working even though the expected data hasn't made it from the primary to the standby.
David J.
Yes David - the problem here is expected data hasn't made it from the primary to the standby even after streaming is started , and there could be many reasons for it.
I will be interested to know -
- How user created standby (pg_basebackup etc.)
- Was there any manual cleanup done on the standby pg_xlog directory?
- Is the walreceiver process running on standby? --> ps -ef | grep postgres
- Is the walsender process running on master ? --> ps -ef | grep postgres
- postgresql.auto.conf file
- Database system identifier for both master and standby ( use utility - pg_controldata -D <master/standby data dir> liek - pg_controldata -D /user/app/master)
I would do below to narrow down the issue-
- Stop postgres on standby or kill the postmaster process.
- Increase logging on standby
- Restart standby.
- Check standby log for status and error.
- Check connectivity between both primary and standby server- try connecting to primary as per credentials in primary_conninfo ( psql -h 172.31.37.228 -p 5333 -U postgres -d postgres)
- Check walreceiver process running on standby and the walsender process running on master.
- Checkpoint and switch wal on master
- Check logs and standby status
Share how above goes and also log files from both master and standby.
Regards
Rajni
On Wed, Jan 20, 2021 at 9:04 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Tue, Jan 19, 2021 at 2:56 PM Rajni Baliyan <saan654@gmail.com> wrote:As David highlighted,>> I took the error from the cp command to mean that this was in fact a warm standby regardless of any claims by the poster. I’ll readily admit I am not that familiar with this stuff though.Seems your archive command is not working properly. Set it first and then try again.Don't think so - as Laurenz noted the last line reads:"2021-01-18 07:31:01.043 UTC [21079]LOG: started streaming WAL from primary at 0/35000000 on timeline 3"Which indicates that streaming is supposedly working even though the expected data hasn't made it from the primary to the standby.David J.
Streaming is working as expected
Because when I promote the slave all the data has been replicated
Even I was also monitoring the pg_stat_replication and it is streaming the last wal file
On Wed, 20 Jan 2021 at 4:46 AM, Rajni Baliyan <saan654@gmail.com> wrote:
Yes David - the problem here is expected data hasn't made it from the primary to the standby even after streaming is started , and there could be many reasons for it.I will be interested to know -- How user created standby (pg_basebackup etc.)- Was there any manual cleanup done on the standby pg_xlog directory?- Is the walreceiver process running on standby? --> ps -ef | grep postgres- Is the walsender process running on master ? --> ps -ef | grep postgres- postgresql.auto.conf file- Database system identifier for both master and standby ( use utility - pg_controldata -D <master/standby data dir> liek - pg_controldata -D /user/app/master)I would do below to narrow down the issue-- Stop postgres on standby or kill the postmaster process.- Increase logging on standby- Restart standby.- Check standby log for status and error.- Check connectivity between both primary and standby server- try connecting to primary as per credentials in primary_conninfo ( psql -h 172.31.37.228 -p 5333 -U postgres -d postgres)- Check walreceiver process running on standby and the walsender process running on master.- Checkpoint and switch wal on master- Check logs and standby statusShare how above goes and also log files from both master and standby.RegardsRajniOn Wed, Jan 20, 2021 at 9:04 AM David G. Johnston <david.g.johnston@gmail.com> wrote:On Tue, Jan 19, 2021 at 2:56 PM Rajni Baliyan <saan654@gmail.com> wrote:As David highlighted,>> I took the error from the cp command to mean that this was in fact a warm standby regardless of any claims by the poster. I’ll readily admit I am not that familiar with this stuff though.Seems your archive command is not working properly. Set it first and then try again.Don't think so - as Laurenz noted the last line reads:"2021-01-18 07:31:01.043 UTC [21079]LOG: started streaming WAL from primary at 0/35000000 on timeline 3"Which indicates that streaming is supposedly working even though the expected data hasn't made it from the primary to the standby.David J.
Best Regards,
Ankush Chawla
Ankush Chawla