Discussion: WAL restore is very slow
Hi
We have PG-14 with a huge data set of 14 TB running on r5b.2xlarge. We have set up WAL archiving and are restoring the archived WAL onto a replica server. The WAL restore on the replica is very slow, and we are not able to achieve the intended 4-hour delayed replica. It is always 30 hours behind given the heavy WAL generation.
I have checked the following, and they all look fine:
1. Bottlenecks on the replica server
2. Memory consumption and swap
3. EFS IO throughput
4. checkpoint_completion_target = 0.9
5. wal_buffers = 16MB
6. wal_log_hints = on
7. Verified logs and didn't find anything useful related to the issue
Can you please suggest how to improve the WAL restore performance?
Thank you
Madhu Sudan
First off, why are you still running on a rather small r5 instance? AWS has had r6 instances available for some time, and they're faster and more efficient. Of the 30 or so instances I take care of, I haven't had any r5s in quite some time. It also sounds like you're running on an EC2 instance. Is there some reason you haven't gone to RDS? Cluster configuration in RDS uses an internal, RDS-specific replication method that does not involve WAL files.

John

> On Aug 29, 2022, at 6:10 AM, Madhu Sudan <madhusudan0429@gmail.com> wrote:
>
> Hi
>
> We have PG-14 with a huge data set of 14 TB running on r5b.2xlarge. We have set up WAL archiving and restoring them onto a replica server. The WAL restore on the replica is very slow and we are not able to achieve the 4 hour delayed replica. It is always behind 30 hrs with the huge WAL generation.
>
> I have checked the following and they look fine
> 1. Bottlenecks on the replica server
> 2. Memory consumption and swap
> 3. EFS IO throughput
> 4. checkpoint_completion_target = 0.9
> 5. wal_buffers = 16MB
> 6. wal_log_hints = on
> 7. Verified logs and didn't find anything useful related to the issue
>
> Can you please suggest how to improve the WAL restore performance
>
> Thank you
> Madhu Sudan
On Mon, Aug 29, 2022 at 6:10 AM Madhu Sudan <madhusudan0429@gmail.com> wrote:
> Hi
>
> We have PG-14 with a huge data set of 14 TB running on r5b.2xlarge. We have set up WAL archiving and restoring them onto a replica server. The WAL restore on the replica is very slow and we are not able to achieve the 4 hour delayed replica. It is always behind 30 hrs with the huge WAL generation.
>
> I have checked the following and they look fine
> 1. Bottlenecks on the replica server
How did you check for bottlenecks, and what did you see to conclude it looked fine? Clearly there is a bottleneck somewhere.
Cheers,
Jeff
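
A rough way to put numbers on the lag discussed in this thread, as a minimal sketch only. The function name `catch_up_hours` and the `replay_ratio` input are hypothetical, not something from the thread. `replay_ratio` is assumed to be measured on the replica as hours of primary WAL replayed per wall-clock hour, for example by sampling `now() - pg_last_xact_replay_timestamp()` twice and comparing. The 30-hour and 4-hour figures come from the original post; the ratio of 1.5 is purely illustrative.

```python
from typing import Optional


def catch_up_hours(current_lag_h: float,
                   target_lag_h: float,
                   replay_ratio: float) -> Optional[float]:
    """Estimate wall-clock hours until replica lag drops to the target.

    replay_ratio is a hypothetical measurement: hours of primary WAL
    that the replica replays per wall-clock hour. The lag therefore
    changes by (1 - replay_ratio) hours for every hour that passes.

    Returns None when the replica replays no faster than the primary
    generates WAL, because the lag then never shrinks.
    """
    if current_lag_h <= target_lag_h:
        return 0.0  # already at or below the target delay
    if replay_ratio <= 1.0:
        return None  # replay cannot outpace WAL generation
    return (current_lag_h - target_lag_h) / (replay_ratio - 1.0)


if __name__ == "__main__":
    # 30 h behind, 4 h target (from the post); ratio 1.5 is illustrative.
    print(catch_up_hours(30, 4, 1.5))
    # A ratio at or below 1.0 means the replica keeps falling behind.
    print(catch_up_hours(30, 4, 0.9))
```

If the measured ratio stays at or below 1.0, no amount of waiting closes the gap, which would point at replay throughput itself (archive fetch or apply) as the bottleneck to look for.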