Thread: streaming vs wal shipping issue


streaming vs wal shipping issue

From
Tomáš Uko
Date:

Hi all,

 

We have recently found a weird issue with WAL files still being present after recovery (in fact, they are never deleted). The problem exists on several versions (starting from 9.4 up to 9.6, possibly later).

Our setup is this:

- Master in one DC, with WAL shipping to a different storage server

  - WAL shipping via archive_command (pigz and scp - and we check return codes :) )

- Several replicas in different DCs, configured to use streaming replication as well as WAL restore from that storage, with archive_mode = 'on'

  - snippet of recovery.conf:

standby_mode = 'on'

primary_conninfo = 'host=XY port=5432 user=AAA password=PASSWORD sslmode=verify-full'

restore_command = '~/bin/wal_handler.sh -r -h REMOTE_STORAGE -f %f -p %p -d LOCATION_ON_REMOTE_STORAGE'

 

This custom script does this:

  [ -f "PARAM_P/PARAM_F" ] && { echo "Logfile PARAM_P/PARAM_F already exists"; }

  echo "Restoring xlog PARAM_F"

  scp -i $SshKey -o PasswordAuthentication=no -o StrictHostKeyChecking=no REMOTE_STORAGE:LOCATION_ON_REMOTE_STORAGE/PARAM_F.gz PARAM_P

  [ $? -eq 0 ] || { echo "Failed to retrieve log PARAM_F from REMOTE_STORAGE"; exit 1; }

  unpigz -f PARAM_P/PARAM_F.gz

  [ $? -eq 0 ] || { echo "Failed to unpack PARAM_P/PARAM_F.gz"; exit 1; }
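For reference, the restore_command pattern shown in the PostgreSQL documentation restores the requested file directly to the full path passed as %p. A minimal sketch of that pattern for a gzip-compressed archive, assuming a locally mounted archive directory (/mnt/archive is just a placeholder, unlike our scp-based setup):

  restore_command = 'unpigz -c /mnt/archive/%f.gz > %p'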

 

When our master comes under a massive DML workload (on a daily basis), some replicas stop using streaming replication and switch to WAL restore (so far so good). Messages in the logs:

 

2019-02-25 23:10:03.087 CET, pid: 2100971, client: , db: , usr: ; LOG:  recovery restart point at CEA9/3007188

2019-02-25 23:10:03.087 CET, pid: 2100971, client: , db: , usr: ; DETAIL:  last completed transaction was at log time 2019-02-25 22:27:14.059533+01

2019-02-25 23:10:36.829 CET, pid: 2100971, client: , db: , usr: ; LOG:  restartpoint starting: xlog

2019-02-25 23:10:47.832 CET, pid: 2100974, client: , db: , usr: ; FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000020000CEA90000008A has already been removed

 

But once they start restoring WALs, those files are never deleted afterwards. We've discovered that when there is a restartpoint, the next WAL segment in the sequence gets created (in the pg_xlog dir), and it is immediately overwritten by the restore.

With overwriting enabled, the logs are full of:

Restoring xlog 000000020000CE3900000029

Restoring xlog 000000020000CE390000002A

Restoring xlog 000000020000CE390000002B

Restoring xlog 000000020000CE390000002C

Restoring xlog 000000020000CE390000002D

Restoring xlog 000000020000CE390000002E

And once it catches up:

scp: /LOCATION_ON_REMOTE_STORAGE/000000020000CE8B00000017.gz: No such file or directory

Failed to retrieve log 000000020000CE8B00000017 from REMOTE_STORAGE

2019-02-25 09:09:28.490 CET, pid: 2090891, client: , db: , usr: ; LOG:  started streaming WAL from primary at CE8B/17000000 on timeline 2

 

When we disabled overwriting of WALs during restore, we got this error message:

2019-02-25 23:10:03.086 CET, pid: 2100971, client: , db: , usr: ; LOG:  restartpoint complete: wrote 142421 buffers (18.1%); 0 transaction log file(s) added, 68 removed, 0 recycled; write=24.086 s, sync=0.022 s, total=24.424 s; sync files=74, longest=0.005 s, average=0.000 s; distance=1114130 kB, estimate=1114148 kB

2019-02-25 23:10:03.087 CET, pid: 2100971, client: , db: , usr: ; LOG:  recovery restart point at CEA9/3007188

2019-02-25 23:10:03.087 CET, pid: 2100971, client: , db: , usr: ; DETAIL:  last completed transaction was at log time 2019-02-25 22:27:14.059533+01

2019-02-25 23:10:36.829 CET, pid: 2100971, client: , db: , usr: ; LOG:  restartpoint starting: xlog

Logfile pg_xlog/000000020000CEA90000008A already exists

Restoring xlog 000000020000CEA90000008A

unpigz: abort: write error on pg_xlog/000000020000CEA90000008A (Inappropriate ioctl for device)

Which means the WAL file is already present in the pg_xlog directory.

 

Thanks for any advice

 

 

 

 

Re: streaming vs wal shipping issue

From
Shreeyansh Dba
Date:
Hi Tomáš Uko,

Here is a detailed explanation of streaming vs WAL shipping:
"Streaming replication" refers to continuous sending of WAL records over a TCP/IP connection between the master and the replica, using the walsender protocol over replication connections. The master reads its own WAL from pg_xlog and sends it to the replica on demand. It's configured with a primary_conninfo directive in recovery.conf and pg_hba.conf entries on the master to permit replicationconnections. You also need wal_keep_segments and some other options.

"Log shipping" refers to periodic sending of WAL records as whole WAL archives via a file transfer protocol to an archive location from which the replica can then fetch them. It's configured with a restore_command directive in recovery.conf and an archive_command in the master. PostgreSQL doesn't care where the files are or how they're transferred, only that the archive_command puts them there and the restore_command fetches the required archive.

Streaming replication doesn't have as much lag, as records are sent as they are generated. However, it requires both master and replica to be online and able to communicate directly. It also requires the replica to keep up well enough that the master still has on-disk copies of the WAL the replica needs, and generally requires you to spend extra pg_xlog space on retaining extra WAL for the replica.

Thanks & Regards,
Shreeyansh DBA Team
www.shreeyansh.com



RE: streaming vs wal shipping issue

From
Tomáš Uko
Date:

Hi,

 

Just to clarify, we use both methods; everything is set up correctly and works when there is no issue ... e.g. see this line from the log:

2019-02-25 09:09:28.490 CET, pid: 2090891, client: , db: , usr: ; LOG:  started streaming WAL from primary at CE8B/17000000 on timeline 2

 

But the problem appears once there is a large enough replication lag: streaming replication gives up and the standby reverts to replaying WALs (which is the point of having both methods, so we don't need to re-initialize the slave/replica from the master again), until it catches up with the master and switches back to streaming replication. The only problem is that those WAL files copied from the "master" (technically from the shared storage server) are never deleted from the slave/replica pg_xlog directory after they are replayed on the slave/replica.

 

Best regards

Tomas Uko

 


Re: streaming vs wal shipping issue

From
Shreeyansh Dba
Date:
PostgreSQL doesn't clean up files copied by your archive_command. You need a separate task to clean those out. PostgreSQL's active WAL (wal_keep_segments etc.) lives in the data/pg_xlog directory.
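If the archive is reachable as a local path from the standby, one common way to do that is archive_cleanup_command with pg_archivecleanup in recovery.conf. A sketch (the path /mnt/archive is a placeholder; with a remote scp archive like yours, the equivalent cleanup would have to run on the storage host instead):

  # recovery.conf on a standby that is allowed to prune the shared archive;
  # -x .gz tells pg_archivecleanup to account for the compressed file names
  archive_cleanup_command = 'pg_archivecleanup -x .gz /mnt/archive %r'

Note that when several standbys restore from the same archive, only the most-lagging one (or an external job that tracks all of them) should be allowed to delete files, otherwise a standby that falls behind can lose segments it still needs.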

Thanks & Regards,
Shreeyansh DBA Team
www.shreeyansh.com



RE: streaming vs wal shipping issue

From
Tomáš Uko
Date:

Hi again,

 

The point is that the files which are not deleted after recovery are WAL files sitting in the DATA_DIR/pg_xlog directory ...
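As a quick illustration (a hypothetical diagnostic sketch, not something from our setup; it assumes pg_controldata reports the "REDO WAL file" field and that $PGDATA points at the standby's data directory), these are the segments that sort before the last restartpoint's REDO segment and so should normally have been removed or recycled already:

  redo_seg=$(pg_controldata "$PGDATA" | awk '/REDO WAL file/ {print $NF}')
  cd "$PGDATA/pg_xlog"
  # WAL segment names are fixed-width hex, so plain string comparison orders them correctly
  ls | grep -E '^[0-9A-F]{24}$' | awk -v redo="$redo_seg" '$0 < redo {print "stale segment:", $0}'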

 

Regards

Tomas Uko

 

 
