Thread: RDS No free space

RDS No free space

From: Wells Oliver
Date: May 21, 2023, 20:15:12

I am seeing a lot of this this morning:

ERROR: could not extend file "base/16411/616989679.62": No space left on device

2023-05-21 17:11:45 UTC::@:[386]:LOG: all server processes terminated; reinitializing

2023-05-21 17:04:02 UTC:172.31.21.22(55060):woliver@db:[31430]:ERROR: could not extend file "base/16411/616989585.45": No space left on device

Etc.

However, when I can connect, the DB is well below the provisioned space.

I am trying to figure this out but not sure where to turn yet.

Wells Oliver
wells.oliver@gmail.com

Re: RDS No free space

From: Tom Lane
Date: May 21, 2023, 20:23:08

Wells Oliver <wells.oliver@gmail.com> writes:
> I am seeing a lot of this morning:
> ERROR: could not extend file "base/16411/616989679.62": No space left on
> devi2023-05-21 17:11:45 UTC::@:[386]:LOG: all server processes terminated;
> reinitializing

*Something* is preventing PG from using more disk space.  Check ulimit,
cgroups, container limits, etc.

            regards, tom lane

Re: RDS No free space

From: Wells Oliver
Date: May 21, 2023, 20:37:40

So we run on RDS, and we clearly used up all of our provisioned storage. However, I am baffled, and while I am emailing our AWS support, I wondered if this list might point me in some direction too.

Our provisioned storage was 15TB. The size of our database -- shown in pg_database -- is only 6TB. What in the world could be using that remaining space? I am at a loss, that's a _ton_ of space being used up. Is it some temporary allocation during script execution (seems ginormous, impossible)? Is it some WAL thing?

I know this is part AWS RDS and part PG, but it's difficult to track down. I welcome any ideas to dig into.
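For comparing database size against provisioned storage, a minimal sketch of how the per-database figures can be pulled is shown here. It relies on `pg_database_size()`, which counts only the files belonging to each database; WAL kept in `pg_wal` is not included, which is one way the reported size can sit far below what the volume actually holds. The `has_database_privilege` filter is an assumption added so the query does not fail on databases the current role cannot connect to.

```sql
-- Per-database on-disk size as PostgreSQL reports it.
-- WAL (pg_wal) is NOT part of these numbers.
SELECT datname,
       pg_size_pretty(pg_database_size(oid)) AS db_size
FROM pg_database
WHERE has_database_privilege(datname, 'CONNECT')  -- skip databases we cannot inspect
ORDER BY pg_database_size(oid) DESC;

-- Total across all inspectable databases, for comparison with provisioned storage.
SELECT pg_size_pretty(sum(pg_database_size(oid))) AS total_db_size
FROM pg_database
WHERE has_database_privilege(datname, 'CONNECT');
```

If the total is well under the storage the instance reports as used, the gap is coming from something outside the database files themselves.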

Wells Oliver
wells.oliver@gmail.com

Re: RDS No free space

From: Jim Mlodgenski
Date: May 21, 2023, 20:42:49

On Sun, May 21, 2023 at 1:38 PM Wells Oliver <wells.oliver@gmail.com> wrote:

> So we run on RDS, and we clearly used up all of our provisioned storage. However, I am baffled, and while I am emailing our AWS support, I wondered if this list might point me in some direction too.
>
> Our provisioned storage was 15TB. The size of our database -- shown in pg_database -- is only 6TB. What in the world could be using that remaining space? I am at a loss, that's a _ton_ of space being used up. Is it some temporary allocation during script execution (seems ginormous, impossible)? Is it some WAL thing?

A fairly common cause of this is orphaned replication slots, which cause WAL files to be retained. Check to see if there is an inactive slot that may be preventing the files from being removed.

Re: RDS No free space

From: Wells Oliver
Date: May 21, 2023, 20:47:54

I'm not too familiar with that. Can you point me in the direction of some config settings and maybe queries to execute?

Wells Oliver
wells.oliver@gmail.com

Re: RDS No free space

From: Lucio Chiessi
Date: May 21, 2023, 20:52:13

You can check whether you have some heavy queries that were generating a lot of temp files; those can eat up all of the free space.
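One way to look for temp-file pressure, as a rough sketch: the `pg_stat_database` view keeps cumulative counters of temp files and bytes written per database. These are totals since `stats_reset`, not the space in use right now, so they indicate whether temp files are a plausible suspect rather than proving it.

```sql
-- Cumulative temp-file activity per database since the last stats reset.
SELECT datname,
       temp_files,
       pg_size_pretty(temp_bytes) AS temp_written,
       stats_reset
FROM pg_stat_database
WHERE datname IS NOT NULL
ORDER BY temp_bytes DESC;

-- Threshold for logging temp files: -1 disables it, 0 logs every temp file,
-- a positive value logs files of at least that many kilobytes.
SHOW log_temp_files;
```

With `log_temp_files` enabled, the server log records each temp file when it is deleted, along with its size, which helps tie the usage back to specific queries.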

Lucio Chiessi

Senior Database Administrator

Trustly, Inc.

M: +55 27 996360276

Re: RDS No free space

From: Saikat Banerjee
Date: May 21, 2023, 20:58:57

This might help:

https://repost.aws/knowledge-center/diskfull-error-rds-postgresql

RE: RDS No free space

From: Dustin Jantz
Date: May 21, 2023, 20:59:32

To check the replication slots, run the following query. The lag is how much WAL each slot is forcing the server to retain. If a slot is not active, you can drop it, and that will free up the space held by the retained WAL.

SELECT
    rps.slot_name,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), rps.restart_lsn)) AS pretty_replication_slot_lag,
    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), rps.confirmed_flush_lsn)) AS pretty_confirmed_lag,
    rps.active AS slot_active,
    pg_wal_lsn_diff(pg_current_wal_lsn(), rps.restart_lsn) AS replication_slot_lag,
    pg_wal_lsn_diff(pg_current_wal_lsn(), rps.confirmed_flush_lsn) AS confirmed_lag,
    rps.active_pid
FROM pg_replication_slots rps;

Remove the inactive replication slot:

SELECT pg_drop_replication_slot('<slot_name>');

The only other time I had this issue was because of an incomplete VACUUM FULL on a large table, which left orphaned files that took up a lot of space.

Kind regards,

Dustin Jantz

djantz@frontporch.com

Re: RDS No free space

From: Wells Oliver
Date: May 21, 2023, 21:14:43

Dustin, how did you come to see the incomplete vacuum and the orphaned files?

Wells Oliver
wells.oliver@gmail.com

Re: RDS No free space

From: Wells Oliver
Date: May 21, 2023, 21:20:30

We've dug a little further and we feel somewhat confident that it's retaining WAL forever. Our free storage started to drop steadily/precipitously when we enabled WAL. Could someone point me to some settings/queries to dig a bit on that?
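As a starting point for that kind of digging, here is a hedged sketch that checks how much WAL is currently on disk and which retention-related settings are in effect. It assumes PostgreSQL 10 or later and a role allowed to call `pg_ls_waldir()` (superuser or a member of `pg_monitor`); whether a given RDS role qualifies is an assumption to verify. The settings list covers both older and newer parameter names, and the query simply returns whichever exist on the running version.

```sql
-- How much WAL is sitting in pg_wal right now.
SELECT count(*)                  AS wal_files,
       pg_size_pretty(sum(size)) AS wal_size
FROM pg_ls_waldir();

-- Settings that influence how much WAL is kept around.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('wal_level',
               'max_wal_size',
               'wal_keep_segments',       -- PostgreSQL 12 and earlier
               'wal_keep_size',           -- PostgreSQL 13 and later
               'max_slot_wal_keep_size',  -- PostgreSQL 13 and later
               'max_replication_slots',
               'archive_mode');
```

A WAL directory far larger than `max_wal_size` usually means something is pinning old segments, most often a replication slot or a failing archiver. On PostgreSQL 13 and later, `max_slot_wal_keep_size` can cap how much WAL a slot is allowed to retain.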

Wells Oliver
wells.oliver@gmail.com

Re: RDS No free space

From: Wells Oliver
Date: May 21, 2023, 21:42:38

Hi all, we've identified the issue. We had a very laggy replication instance and a high volume of data retained by our replication slots. I'm not sure what the underlying PG number is here, but the RDS metric is "OldestReplicationSlotLag", and it was behind by terabytes (as the replication instance had crashed).
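A rough PostgreSQL-side approximation of that number, offered as a sketch: the slot whose `restart_lsn` is furthest behind the current WAL position is the one holding back the most WAL. That this matches exactly how RDS computes "OldestReplicationSlotLag" is an assumption, not something confirmed in this thread. The query must run on the primary, since `pg_current_wal_lsn()` is not available during recovery.

```sql
-- The slot retaining the most WAL, i.e. the one furthest behind.
SELECT slot_name,
       slot_type,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC NULLS LAST
LIMIT 1;
```

An inactive slot at the top of this list with a large `retained_wal` is the pattern described above: the consumer is gone, but the primary keeps every WAL segment it would need.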

I appreciate the info and ideas here. Very grateful for this mailing list.

Wells Oliver
wells.oliver@gmail.com

Re: RDS No free space

From: Ron
Date: May 21, 2023, 22:57:43

On 5/21/23 12:52, Lucio Chiessi wrote:

> You can check whether you have some heavy queries that were generating a lot of temp files; those can eat up all of the free space.

That was my first thought.

--
Born in Arizona, moved to Babylonia.

Re: RDS No free space

From: Wilson Coelho
Date: May 22, 2023, 15:49:11

Hi Oliver

You need to check where the Postgres data directory is and verify that there is enough available space on it for the operation you are trying to perform on the database.
It seems to be a problem related to the partition scheme and not the DB per se.
Also take a look at the distribution of your data over the partitions. If that file is on a partition with no space, try to resize it or move it to another partition.

Regards

Wilson Coelho

---

Wilson Moraes Coelho
Especialista

Tecnisys

Sia Trecho 08, lotes 245 / 255 / 265 ||

Tel.:+55 (61) 3039-9700 - (61) 99989-8932

71205-080 || Guará || Brasília, DF 0800-6020097

www.tecnisys.com.br

Re: RDS No free space

From: Priancka Chatz
Date: May 22, 2023, 15:54:10

Hi Oliver,

I think this is good guidance for RDS Postgres space issues: https://repost.aws/knowledge-center/diskfull-error-rds-postgresql

Regards,

Priyanka

Re: RDS No free space

From: Jeff Janes
Date: May 22, 2023, 22:07:11

On Sun, May 21, 2023 at 1:15 PM Wells Oliver <wells.oliver@gmail.com> wrote:

> I am seeing a lot of this this morning:
>
> ERROR: could not extend file "base/16411/616989679.62": No space left on device
>
> 2023-05-21 17:11:45 UTC::@:[386]:LOG: all server processes terminated; reinitializing
>
> 2023-05-21 17:04:02 UTC:172.31.21.22(55060):woliver@db:[31430]:ERROR: could not extend file "base/16411/616989585.45": No space left on device

The 'all server processes terminated' message indicates that there has been a PANIC, not merely an ERROR. Was there also an earlier PANIC message?

> Etc.
>
> However, when I can connect, the DB is well below the provisioned space.

Do objects with those relfilenodes exist? What statements were executing when the errors were thrown? (The statement text should be included in the error message.)
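A minimal sketch of how the relfilenode question could be checked, using the numbers from the error messages above. In a path like `base/16411/616989679.62`, `16411` is the database OID, `616989679` is the relfilenode, and `.62` is the segment number within that relation. Passing `0` as the tablespace to `pg_filenode_relation()` means the database's default tablespace, which is assumed here because the path starts with `base/`.

```sql
-- Which database has OID 16411?
SELECT datname FROM pg_database WHERE oid = 16411;

-- Run while connected to that database: map each filenode back to a relation.
-- A NULL result means no relation currently uses that filenode.
SELECT pg_filenode_relation(0, 616989679) AS relation_for_616989679,
       pg_filenode_relation(0, 616989585) AS relation_for_616989585;
```

A NULL here would be consistent with files belonging to an operation that never committed, such as an interrupted table rewrite, though it does not prove it.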

Cheers,

Jeff
