Thread: RDS No free space

RDS No free space

From: Wells Oliver
Date:
I am seeing a lot of this this morning:

ERROR: could not extend file "base/16411/616989679.62": No space left on device
2023-05-21 17:11:45 UTC::@:[386]:LOG: all server processes terminated; reinitializing

2023-05-21 17:04:02 UTC:172.31.21.22(55060):woliver@db:[31430]:ERROR: could not extend file "base/16411/616989585.45": No space left on device

Etc

However, when I can connect, the DB is well below the provisioned space.

I am trying to figure this out but not sure where to turn yet.

--

Re: RDS No free space

From: Tom Lane
Date:
Wells Oliver <wells.oliver@gmail.com> writes:
> I am seeing a lot of this morning:
> ERROR: could not extend file "base/16411/616989679.62": No space left on
> devi2023-05-21 17:11:45 UTC::@:[386]:LOG: all server processes terminated;
> reinitializing

*Something* is preventing PG from using more disk space.  Check ulimit,
cgroups, container limits, etc.

            regards, tom lane



Re: RDS No free space

From: Wells Oliver
Date:
So we run on RDS, and we clearly used up all of our provisioned storage. However, I am baffled, and while I am emailing our AWS support, I wondered if this list might point me in some direction too.

Our provisioned storage was 15TB. The size of our database -- shown in pg_database -- is only 6TB. What in the world could be using that remaining space? I am at a loss, that's a _ton_ of space being used up. Is it some temporary allocation during script execution (seems ginormous, impossible)? Is it some WAL log thing?

I know this is part AWS RDS and part PG, but it's difficult to track down. Any ideas on where to dig are welcome.
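
One way to narrow down the gap between provisioned storage and database size is to compare the per-database sizes with the size of the WAL directory, since pg_database_size() counts only a database's data files and not WAL. The following is a minimal sketch, not a definitive diagnostic. It assumes PostgreSQL 10 or later and a role allowed to call pg_ls_waldir() (by default superusers and members of pg_monitor); whether a given RDS role qualifies is an assumption to verify.

```sql
-- Size of each database as PostgreSQL accounts for it (data files only, no WAL).
-- The privilege filter skips databases the current role cannot connect to.
SELECT datname,
       pg_size_pretty(pg_database_size(oid)) AS db_size
FROM pg_database
WHERE has_database_privilege(datname, 'CONNECT')
ORDER BY pg_database_size(oid) DESC;

-- Size of the WAL directory, which is not part of any database's size.
SELECT count(*) AS wal_files,
       pg_size_pretty(sum(size)) AS wal_size
FROM pg_ls_waldir();
```

If the second number accounts for most of the missing space, WAL retention is the place to look.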


--

Re: RDS No free space

From: Jim Mlodgenski
Date:


On Sun, May 21, 2023 at 1:38 PM Wells Oliver <wells.oliver@gmail.com> wrote:
> Our provisioned storage was 15TB. The size of our database -- shown in
> pg_database -- is only 6TB. What in the world could be using that
> remaining space? [...] Is it some WAL log thing?

A fairly common cause of this is orphaned replication slots, which cause WAL files to be retained. Check whether there is an inactive slot that may be preventing the files from being removed.

Re: RDS No free space

From: Wells Oliver
Date:
I'm not too familiar with that. Can you point me in the direction of some config settings and maybe queries to execute?

--

Re: RDS No free space

From: Lucio Chiessi
Date:
You can check whether you have some heavy queries generating a lot of temp files; those can eat up all of the free space.
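
A rough way to check the temp-file theory is to look at the cumulative counters in pg_stat_database. This is a sketch under one caveat: the counters show how much temp data has been written since the last statistics reset, not how much disk is in use right now, so they point at spikes rather than at steady growth.

```sql
-- Cumulative temp-file usage per database since the statistics were last reset.
SELECT datname,
       temp_files,
       pg_size_pretty(temp_bytes) AS temp_written,
       stats_reset
FROM pg_stat_database
WHERE datname IS NOT NULL
ORDER BY temp_bytes DESC;

-- Threshold (in kB) above which temp files are logged; -1 disables logging,
-- 0 logs every temp file.
SHOW log_temp_files;
```

With log_temp_files enabled, the server log names the statements that created the largest temp files.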

--

Lucio Chiessi
Senior Database Administrator
Trustly, Inc.
M: +55 27 996360276

Re: RDS No free space

From: Saikat Banerjee
Date:
This might help-


RE: RDS No free space

From: Dustin Jantz
Date:

To check the replication slots, run the following query. The lag is how much WAL each slot is holding on to. If a slot is not active, you can drop it, and that will free up the WAL it is retaining.

    SELECT
        rps.slot_name,
        pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS pretty_replication_slot_lag,
        pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS pretty_confirmed_lag,
        rps.active AS slot_active,
        pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS replication_slot_lag,
        pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS confirmed_lag,
        rps.active_pid
    FROM pg_replication_slots rps;

To remove an inactive replication slot:

    SELECT pg_drop_replication_slot('<slot_name>');

The only other time I had this issue, it was caused by an incomplete VACUUM FULL on a large table, which left orphaned files that took up a lot of space.

Kind regards,

Dustin Jantz
djantz@frontporch.com

 

--

Re: RDS No free space

From: Wells Oliver
Date:
Dustin, how did you come to see the incomplete vacuum and the orphaned files?

--

Re: RDS No free space

From: Wells Oliver
Date:
We've dug a little further and we feel somewhat confident that it's retaining WAL logs forever. Our free storage started to drop steadily/precipitously when we enabled WAL. Could someone point me to some settings/queries to dig a bit on that?
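
As a starting point for the settings side of that question, the parameters below are the ones that most directly govern how much WAL the server keeps. This is a sketch rather than a complete list; which names exist depends on the server version (noted in the comments), and on RDS the values would normally be changed through the DB parameter group rather than in postgresql.conf.

```sql
-- WAL retention settings; names that do not exist in this version are simply not returned.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('max_wal_size',
               'wal_keep_size',          -- PostgreSQL 13+
               'wal_keep_segments',      -- PostgreSQL 12 and earlier
               'max_slot_wal_keep_size', -- PostgreSQL 13+; -1 means slots may retain WAL without limit
               'archive_mode')
ORDER BY name;

-- PostgreSQL 13+ only: per-slot view of whether retained WAL is still within limits.
SELECT slot_name, active, wal_status, safe_wal_size
FROM pg_replication_slots;
```

A max_slot_wal_keep_size of -1 combined with an inactive slot is the combination that lets WAL grow until the disk is full.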

--

Re: RDS No free space

From: Wells Oliver
Date:
Hi all, we've identified the issue. We had a very laggy replication instance and a high volume of data held by our replication slots. I'm not sure what the underlying PG number is, but the RDS metric is "OldestReplicationSlotLag", and it was lagging by terabytes (the replication instance had crashed).

I appreciate the info and ideas here. Very grateful for this mailing list.
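
For the "underlying PG number", a reasonable approximation is the amount of WAL retained by the slot that is furthest behind. The query below is a sketch under that assumption; the exact definition of the RDS metric is AWS's and is not confirmed here. It has to run on the primary, because pg_current_wal_lsn() is not available during recovery.

```sql
-- The slot retaining the most WAL, measured from its restart_lsn to the current write position.
SELECT slot_name,
       slot_type,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots
WHERE restart_lsn IS NOT NULL
ORDER BY pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) DESC
LIMIT 1;
```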

--

Re: RDS No free space

From: Ron
Date:
On 5/21/23 12:52, Lucio Chiessi wrote:
> You can check whether you have some heavy queries generating a lot of temp files; those can eat up all of the free space.

That was my first thought.


--
Born in Arizona, moved to Babylonia.

Re: RDS No free space

From: Wilson Coelho
Date:

Hi Oliver

You need to check where the Postgres data directory is and verify that it has enough free space for the operation you are trying to run on the database.
It seems to be a problem with the partition scheme and not the DB per se.
Also take a look at how your data is distributed across the partitions. If that file is on a partition with no space, try to resize it or move it to another partition.

Regards

Wilson Coelho

---
Wilson Moraes Coelho
Specialist
Tecnisys
Sia Trecho 08, lotes 245 / 255 / 265 ||
Tel.: +55 (61) 3039-9700 - (61) 99989-8932
71205-080 || Guará || Brasília, DF 0800-6020097
www.tecnisys.com.br

--

Re: RDS No free space

From: Priancka Chatz
Date:
Hi Oliver,

I think this is good guidance for RDS Postgres space issues: https://repost.aws/knowledge-center/diskfull-error-rds-postgresql

Regards,
Priyanka

--

Re: RDS No free space

From: Jeff Janes
Date:
On Sun, May 21, 2023 at 1:15 PM Wells Oliver <wells.oliver@gmail.com> wrote:
> I am seeing a lot of this this morning:
>
> ERROR: could not extend file "base/16411/616989679.62": No space left on device
> 2023-05-21 17:11:45 UTC::@:[386]:LOG: all server processes terminated; reinitializing
>
> 2023-05-21 17:04:02 UTC:172.31.21.22(55060):woliver@db:[31430]:ERROR: could not extend file "base/16411/616989585.45": No space left on device

The 'all server processes terminated' message indicates that there has been a PANIC, not merely an ERROR.  Was there also an earlier PANIC message?

> However, when I can connect, the DB is well below the provisioned space.

Do objects with those relfilenodes exist? What statements were executing when the errors were thrown? (the statement text should be included in the error message)
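
A sketch of how the relfilenode question could be checked, using the numbers from the error messages above. In a path like base/16411/616989679.62, 16411 is the database OID, 616989679 is the filenode, and .62 is the segment number (relation files are split into 1 GB segments by default, so the relation was already past 62 GB). The second query assumes the relations live in the database's default tablespace.

```sql
-- Which database has OID 16411?
SELECT datname FROM pg_database WHERE oid = 16411;

-- While connected to that database: map the filenodes back to relations.
-- A tablespace argument of 0 means the database's default tablespace;
-- the result is NULL if no relation currently uses that filenode.
SELECT f.filenode,
       pg_filenode_relation(0, f.filenode::oid) AS relation
FROM (VALUES (616989679), (616989585)) AS f(filenode);
```

A NULL result would suggest the file belonged to something that no longer exists, such as a table rewrite that was rolled back.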

Cheers,

Jeff