Discussion: Archiving solutions
Dears,
I would appreciate your support. I have a 2.4 TB database and need to implement an archiving solution to control data growth.
Thanks
Hi,
Plan to use table partitioning and do regular housekeeping of the cluster.
Thanks,
Imran
COPY exists, and is very fast. In my experience, DELETE is quite fast when you:
1. have a supporting index, and
2. don't try to delete too much in one statement.
Those are the only PG commands that I needed to archive a 6 TB DB down to 2 TB.
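The COPY-then-batched-DELETE pattern described above might be sketched as follows. All names here are hypothetical: an `events` table with an index on `created_at`, archiving rows older than two years.

```sql
-- 1. Copy the old rows out. Server-side COPY writes to a path the
--    server process can reach; use \copy in psql for a client-side file.
COPY (SELECT * FROM events
      WHERE created_at < now() - interval '2 years')
TO '/archive/events_old.csv' WITH (FORMAT csv, HEADER);

-- 2. Delete in bounded batches so each transaction stays short and
--    autovacuum can keep up. Re-run until it reports DELETE 0.
WITH batch AS (
    SELECT ctid FROM events
    WHERE created_at < now() - interval '2 years'  -- uses the index
    LIMIT 10000
)
DELETE FROM events e
USING batch
WHERE e.ctid = batch.ctid;
```

The `LIMIT`-in-a-CTE trick is one common way to bound a DELETE, since PostgreSQL's DELETE has no LIMIT clause of its own; batch size is workload-dependent.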
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
What are regular housekeeping activities @Imran?
Hi Rajesh,
Housekeeping activities in PostgreSQL are essential to ensure the database remains high performing, secure, and reliable over time. Below is a categorized list of key housekeeping tasks:
---
1. Routine Maintenance Tasks
Vacuuming:
Use VACUUM to reclaim storage and update visibility maps.
Run VACUUM FULL for aggressive storage reclamation (it rewrites the table under an ACCESS EXCLUSIVE lock, so plan for downtime on that table).
Analyze Statistics:
Run ANALYZE to update query planner statistics for efficient query execution.
Use autovacuum for automatic vacuum and analyze.
Reindexing:
Use REINDEX to rebuild corrupted or bloated indexes.
Schedule periodic reindexing for heavily updated tables.
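The routine maintenance commands above might look like this in practice; `orders` and its index name are purely illustrative:

```sql
-- Reclaim dead tuples and refresh planner statistics in one pass:
VACUUM (VERBOSE, ANALYZE) orders;

-- Rebuild a bloated index without blocking writes
-- (REINDEX ... CONCURRENTLY requires PostgreSQL 12 or later):
REINDEX INDEX CONCURRENTLY orders_created_at_idx;
```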
---
2. Backup and Recovery
Regular Backups:
Use pg_basebackup or pg_dump for full backups.
Implement incremental backups if using third-party tools (e.g., pgBackRest, Barman).
Test Recovery:
Periodically restore backups in a test environment to ensure recovery readiness.
---
3. Monitoring and Logging
Review Logs:
Check PostgreSQL logs for errors, slow queries, and unusual activities.
Adjust log_min_duration_statement to capture slow queries.
Monitor Performance Metrics:
Use tools like pg_stat_activity, pg_stat_replication, and pg_stat_progress_vacuum.
Implement monitoring solutions (e.g., pgAdmin, Prometheus/Grafana, pgwatch2).
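As a starting point for ad-hoc monitoring, a query like the following (a sketch, with an arbitrary 5-minute threshold) lists long-running sessions from pg_stat_activity:

```sql
-- Non-idle sessions running longer than 5 minutes, oldest first:
SELECT pid,
       now() - query_start AS runtime,
       state,
       left(query, 80) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND query_start < now() - interval '5 minutes'
ORDER BY query_start;
```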
---
4. Database Optimization
Index Maintenance:
Identify unused or bloated indexes using pg_stat_user_indexes.
Drop unused indexes and optimize queries.
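A query along these lines surfaces never-scanned indexes from pg_stat_user_indexes. Treat the output as candidates for review, not automatic removal: statistics reset on restart, and an index may serve a replica or a rarely-run report.

```sql
-- Indexes with zero scans since statistics were last reset,
-- largest first:
SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```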
Query Optimization:
Use EXPLAIN or EXPLAIN ANALYZE to analyze query performance.
Optimize slow queries by rewriting or indexing.
Partitioning:
Use table partitioning for large datasets to improve performance and manageability.
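For the archiving use case in this thread, declarative range partitioning is the usual fit. A minimal sketch, with a hypothetical `measurements` table partitioned by month:

```sql
CREATE TABLE measurements (
    id         bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

CREATE TABLE measurements_2025_01 PARTITION OF measurements
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');

-- Archiving a month then becomes a fast metadata operation: detach
-- the partition, then dump and drop it, instead of DELETEing rows.
ALTER TABLE measurements DETACH PARTITION measurements_2025_01;
```

Partition creation is typically automated with a scheduled job or an extension such as pg_partman.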
---
5. Configuration Management
Parameter Tuning:
Adjust key parameters like work_mem, maintenance_work_mem, effective_cache_size, and max_connections.
Regularly review and tune configuration files (postgresql.conf and pg_hba.conf).
Autovacuum Settings:
Ensure autovacuum is enabled and tuned for your workload.
Monitor autovacuum activity via pg_stat_user_tables (last_autovacuum, autovacuum_count) and pg_stat_progress_vacuum.
---
6. Security and User Management
User Roles and Permissions:
Regularly review roles and privileges.
Remove unused accounts and enforce strong passwords.
SSL/TLS Management:
Enable SSL for secure connections.
Rotate certificates periodically.
Audit Logs:
Use extensions like pgAudit for detailed logging of user activities.
---
7. Archiving and WAL Management
WAL Archiving:
Enable archive_mode and configure archive_command for point-in-time recovery (PITR).
Cleanup Old WAL Files:
Ensure old WAL files are deleted or archived properly to free up storage.
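A minimal WAL-archiving configuration might look like the following; the archive path is a placeholder, and production setups more often delegate archive_command to a tool like pgBackRest or Barman.

```sql
-- archive_mode requires a server restart to take effect;
-- archive_command only needs a reload.
ALTER SYSTEM SET archive_mode = on;
ALTER SYSTEM SET archive_command =
    'test ! -f /wal_archive/%f && cp %p /wal_archive/%f';
SELECT pg_reload_conf();
```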
---
8. Database Growth Management
Table and Index Bloat Analysis:
Use pgstattuple to measure bloat and pg_repack to remove it without holding a long exclusive lock.
Disk Usage Monitoring:
Regularly monitor disk usage with pg_database_size() and pg_relation_size().
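Those size functions can be combined into a quick report, for example:

```sql
-- Whole-database size:
SELECT pg_size_pretty(pg_database_size(current_database()));

-- Ten largest tables, indexes, and materialized views:
SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relkind IN ('r', 'i', 'm')
ORDER BY pg_total_relation_size(oid) DESC
LIMIT 10;
```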
---
9. Updates and Patches
Apply Updates:
Keep PostgreSQL updated with the latest patches for bug fixes and security.
Extension Updates:
Update installed extensions (e.g., PostGIS, pglogical) to their latest versions.
---
10. Miscellaneous Tasks
Replication Monitoring:
Monitor replication lag and the health of replicas using pg_stat_replication.
System Resource Monitoring:
Monitor CPU, memory, and I/O usage to ensure the system is not bottlenecked.
Disk Defragmentation:
Perform disk defragmentation (OS level) if necessary.
---
These activities, when performed periodically, will help keep your PostgreSQL database healthy and performing optimally. Automating some of these tasks via scripts or tools can further reduce manual intervention.
Dear Imran,
Thanks for your support.
Dears,
I would appreciate your support with a simple solution to implement this.
Thanks
On Sunday, January 12, 2025, 11:23 PM, Imran Khan <imran.k.23@gmail.com> wrote:
Hi,

How is the data stored? Do you have historical data? If yes, you can keep it in archive tables, or, if the business still needs it, store it in monthly partitions. You need to design the data storage with proper planning.

Also, if you have dynamic, ever-changing data with date columns, store it in partitions, retain the last 3 or 6 months, then move older rows to archive tables or purge them if they are not required. Keep track of indexes, and capture long-running SQL statements through the pg_stat_statements extension; run EXPLAIN on them and create the proper indexes based on that plan's output. Also watch memory-related parameters to check whether any memory leak is causing excessive swap usage at the OS level.

That's my opinion, but we have more experts here who can help us understand more.

Thanks,
Imran
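The pg_stat_statements workflow Imran mentions might be sketched like this. The extension must first be listed in shared_preload_libraries (which requires a restart); the column names shown are for PostgreSQL 13+ (older versions use total_time/mean_time).

```sql
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top ten statements by cumulative execution time:
SELECT left(query, 80) AS query,
       calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       round(mean_exec_time::numeric, 1)  AS mean_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```

The worst offenders from this list are then candidates for `EXPLAIN (ANALYZE, BUFFERS)` to decide which indexes to add.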
Thank you Imran
On Tue, 14 Jan 2025, 04:38 kasem adel, <kasemadel8@gmail.com> wrote: