Re: database size growing continously
От | Jeremy Harris |
---|---|
Тема | Re: database size growing continously |
Дата | |
Msg-id | 4AEB4A25.7080701@wizmail.org обсуждение исходный текст |
Ответ на | Re: database size growing continously (Greg Stark <gsstark@mit.edu>) |
Ответы |
Re: database size growing continously
|
Список | pgsql-performance |
On 10/30/2009 08:01 PM, Greg Stark wrote: > On Fri, Oct 30, 2009 at 12:53 PM, Anj Adu<fotographs@gmail.com> wrote: >> Any relational database worth its salt has partitioning for a reason. >> >> 1. Maintenance. You will need to delete data at some >> point.(cleanup)...Partitions are the only way to do it effectively. > > This is true and it's unavoidably a manual process. The database will > not know what segments of the data you intend to load and unload en > masse. > >> 2. Performance. Partitioning offer a way to query smaller slices of >> data automatically (i.e the query optimizer will choose the partition >> for you) ...very large tables are a no-no in any relational >> database.(sheer size has limitations) > > This I dispute. Databases are designed to be scalable and very large > tables should perform just as well as smaller tables. > > Where partitions win for performance is when you know something about > how your data is accessed and you can optimize the access by > partitioning along the same keys. For example if you're doing a > sequential scan of just one partition or doing a merge join of two > equivalently partitioned tables and the partitions can be sorted in > memory. > > However in these cases it is possible the database will become more > intelligent and be able to achieve the same performance gains > automatically. Bitmap index scans should perform comparably to the > sequential scan of individual partitions for example. > So, on the becoming more intelligent front: PostgreSQL already does some operations as background maintenance (autovacuum). Extending this to de-bloat indices does not seem conceptually impossible, nor for the collection of table-data statistics for planner guidance (also, why could a full-table-scan not collect stats as a side-effect?). Further out, how about the gathering of statistics on queries to guide the automatic creation of indices? Or to set up a partitioning scheme on a previously monolithic table? - Jeremy
В списке pgsql-performance по дате отправления: