Thread: storage calculations
i've been browsing the documentation online, and i can't find where the
storage layouts for tables and indexes are explained. i'd like to be able
to figure out, given a table or two with a simple schema (no inheritance)
and some indexes, and a number of rows currently in these tables, how much
disk space is being used.

the underlying problem being worked here, by the way, is to come up with a
task that monitors space used by the database so as to issue a warning of
some sort when the partition upon which the database resides is getting
full. i had originally used the statfs output from the (unix) os, but for
this to accurately reflect the "used" space, i found i had to do a "vacuum
full" to compress the database.

i'd rather come up with a heuristic where i can take a number of rows
(which i get from the reltuples column in pg_class, which seems to be
accurate after running 'analyze') and multiply by some number i have
calculated based on the table schema to get a fairly accurate guess at the
real space used.

any and all advice welcome, and thanks.

pg
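The rows-times-bytes heuristic asked for above can be sketched in Python. The overhead constants (page header, line pointer, tuple header) are assumptions roughly in line with what the PostgreSQL FAQ of that era describes, not exact values for any particular version, and TOAST and free space are ignored entirely:

```python
# Sketch: estimate the on-disk size of a freshly loaded table from its row
# count and the data bytes per row. All overhead constants below are
# assumptions to be checked against your PostgreSQL version's docs.

BLOCK_SIZE = 8192     # default page size (BLCKSZ)
PAGE_HEADER = 24      # per-page header, approximate
ITEM_POINTER = 4      # line pointer per tuple
TUPLE_HEADER = 28     # heap tuple header, approximate

def estimate_table_bytes(n_rows, data_bytes_per_row):
    """Rough size of a table with no dead rows, ignoring TOAST and indexes."""
    row = TUPLE_HEADER + data_bytes_per_row + ITEM_POINTER
    rows_per_page = (BLOCK_SIZE - PAGE_HEADER) // row
    pages = -(-n_rows // rows_per_page)   # ceiling division
    return pages * BLOCK_SIZE
```

For example, `estimate_table_bytes(100000, 40)` packs 113 such rows per page and predicts 885 pages. Indexes would need a similar per-entry estimate of their own.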
<pgboy@guthrie.charm.net> writes:
> i've been browsing the documentation online, and i can't
> find where the storage layouts for tables and indexes are
> explained.
>
> i'd like to be able to figure out, given a table or two
> with a simple schema (no inheritance) and some indexes, and
> a number of rows currently in these tables, how much disk
> space is being used.

oid2name | grep <your_db_name>

gives you the name of the directory where the database is stored inside
your $PGDATA directory. To know the name of the file that corresponds to
your table:

oid2name -d <your_db_name> -t <your_table_name>

Regards
Gaetano Mendola
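Once oid2name has identified the database directory and the table's file, the on-disk footprint is the base file plus any 1 GB segment files (`<filenode>.1`, `<filenode>.2`, ...). A minimal sketch, with the directory and filenode as hypothetical inputs:

```python
# Sketch: total bytes on disk for one relation, summing its segment files.
# db_dir would be $PGDATA/base/<db_oid>; filenode comes from oid2name.
import os

def relation_bytes(db_dir, filenode):
    """Sum the base file and any numbered 1 GB segment files."""
    total, seg = 0, 0
    path = os.path.join(db_dir, str(filenode))
    while os.path.exists(path):
        total += os.path.getsize(path)
        seg += 1
        path = os.path.join(db_dir, "%s.%d" % (filenode, seg))
    return total
```

As the rest of the thread notes, this measures the file size, not the live data in it.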
sorry, when i referred to "disk space used" i meant the actual amount used
by live rows. if i insert 1000 rows then delete 999 of them, the disk file
will stay the size of the 1000-row container until i do a "vacuum full" -
which takes a table lock, which is a bad thing.

given that, i'd like to try to be able to calculate the number of bytes a
row uses given its schema. i've seen this kind of documentation for other
dbs, i just cannot seem to find it in the postgresql docs.

thanks.

pg
pgboy@guthrie.charm.net writes:
> sorry, when i referred to "disk space used" i meant the actual amount
> used by live rows. if i insert 1000 rows then delete 999 of them, the
> disk file will stay the size of the 1000-row container until i do a
> "vacuum full" - which takes a table lock, which is a bad thing.

The task that you originally described was that you want to monitor when
the disk is getting full. For that task, you need to take into account the
actual size of the data on disk, not the size after a "vacuum full" which
you say you do not want to execute. Basing a disk-full monitor on
hypothetical sizes sounds pretty useless.

> given that, i'd like to try to be able to calculate the number of
> bytes a row uses given its schema. i've seen this kind of
> documentation for other dbs, i just cannot seem to find it
> in the postgresql docs.

There is some information about that in the FAQ, but keep in mind that
rows may be compressed or moved to secondary storage automatically.

--
Peter Eisentraut   peter_e@gmx.net
On Thu, 31 Jul 2003, Peter Eisentraut wrote:

> pgboy@guthrie.charm.net writes:
> > sorry, when i referred to "disk space used" i meant the actual amount
> > used by live rows. if i insert 1000 rows then delete 999 of them, the
> > disk file will stay the size of the 1000-row container until i do a
> > "vacuum full" - which takes a table lock, which is a bad thing.
>
> The task that you originally described was that you want to monitor when
> the disk is getting full. For that task, you need to take into account
> the actual size of the data on disk, not the size after a "vacuum full"
> which you say you do not want to execute. Basing a disk-full monitor on
> hypothetical sizes sounds pretty useless.
>
> > given that, i'd like to try to be able to calculate the number of
> > bytes a row uses given its schema. i've seen this kind of
> > documentation for other dbs, i just cannot seem to find it
> > in the postgresql docs.
>
> There is some information about that in the FAQ, but keep in mind that
> rows may be compressed or moved to secondary storage automatically.

well, i can admit that i am confused. my assumption is that when a row is
deleted in pg, that row is merely marked as 'gone' until a vacuum full is
done. my further assumption is that if i continue to add rows, those rows
are not necessarily appended to the end of the physical data file, but can
be written over 'deleted' rows.

given that, a vacuum full is the only way i know of to get an accurate
reflection of the number of bytes being used to store the data. without
the vacuum, i can tell how big a potentially sparse file is, but i don't
want to raise a warning just because the file size is getting large
(unless, of course, pg does not overwrite deleted rows, in which case the
warning, or a vacuum full, seems appropriate).

i think i agree with you, too, that i cannot really calculate a
hypothetical size unless i have all fixed-size fields. in that case, i
should be able to accurately calculate the size, yes? if not, what are the
variables i could not account for?

uh, any more info on your comment "rows may be compressed or moved to
secondary storage automatically"? i'd *love* to know how to do that.

thanks.

pgboy
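For the all-fixed-width case, the data portion of a row can be computed exactly if alignment padding is included. A sketch with sizes and alignments for a few built-in types (common values, worth verifying against pg_type on your version); variable-width columns, the NULL bitmap, and TOAST compression/out-of-line storage are exactly the variables this cannot account for:

```python
# Sketch: data bytes of a row made only of fixed-width columns, including
# alignment padding. The (size, alignment) pairs are assumed typical values.
TYPE_INFO = {
    "int2": (2, 2), "int4": (4, 4), "int8": (8, 8),
    "float4": (4, 4), "float8": (8, 8),
    "bool": (1, 1), "timestamp": (8, 8),
}

def row_data_bytes(column_types):
    """Walk the columns in declared order, padding each to its alignment."""
    off = 0
    for t in column_types:
        size, align = TYPE_INFO[t]
        off = (off + align - 1) // align * align  # pad to alignment boundary
        off += size
    return off
```

For example, a (bool, int4, int2, int8) row comes to 24 data bytes, not 15: padding before the int4 and int8 accounts for the difference. This is also why column order can change row size.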
On Thu, Jul 31, 2003 at 08:51:09AM -0400, pgboy@guthrie.charm.net wrote:
>
> well, i can admit that i am confused. my assumption is that when a row
> is deleted in pg, that row is merely marked as 'gone' until a vacuum
> full is done. my further assumption is that if i continue to add rows,
> those rows are not necessarily appended to the end of the physical data
> file, but can be written over 'deleted' rows.

Your assumption is wrong. VACUUM without FULL will allow you to write over
the free space now available in your data files, subject to the
limitations of tracking as determined by your FSM settings.

VACUUM FULL actually shortens the data file. Except for cases having to do
with large tuples that won't fit in the previous page, VACUUM FULL means
that you can't fit any more data in that file, once the VACUUM FULL is
done.

A

--
----
Andrew Sullivan                          204-4141 Yonge Street
Liberty RMS                            Toronto, Ontario Canada
<andrew@libertyrms.info>                               M2P 2A8
                                          +1 416 646 3304 x110
On Thu, 31 Jul 2003, Andrew Sullivan wrote:

> Your assumption is wrong. VACUUM without FULL will allow you to write
> over the free space now available in your data files, subject to the
> limitations of tracking as determined by your FSM settings.
>
> VACUUM FULL actually shortens the data file. Except for cases having to
> do with large tuples that won't fit in the previous page, VACUUM FULL
> means that you can't fit any more data in that file, once the VACUUM
> FULL is done.

well, i was close. i didn't realize that i had to do a (non-full) vacuum
to mark deleted space as free. but after that, i am still left with a
potentially sparse file, and i don't really have a way to guess how much
free space is available until i do a full vacuum, correct? (at which time
the file size(s) will indicate the amount being used)

just started looking at fsm. thanks for the pointer there. i hope i'm not
acting too obtuse here.

thanks.

pg
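The monitoring heuristic the thread keeps circling — compare what is actually on disk with an estimate of the live data — can be sketched directly from pg_class statistics. Here relpages and reltuples are assumed to come from pg_class after an ANALYZE, and est_row_bytes is the schema-derived per-row guess discussed earlier:

```python
# Sketch: approximate reclaimable ("dead") fraction of a table's files
# without running VACUUM FULL. relpages * BLOCK_SIZE is the actual footprint;
# reltuples * est_row_bytes approximates the live data.
BLOCK_SIZE = 8192

def dead_space_fraction(relpages, reltuples, est_row_bytes):
    on_disk = relpages * BLOCK_SIZE
    if on_disk == 0:
        return 0.0
    live = reltuples * est_row_bytes
    return max(0.0, 1.0 - float(live) / on_disk)
```

A monitor could then warn when the partition is filling *and* this fraction is low, since a high fraction means a plain VACUUM (plus reuse of freed space) would relieve the pressure without VACUUM FULL.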
On Thu, Jul 31, 2003 at 01:41:54PM -0400, pgboy@guthrie.charm.net wrote:
>
> well, i was close. i didn't realize that i had to do a (non-full)
> vacuum to mark deleted space as free.

Ooops, I think I was still unclear. VACUUM FULL actually re-arranges the
file, and returns it to the filesystem. Plain VACUUM does indeed leave you
with a file that is bigger than the actual data stored there. You can
learn how much more data you could fit in the files using VACUUM VERBOSE,
keeping in mind that tuples may not always fit in an already-allocated
page (like when the tuples are large).

> available until i do a full vacuum, correct? (at which time the
> file size(s) will indicate the amount being used)

That's right, yes.

> just started looking at fsm. thanks for the pointer there. i hope
> i'm not acting too obtuse here.

Nope. If you don't ask, you won't learn about it.

A

--
----
Andrew Sullivan                          204-4141 Yonge Street
Liberty RMS                            Toronto, Ontario Canada
<andrew@libertyrms.info>                               M2P 2A8
                                          +1 416 646 3304 x110
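The VACUUM VERBOSE suggestion above could feed a monitor directly if its output is parsed. The exact message wording differs between PostgreSQL versions, so the sample line and regular expression below are illustrative only, not the real output of any given release:

```python
# Sketch: pull tuple and page counts out of a VACUUM VERBOSE message.
# SAMPLE is a made-up line; adjust the regex to your server's actual output.
import re

SAMPLE = ('INFO:  "mytable": found 500 removable, '
          '1000 nonremovable row versions in 120 pages')

def parse_vacuum_line(line):
    """Return counts from one VACUUM VERBOSE line, or None if it doesn't match."""
    m = re.search(
        r'found (\d+) removable, (\d+) nonremovable row versions in (\d+) pages',
        line)
    if not m:
        return None
    removable, nonremovable, pages = map(int, m.groups())
    return {"removable": removable, "nonremovable": nonremovable,
            "pages": pages}
```

Running the server's log through such a parser after routine vacuums would give the per-table slack figures without ever needing the table lock of VACUUM FULL.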