Re: How to best use 32 15k.7 300GB drives?
От | Vitalii Tymchyshyn |
---|---|
Тема | Re: How to best use 32 15k.7 300GB drives? |
Дата | |
Msg-id | 4D4BC491.7080908@gmail.com обсуждение исходный текст |
Ответ на | Re: How to best use 32 15k.7 300GB drives? (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: How to best use 32 15k.7 300GB drives?
|
Список | pgsql-performance |
03.02.11 20:42, Robert Haas написав(ла): > 2011/1/30 Віталій Тимчишин<tivv00@gmail.com>: >> I was thinking if a table file could be deleted if it has no single live >> row. And if this could be done by vacuum. In this case vacuum on table that >> was fully updated recently could be almost as good as cluster - any scan >> would skip such non-existing files really fast. Also almost no disk space >> would be wasted. > VACUUM actually already does something along these lines. If there > are 1 or any larger number of entirely-free pages at the end of a > table, VACUUM will truncate them away. In the degenerate case where > ALL pages are entirely-free, this results in zeroing out the file. > > The problem with this is that it rarely does much. Consider a table > with 1,000,000 pages, 50% of which contain live rows. On average, how > many pages will this algorithm truncate away? Answer: if the pages > containing live rows are randomly distributed, approximately one. Yes, but take into account operations on a (by different reasons) clustered tables, like removing archived data (yes I know, this is best done with partitioning, but one must still go to a point when he will decide to use partitioning :) ). > Your idea of having a set of heaps rather than a single heap is an > interesting one, but it's pretty much catering to the very specific > case of a full-table update. I think the code changes needed would be > far too invasive to seriously contemplate doing it just for that one > case - although it is an important case that I would like to see us > improve. Why do you expect such a invasive code changes? I know little about postgresql code layering, but what I propose (with changing delete to truncate) is: 1) Leave tuple addressing as it is now 2) Allow truncated files, treating non-existing part as if it contained not used tuples 3) Make vacuum truncate file if it has not used tuples at the end. The only (relatively) tricky thing I can see is synchronizing truncation with parallel ongoing scan. Best regards, Vitalii Tymchyshyn
В списке pgsql-performance по дате отправления: