Re: [SQL] Questions about vacuum analyze
От | Bruce Momjian |
---|---|
Тема | Re: [SQL] Questions about vacuum analyze |
Дата | |
Msg-id | 199910110312.XAA06290@candle.pha.pa.us обсуждение исходный текст |
Ответ на | Re: [SQL] Questions about vacuum analyze (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: [SQL] Questions about vacuum analyze
|
Список | pgsql-sql |
> A further thought: 40M rows could well be more than 4gig of data. > Although Postgres supports tables that large (by segmenting them into > 1gig-apiece files to avoid OS limitations on file size), we currently > do not cope with temporary sort files that exceed 4 gig, which means > a "select distinct" on that much data will fail regardless of whether > you have enough free disk space :-(. We have a TODO item to fix this. > > After looking at psort.c, I see a bunch of other shortcomings, one being > that it's using a polyphase merge algorithm --- which was hot stuff back > when people did this sort of thing on magnetic tape drives, but it's > less than appropriate for sorting on disk. The worst problem with it is > that the space consumption is about 4x the actual data volume, which is > not too cool when you're talking about a huge file to begin with... > > Meanwhile, there is a completely separate implementation of external > sorting over in nbtsort.c (where it's only used for CREATE INDEX, > AFAICS), with a somewhat different set of deficiencies. > > I think I'll add "rewrite sort code from the ground up" to my to-do > list ;-) I have on TODO: Make index creation use psort code, because it is now faster(Vadim) I didn't know sorting algorithms for tape and disk had different optimizations. I figured the paging in of disk blocks had a similar penalty to tape rewinding. None of us really knows a lot about the best algorithm for that job. Nice you recognized it. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
В списке pgsql-sql по дате отправления: