Re: Merge algorithms for large numbers of "tapes"
| From | Zeugswetter Andreas DCP SD |
|---|---|
| Subject | Re: Merge algorithms for large numbers of "tapes" |
| Date | |
| Msg-id | E1539E0ED7043848906A8FF995BDA579D99381@m0143.s-mxs.net |
| In reply to | Merge algorithms for large numbers of "tapes" (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: Merge algorithms for large numbers of "tapes" |
| List | pgsql-hackers |
> Two pass will create the count of subfiles proportional to:
> Subfile_count = original_stream_size/sort_memory_buffer_size
>
> The merge pass requires (sizeof record * subfile_count) memory.

That is true from an algorithmic perspective, but to make the merge
efficient you also need enough RAM to cache a reasonably large block
per subfile. Otherwise you would have to reread the same page/block
from a subfile multiple times. (If you had one disk per subfile you
could also rely on the disk's own cache, but I think we can rule that
out.)

> Example:
> You have a 7 gigabyte table to sort and you have 100 MB sort buffer.
> The number of subfiles will be:
> 7000000000 / 100000000 = 70 files

To be efficient you need (70 + 1) * max(record_size, 256k) = ~18 MB,
plus one small structure per subfile that points to the current record
in that subfile's buffer.

Andreas
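For concreteness, here is a minimal C sketch of the arithmetic above. It is not PostgreSQL's tuplesort code; the constant RUN_BUFFER_SIZE, the RunCursor struct, and merge_pass_memory() are illustrative assumptions that mirror the 256 kB-per-run read buffer and the per-subfile cursor described in the mail.

```c
#include <stdio.h>

/* Per-run read buffer size assumed in the estimate above
 * (an illustrative assumption, not a PostgreSQL constant). */
#define RUN_BUFFER_SIZE (256 * 1024)

/* One of these per subfile: the small per-run structure the mail refers
 * to, pointing at the current record inside that run's read buffer.
 * The layout is illustrative, not taken from tuplesort.c. */
typedef struct RunCursor
{
    FILE   *file;           /* the run (subfile) on disk */
    char   *buffer;         /* RUN_BUFFER_SIZE read buffer for this run */
    size_t  bytes_in_buf;   /* bytes currently loaded into the buffer */
    size_t  cur_offset;     /* offset of the current record in the buffer */
} RunCursor;

/* Memory needed to merge all runs in a single pass:
 * one read buffer per run plus one output buffer. */
static double
merge_pass_memory(double table_bytes, double sort_mem_bytes)
{
    double run_count = table_bytes / sort_mem_bytes;

    return (run_count + 1.0) * RUN_BUFFER_SIZE;
}

int
main(void)
{
    /* The example from the mail: 7 GB table, 100 MB sort buffer. */
    double runs = 7e9 / 1e8;                    /* 70 runs */
    double mem  = merge_pass_memory(7e9, 1e8);  /* ~18 MB */

    printf("runs = %.0f, merge-pass memory ~ %.1f MB\n", runs, mem / 1e6);
    return 0;
}
```

Compiled and run, this reports 70 runs and roughly 18.6 MB of merge-pass memory (71 buffers of 256 kB each), matching the estimate above.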