Re: polyphase merge?
From | Tom Lane |
---|---|
Subject | Re: polyphase merge? |
Date | |
Msg-id | 2245.1233760675@sss.pgh.pa.us |
In reply to | Re: polyphase merge? (Greg Stark <stark@enterprisedb.com>) |
List | pgsql-hackers |
Greg Stark <stark@enterprisedb.com> writes:
> Is this basically the same as our current algorithm but without
> multiplexing the tapes onto single files? I have been wondering
> whether we multiplex the tapes any better than filesystems can lay out
> separate files actually.

The reason for the multiplexing is so that space can get re-used quickly. If each tape were represented as a separate file, there would be no way to release blocks as they're read; you could only give back the whole file after reaching end of tape. Which would at least double the amount of disk space needed to sort X amount of data. (It's actually even worse, more like 4X, though the multiplier might depend on the number of "tapes" --- I don't recall the details anymore.)

The penalty we pay is that in the later merge passes, the blocks representing a single tape aren't very well ordered.

It might be interesting to think about some compromise that wastes a little more space in order to get better sequentiality of disk access. It'd be easy to do if we were willing to accept a 2X space penalty, but I'm not sure if that would fly or not. It definitely *wasn't* acceptable to the community a few years ago when the current code was written. Disks have gotten bigger since then, but so have the problems people want to solve.

regards, tom lane
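
The "at least double" argument above can be checked with a toy model. The sketch below is not PostgreSQL's tape code; `peak_blocks` and its parameters are hypothetical names, and it assumes a single merge pass over equal-length input tapes, read round-robin, with one output block written per input block read. It only illustrates the roughly 2X figure for one pass, not the 4X figure mentioned in the message.

```python
def peak_blocks(num_tapes, blocks_per_tape, recycle):
    """Peak disk blocks in use during one merge pass (toy model).

    recycle=True  : tapes share one block pool; a block is freed as soon
                    as it has been read and can be reused for output.
    recycle=False : each tape is a separate file; its space is released
                    only when the whole tape has been read.
    """
    remaining = [blocks_per_tape] * num_tapes  # unread blocks per input tape
    in_use = num_tapes * blocks_per_tape       # all input is on disk at the start
    peak = in_use
    while any(remaining):
        for t in range(num_tapes):
            if remaining[t] == 0:
                continue
            remaining[t] -= 1                  # read one block from tape t
            if recycle:
                in_use -= 1                    # block goes back to the free list
            elif remaining[t] == 0:
                in_use -= blocks_per_tape      # whole file released at end of tape
            in_use += 1                        # write one merged block to the output
            peak = max(peak, in_use)
    return peak


if __name__ == "__main__":
    data = 4 * 100
    print("input blocks:      ", data)
    print("peak, shared pool: ", peak_blocks(4, 100, recycle=True))
    print("peak, separate files:", peak_blocks(4, 100, recycle=False))
```

Under these assumptions the shared pool never needs more blocks than the data itself, while the separate-file layout peaks just short of twice the data size, because no input file can be given back until its last block has been read.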