I think the point that you can access more system cache is right, but that doesn't mean it will be more efficient than reading from your local disk. Take Hadoop for example: a request for file content first goes to the NameNode (the file-chunk indexing service), and only then to the DataNode that actually serves the data. If you're working on a large dataset, the probability that the chunk you need is already sitting in some system cache is very low, so most of the time you end up reading from a remote disk anyway.
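To make that concrete, here is a back-of-the-envelope estimate. All the numbers below (16 GB of cache per node, a 10-node cluster, a 10 TB dataset) are made up for illustration; the point is only that aggregate cache is tiny relative to a large dataset, so random chunk accesses mostly miss.

```python
# Hypothetical numbers, not measurements: if the dataset is spread evenly
# across the cluster, the chance that a randomly requested chunk is already
# cached somewhere is roughly total_cache / dataset_size.

def cache_hit_probability(cache_gb_per_node, num_nodes, dataset_tb):
    total_cache_gb = cache_gb_per_node * num_nodes
    dataset_gb = dataset_tb * 1024
    return min(1.0, total_cache_gb / dataset_gb)

p = cache_hit_probability(cache_gb_per_node=16, num_nodes=10, dataset_tb=10)
print(f"{p:.2%}")  # ~1.56% -- the other ~98% of reads go to a (remote) disk
```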
I've got a better idea: how about making the buffer pool multilevel? The first level is the current one; the second level represents memory borrowed from remote machines. Pages that are used less often would migrate to the second level. Has anyone thought about something like this before?
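A toy sketch of that two-level idea, in Python rather than backend C, with made-up names and sizes: level 1 stands in for the local buffer pool, and pages evicted from it are demoted to level 2, which here simulates memory borrowed from remote machines. A hit in level 2 promotes the page back to level 1.

```python
from collections import OrderedDict

class TwoLevelBufferPool:
    def __init__(self, l1_size, l2_size):
        self.l1 = OrderedDict()  # page id -> page data, LRU order (local)
        self.l2 = OrderedDict()  # "remote memory" level, also LRU
        self.l1_size, self.l2_size = l1_size, l2_size

    def get(self, page_id):
        if page_id in self.l1:
            self.l1.move_to_end(page_id)     # refresh LRU position
            return self.l1[page_id]
        if page_id in self.l2:               # promote hot page back to L1
            data = self.l2.pop(page_id)
            self.put(page_id, data)
            return data
        return None                          # miss: caller reads from disk

    def put(self, page_id, data):
        self.l1[page_id] = data
        self.l1.move_to_end(page_id)
        if len(self.l1) > self.l1_size:      # demote coldest L1 page to L2
            old_id, old_data = self.l1.popitem(last=False)
            self.l2[old_id] = old_data
            if len(self.l2) > self.l2_size:  # L2 full: drop its coldest page
                self.l2.popitem(last=False)
```

The interesting design question is the demotion policy: a page evicted from level 1 costs a network round trip to fetch back, but that may still beat a remote-disk read.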
On Sun, Feb 22, 2009 at 5:18 PM, pi song <pi.songs@gmail.com> wrote:
> One more problem is that data placement on HDFS is inherent, meaning you
> have no explicit control. Thus, you cannot place two sets of data which are
> likely to be joined together on the same node = uncontrollable latency
> during query processing.
>
> Pi Song
It would only be possible to have the actual PostgreSQL backends running on a single node anyway, because they use shared memory to hold lock tables and things. The advantage of a distributed file system would be that you could access more storage (and more system buffer cache) than would be possible on a single system (or perhaps the same amount but at less cost). Assuming some sort of per-tablespace control over the storage manager, you could put your most frequently accessed data locally and the less frequently accessed data into the DFS.
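A hypothetical sketch of that per-tablespace routing (none of these names or paths exist in PostgreSQL; this is just the shape of the idea): the storage manager resolves a relation's file path to local disk or a DFS mount depending on which tablespace the relation was assigned to.

```python
# Made-up tablespace-to-storage mapping: "hot" data stays on local disk,
# "cold" data lives on a distributed file system mount.
TABLESPACE_PATHS = {
    "hot":  "/var/lib/pgdata/local",  # frequently accessed
    "cold": "/mnt/dfs/pgdata",        # rarely accessed
}

def relation_path(tablespace, relfilenode):
    """Resolve a relation's on-disk path from its tablespace assignment."""
    return f"{TABLESPACE_PATHS[tablespace]}/{relfilenode}"

print(relation_path("cold", 16384))  # /mnt/dfs/pgdata/16384
```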
But you'd still have to pull all the data back to the master node to do anything with it. Being able to actually distribute the computation would be a much harder problem. Currently, we don't even have the ability to bring multiple CPUs to bear on (for example) a large sequential scan (even though all the data is on a single node).
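For illustration, this is roughly what bringing multiple workers to bear on a sequential scan would look like. Everything here is a stand-in: threads play the role of worker backends (real parallelism would need separate processes), the "table" is a list of rows, and the predicate and chunking scheme are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_chunk(chunk, predicate):
    # Each worker scans its own slice of the table.
    return [row for row in chunk if predicate(row)]

def parallel_seqscan(table, predicate, workers=4):
    size = max(1, len(table) // workers)
    chunks = [table[i:i + size] for i in range(0, len(table), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(lambda c: scan_chunk(c, predicate), chunks)
    # Gather phase: merge per-worker results, preserving chunk order.
    return [row for part in parts for row in part]

rows = list(range(100))
hits = parallel_seqscan(rows, lambda r: r % 7 == 0)
```

The hard part in a real system isn't the split-and-merge shown here; it's sharing buffers, snapshots, and locks safely across the workers.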