Re: Hadoop backend?
From | Hans-Jürgen Schönig |
---|---|
Subject | Re: Hadoop backend? |
Date | |
Msg-id | 136182E5-BC7E-4AEB-A2E8-4C225B2F9095@cybertec.at |
In response to | Re: Hadoop backend? (pi song <pi.songs@gmail.com>) |
List | pgsql-hackers |
hi ...
i think the easiest way to do this is to simply add a mechanism to functions which allows a function to "stream" data through.
it would basically mean losing join support, as you cannot "read data again" in a way which is good enough for joining with the function providing the data from hadoop.
hannu (I think) brought up a similar concept some time ago.
i think a straightforward implementation would not be too hard.
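to illustrate the join problem, here is a minimal Python sketch (not PostgreSQL code). the names `stream_rows` and `nested_loop_join` are hypothetical, and the generator is only a stand-in for a function streaming rows from an external store. a naive nested-loop join needs to rescan its inner side for every outer row, which a one-pass stream cannot offer:

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, str]


def stream_rows(source: Iterable[Row]) -> Iterator[Row]:
    """Hypothetical stand-in for a function that streams rows from an
    external store. The result can be consumed only once."""
    for row in source:
        yield row


def nested_loop_join(outer: Iterable[Row], inner: Iterable[Row]) -> List[Tuple[int, str, str]]:
    """Naive nested-loop join on the first field.
    It iterates over `inner` again for every outer row."""
    result = []
    for o in outer:
        for i in inner:
            if o[0] == i[0]:
                result.append((o[0], o[1], i[1]))
    return result


local_rows = [(1, "a"), (2, "b")]
remote_rows = [(1, "x"), (2, "y")]

# Streaming inner side: the generator is exhausted after the first
# outer row, so the match for key 2 is silently lost.
streamed = nested_loop_join(local_rows, stream_rows(remote_rows))

# Materializing the stream first makes it rescannable, at the cost of
# holding all remote rows locally.
materialized = nested_loop_join(local_rows, list(stream_rows(remote_rows)))

print(streamed)       # [(1, 'a', 'x')]
print(materialized)   # [(1, 'a', 'x'), (2, 'b', 'y')]
```

materializing the stream is one possible workaround in this toy model; whether it is acceptable for large remote data sets is a separate question.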
best regards,
hans
On Feb 22, 2009, at 3:37 AM, pi song wrote:
1) Hadoop file system is very optimized for mostly read operation
2) As of a few months ago, hdfs doesn't support file appending.
There might be a bit of impedance to make them go together.
However, I think it should be a very good initiative to come up with ideas to be able to run postgres on distributed file system (doesn't have to be specific hadoop).
Pi Song
On Sun, Feb 22, 2009 at 7:17 AM, Paul Sheer <paulsheer@gmail.com> wrote:
Hadoop backend for PostGreSQL....
A problem that my client has, and one that I come across often,
is that a database seems to always be associated with a particular
physical machine, a physical machine that has to be upgraded,
replaced, or otherwise maintained.
Even if the database is replicated, it just means there are two or
more machines. Replication is also a difficult thing to properly
manage.
With a distributed data store, the data would become a logical
object - no adding or removal of machines would affect the data.
This is an ideal that would remove a tremendous maintenance
burden from many sites ---- well, at least the ones I have worked
at, as far as I can see.
Does anyone know of plans to implement PostGreSQL over Hadoop?
Yahoo seems to be doing this:
http://glinden.blogspot.com/2008/05/yahoo-builds-two-petabyte-postgresql.html
But they store tables column-wise for their performance situation.
If one is doing a lot of inserts I don't think this is the most efficient layout - ?
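As a rough illustration of the insert concern, here is a toy Python sketch. The `RowStore` and `ColumnStore` classes are hypothetical and only count logical append locations touched per insert; they say nothing about how Yahoo's system or any real column store works, since real systems may batch or buffer writes.

```python
class RowStore:
    """Toy row-oriented store: one append per inserted row."""

    def __init__(self):
        self.rows = []
        self.writes = 0  # number of separate append locations touched

    def insert(self, row):
        self.rows.append(dict(row))
        self.writes += 1


class ColumnStore:
    """Toy column-oriented store: one append per column per inserted row."""

    def __init__(self, columns):
        self.columns = {name: [] for name in columns}
        self.writes = 0  # number of separate append locations touched

    def insert(self, row):
        for name, value in row.items():
            self.columns[name].append(value)
            self.writes += 1


columns = ["id", "name", "city", "amount"]
row_store = RowStore()
col_store = ColumnStore(columns)

for n in range(3):
    row = {"id": n, "name": "n%d" % n, "city": "c%d" % n, "amount": n * 10}
    row_store.insert(row)
    col_store.insert(row)

print(row_store.writes)  # 3
print(col_store.writes)  # 12
```

In this model each insert touches one location in the row layout but one per column in the column layout, which is the intuition behind the question above.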
Has Yahoo put the source code for their work online?
Many thanks for any pointers.
-paul
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt