Re: Partitioning / Clustering
From | Alex Stapleton |
---|---|
Subject | Re: Partitioning / Clustering |
Date | |
Msg-id | 57AE769C-381F-4757-95FB-409001758AB6@advfn.com |
In response to | Re: Partitioning / Clustering (Alex Stapleton <alexs@advfn.com>) |
List | pgsql-performance |
On 11 May 2005, at 09:50, Alex Stapleton wrote:

> On 11 May 2005, at 08:57, David Roussel wrote:
>
>> For an interesting look at scalability, clustering, caching, etc. for a
>> large site have a look at how livejournal did it.
>> http://www.danga.com/words/2004_lisa/lisa04.pdf
>
> I have implemented similar systems in the past. It's a pretty good
> technique; unfortunately it's not very "Plug-and-Play", as you have
> to base most of your API on memcached (I imagine MySQL's NDB tables
> might work as well, actually) for it to work well.
>
>> They have 2.6 million active users, posting 200 new blog entries per
>> minute, plus many comments and countless page views.
>>
>> Although this system is of a different sort to the type I work on, it's
>> interesting to see how they've made it scale.
>>
>> They use mysql on dell hardware! And found single master replication did
>> not scale. There's a section on multimaster replication, not sure if
>> they use it. The main approach they use is to partition users into
>> specific database clusters. Caching is done using memcached at the
>> application level to avoid hitting the db for rendered pageviews.
>
> I don't think they are storing pre-rendered pages (or bits of them) in
> memcached, but are principally storing the data for the pages in
> it. Gluing pages together is not a hugely intensive process usually :)
> The only problem with memcached is that the client's clustering/
> partitioning system will probably break if a node dies, and
> probably get confused if you add new nodes onto it as well. Easily
> extensible clustering (no complete redistribution of data required
> when you add/remove nodes) with the data distributed across nodes
> seems to be nothing but a pipe dream right now.
>
>> It's interesting that the solution livejournal have arrived at is quite
>> similar in ways to the way google is set up.
>
> Don't Google use indexing servers which keep track of where data
> is? So that you only need to update them when you add or move data;
> deletes don't even have to be propagated among indexes immediately,
> really, because you'll find out if data isn't there when you visit
> where it should be. Or am I talking crap?

That will teach me to RTFA first ;) Ok, so LJ maintain an index of
which cluster each user is on, kind of like Google do :)

>> David
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 8: explain analyze is your friend
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
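
The user-to-cluster index mentioned above can be sketched roughly as follows. This is only a minimal illustration in Python, not LiveJournal's actual code: the class `UserDirectory`, its methods, and the cluster names are all hypothetical, and the in-memory dict stands in for whatever store a real index would live in.

```python
# Hypothetical sketch of a user -> cluster directory.
# All names here are illustrative assumptions, not LiveJournal's code.
from typing import Dict, List


class UserDirectory:
    """Records which database cluster holds each user's data."""

    def __init__(self, clusters: List[str]) -> None:
        self.clusters: List[str] = list(clusters)
        self.user_to_cluster: Dict[str, str] = {}

    def assign(self, user: str) -> str:
        """Place a new user on the least-loaded cluster and record it."""
        if user in self.user_to_cluster:
            return self.user_to_cluster[user]
        # Count users per cluster; ties go to the earliest-listed cluster.
        counts = {c: 0 for c in self.clusters}
        for c in self.user_to_cluster.values():
            counts[c] += 1
        target = min(self.clusters, key=lambda c: counts[c])
        self.user_to_cluster[user] = target
        return target

    def lookup(self, user: str) -> str:
        """Return the cluster to query for this user (KeyError if unknown)."""
        return self.user_to_cluster[user]

    def add_cluster(self, name: str) -> None:
        """Register a new cluster; no existing mapping changes."""
        if name not in self.clusters:
            self.clusters.append(name)

    def move(self, user: str, cluster: str) -> None:
        """Point a user at another cluster after their data has been copied."""
        if cluster not in self.clusters:
            raise ValueError(f"unknown cluster: {cluster}")
        self.user_to_cluster[user] = cluster
```

Because lookups go through the directory rather than a hash of the user name, adding a cluster changes no existing mapping; only new or explicitly moved users land on it. The trade-off is that every request then depends on the directory being available.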