database synchronization
От | Chris Jones |
---|---|
Тема | database synchronization |
Дата | |
Msg-id | a5fg0lb5tpt.fsf@merry.mt.sri.com обсуждение исходный текст |
Список | pgsql-general |
After browsing through the online archive of the thread on database replication, I feel somewhat enlightened. I'm wondering if anybody has any bright suggestions to this problem, which is a subset of the one discussed in the thread: We are looking at deploying a database which will need to be synchronized, but not replicated. In other words, we want several database servers to have copies of the data, with all except the master copy being effectively read-only. If the data on a particular slave is a little bit out of date, it's no big loss. Most clients will read data, but a few will also make changes. Here's one possible low-tech approach: * A script will periodically dump the database on the master. This happens inside a transaction, so we should get a consistent dump. * In some kind of round-robin fashion, each slave will check that a sufficient number of other slaves are running, then bring itself down, fetch the dump, and reload it. * A read-only client will grab a database server out of a pool of them and connect to it. * A read-write client will only accept a connection to the master server. The biggest problem I see with such a solution is that it doesn't scale well: As the database grows, so does the time to dump/restore it. Also, it's reasonable to assume that the number of slave servers will grow at roughly the same rate. An alternative solution would be more like this: * On the master, create a sync_log table, which would contain: log_id serial not null primary key, cmd text not null * On the master, create a set of triggers: on insert/update/delete, insert an SQL statement that will perform that operation. * On each slave, keep a record of the log_id of the most recent sync_log entry which was imported from the master. * Periodically, each slave will connect to the master and grab all the new sync_log entries, and then execute them. This scenario has two problems: The first is writing a pl/pgsql procedure that will do all the right quoting and logic to build that SQL statement. The second problem is that this table is going to get large. Realistically, though, I could keep track of the most recent sync_log entry for each slave, and delete old log entries. I've tried to come up with a more graceful, more space-efficient method of doing this, but it turns out to be a fairly complicated problem. Has somebody else done the homework for this? Chris -- ----------------------------------------------------- chris@mt.sri.com Chris Jones SRI International, Inc.
В списке pgsql-general по дате отправления: