Discussion: The other major HS TODO: standby promotion

The other major HS TODO: standby promotion

From
Josh Berkus
Date:
All,

As far as I'm concerned, the other big "missing feature" for HS is the
ability to promote standbys to become the new master.   If we had that
feature, then HS can be the backbone of a large-scale PostgreSQL
"cloud"; if we don't have it, then HS does not contribute very much to
scalability beyond a couple of servers.

It also seems like it ought to be relatively easy to do, if I understand
the issues correctly.  Please advise me if I understand the three
obstacles for this:

a) for a standby to become a master to another standby, the promoted
standby must be equal to or ahead of the nonpromoted standby in the
replication stream.

b) when we bring a standby up, it comes up on a new timeline.  Since the
other standbys don't have this new timeline, they are incompatible with it.

c) when we promote a standby, it would also need to save all of its
transaction logs until the other standbys connect.

(a) seems easily enough solved by giving two steps: giving the DBA a way
to check where in the replication stream each standby is (I think we
already have this) and by having the re-mastering mechanism check for
regressions in timestamps or the XID sequence.

I can see two ways to tackle (b).  One would be NOT to start a new
timeline (as an option) when we promote the standby.  That method
probably has complications we don't want to get into.

The second method would be by giving standbys a way to "subscribe" to a
new timeline.  This seems like the better approach, as it would
logically be part of the re-mastering command.  What changes would be
required to do this?

(c) can actually already be dealt with by setting an archive_command on
each standby.  Beyond that, I don't think that we really need to do
anything; DBAs can have a choice between archiving logs to allow for
remastering of all standbys, or saving space and bandwidth, and forcing
some standbys to be re-cloned if you run out of time.  It would be nice,
eventually, to have a way to tell PostgreSQL to retain more or less WAL
segments without restarting the server, but I don't see this as critical.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com


Re: The other major HS TODO: standby promotion

From
Itagaki Takahiro
Date:
On Sat, Sep 4, 2010 at 5:02 AM, Josh Berkus <josh@agliodbs.com> wrote:
> As far as I'm concerned, the other big "missing feature" for HS is the
> ability to promote standbys to become the new master.

Absolutely. Users are often disappointed when they learn that
a new base backup is required after failover.

> b) when we bring a standby up, it comes up on a new timeline.  Since the
> other standbys don't have this new timeline, they are incompatible with it.
>
> The second method would be by giving standbys a way to "subscribe" to a
> new timeline.  This seems like the better approach, as it would
> logically be part of the re-mastering command.  What changes would be
> required to do this?

In my test, standby servers succeeded in subscribing to the new master
when I set "recovery_target_timeline" to the new master's timeline.
It worked in some cases, but I'm not sure it works in all cases.

--
Itagaki Takahiro


Re: The other major HS TODO: standby promotion

From
Fujii Masao
Date:
On Sat, Sep 4, 2010 at 5:02 AM, Josh Berkus <josh@agliodbs.com> wrote:
> (a) seems easily enough solved by giving two steps: giving the DBA a way
> to check where in the replication stream each standby is (I think we
> already have this)

Yep, pg_last_xlog_receive_location would help.
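As an illustration of that check, here is a minimal Python sketch that picks
the most advanced standby from the values returned by
pg_last_xlog_receive_location. It assumes the locations have already been
collected as non-NULL "X/Y" hexadecimal strings; the standby names and the
sample values are hypothetical. The two halves are compared as numbers because
the hex fields are not zero-padded, so a plain string comparison can give the
wrong order.

```python
def parse_xlog_location(text):
    """Convert an 'X/Y' hexadecimal location string into a comparable tuple."""
    high, low = text.strip().split("/")
    return (int(high, 16), int(low, 16))


def most_advanced_standby(locations):
    """Return the name of the standby with the highest received location."""
    return max(locations, key=lambda name: parse_xlog_location(locations[name]))


# Hypothetical values, as if collected by running
# SELECT pg_last_xlog_receive_location() on each standby.
received = {
    "standby1": "0/9A000020",
    "standby2": "0/A1000000",
    "standby3": "0/F000000",  # shorter hex field: numerically the lowest
}
print(most_advanced_standby(received))
```

Under these assumptions, the standby selected this way is the only safe
promotion candidate with respect to obstacle (a), since every other standby is
at or behind it in the stream.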

> The second method would be by giving standbys a way to "subscribe" to a
> new timeline.  This seems like the better approach, as it would
> logically be part of the re-mastering command.  What changes would be
> required to do this?

Wait for the new master to archive the timeline history file, set
recovery_target_timeline to 'latest' on the unpromoted standbys, and
restart them. That makes them restore WAL files of the previous
timeline from the archive and then read WAL files of the current one.

> (c) can actually already be dealt with by setting an archive_command on
> each standby.  Beyond that, I don't think that we really need to do
> anything; DBAs can have a choice between archiving logs to allow for
> remastering of all standbys, or saving space and bandwidth, and forcing
> some standbys to be re-cloned if you run out of time.  It would be nice,
> eventually, to have a way to tell PostgreSQL to retain more or less WAL
> segments without restarting the server, but I don't see this as critical.

Or is a facility to register/unregister standbys required?
http://archives.postgresql.org/pgsql-hackers/2010-08/msg01984.php

And we would need to change primary_conninfo in all the unpromoted
standbys before restarting them.
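
To make the procedure concrete, here is a rough Python sketch that generates
the recovery.conf an unpromoted standby might use before it is restarted. The
parameter names (standby_mode, primary_conninfo, restore_command,
recovery_target_timeline) are the real recovery.conf settings; the host name,
port, and archive path are hypothetical, and the helper functions are
illustrative only, not part of PostgreSQL.

```python
def quote_conf_value(value):
    """Single-quote a value for recovery.conf, doubling embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def render_recovery_conf(new_master_host, new_master_port, restore_command):
    """Build recovery.conf text for a standby that should follow the new master."""
    conninfo = "host=%s port=%d" % (new_master_host, new_master_port)
    lines = [
        "standby_mode = 'on'",
        # Point streaming replication at the promoted standby.
        "primary_conninfo = " + quote_conf_value(conninfo),
        # Needed so the timeline history file and older WAL can be restored.
        "restore_command = " + quote_conf_value(restore_command),
        # Follow the newest timeline found in the archive.
        "recovery_target_timeline = 'latest'",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical host and archive location.
print(render_recovery_conf("newmaster.example.com", 5432, "cp /mnt/archive/%f %p"))
```

The generated file only helps once the new master has archived its timeline
history file, so the restart should wait for that step.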

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center