Re: Proposal: "Causal reads" mode for load balancing reads without stale data
From | Thomas Munro |
---|---|
Subject | Re: Proposal: "Causal reads" mode for load balancing reads without stale data |
Date | |
Msg-id | CAEepm=1Z-E4X8wXEY_VhHAZ=AhN1-Xsj4pFfQycjjMmW5+MRZA@mail.gmail.com |
In reply to | Re: Proposal: "Causal reads" mode for load balancing reads without stale data (Robert Haas <robertmhaas@gmail.com>) |
Replies | Re: Proposal: "Causal reads" mode for load balancing reads without stale data |
List | pgsql-hackers |
On Wed, Mar 30, 2016 at 2:36 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> OK, I committed this, with a few tweaks.  In particular, I added a
> flag variable instead of relying on "latch set" == "need to send
> reply"; the other changes were cosmetic.
>
> I'm not sure how much more of this we can realistically get into 9.6;
> the latter patches haven't had much review yet.  But I'll set this
> back to Needs Review in the CommitFest and we'll see where we end up.
> But even if we don't get anything more than this, it's still rather
> nice: remote_apply turns out to be only slightly slower than remote
> flush, and it's a guarantee that a lot of people are looking for.

Thank you Michael and Robert!

Please find attached the rest of the patch series, rebased against master. The goal of the 0002 patch is to provide an accurate indication of the current replay lag on each standby, visible to users like this:

    postgres=# select application_name, replay_lag from pg_stat_replication;
     application_name │   replay_lag
    ──────────────────┼─────────────────
     replica1         │ 00:00:00.000299
     replica2         │ 00:00:00.000323
     replica3         │ 00:00:00.000319
     replica4         │ 00:00:00.000303
    (4 rows)

It works by maintaining a buffer of (end of WAL, time now) samples received from the primary, and then eventually feeding those times back to the primary when the recovery process replays the corresponding locations.

Compared to approaches based on commit timestamps, this approach has the advantage of providing non-misleading information between commits. For example, if you run a batch load job that takes 1 minute to insert the whole phonebook and no other transactions run, you will see replay_lag updating regularly throughout that minute, whereas typical commit timestamp-only approaches will show an increasing lag time until a commit record is eventually applied. Compared to simple LSN location comparisons, it reports in time rather than bytes of WAL, which can be more meaningful for DBAs.

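As a rough illustration of the sampling scheme described above, here is a minimal Python sketch. It is not the patch's code (the patch is C inside the walreceiver and walsender); the names `LagSampleBuffer`, `record`, `replayed_up_to` and `replay_lag`, the buffer capacity, and the use of plain integers for LSNs and floats for timestamps are all assumptions made for illustration.

```python
from collections import deque


class LagSampleBuffer:
    """Hypothetical standby-side buffer of (end-of-WAL LSN, primary timestamp) samples."""

    def __init__(self, capacity=8192):
        # Bounded buffer: when full, appending drops the oldest sample.
        # The capacity is an arbitrary illustrative value.
        self.samples = deque(maxlen=capacity)

    def record(self, end_lsn, primary_time):
        # Called when WAL arrives: remember the primary's clock reading
        # that accompanied this end-of-WAL position.
        self.samples.append((end_lsn, primary_time))

    def replayed_up_to(self, replay_lsn):
        # Consume every sample whose LSN has now been replayed and return
        # the newest matching timestamp (None if nothing new was replayed).
        newest = None
        while self.samples and self.samples[0][0] <= replay_lsn:
            newest = self.samples.popleft()[1]
        return newest


def replay_lag(primary_now, echoed_time):
    # Primary side: both values come from the primary's own clock, so
    # clock skew between the two servers does not affect the result.
    return primary_now - echoed_time


buf = LagSampleBuffer()
buf.record(100, 10.0)
buf.record(200, 10.5)
buf.record(300, 11.0)

echoed = buf.replayed_up_to(250)   # samples at LSN 100 and 200 are consumed
lag = replay_lag(10.8, echoed)     # primary's clock reads 10.8 when the reply arrives
```

In this sketch the standby never interprets the timestamps; it only echoes them back, so the lag is always the difference between two readings of the primary's clock.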
When the standby is entirely caught up and there is no write activity, the reported time effectively represents the ping time between the servers, and is updated every wal_sender_timeout / 2, when keepalive messages are sent. While new WAL traffic is arriving, the walreceiver records timestamps at most once per second in a circular buffer, and then sends back replies containing the recorded timestamps as fast as the recovery process can apply the corresponding xlog. The lag number you see is computed by the primary server comparing two timestamps generated by its own system clock, one of which has been on a journey to the standby and back.

Accurate lag estimates are a prerequisite for the 0004 patch (about which more later), but I believe users would find this valuable as a feature on its own.

--
Thomas Munro
http://www.enterprisedb.com