Re: Proposal: "Causal reads" mode for load balancing reads without stale data
From | Thomas Munro |
---|---|
Subject | Re: Proposal: "Causal reads" mode for load balancing reads without stale data |
Date | |
Msg-id | CAEepm=1UrDptgt+GCzrWvzQ79ELqKoSeGnOdiSaMUVvpPdwh0w@mail.gmail.com |
In response to | Re: Proposal: "Causal reads" mode for load balancing reads without stale data (Michael Paquier <michael.paquier@gmail.com>) |
Responses | Re: Proposal: "Causal reads" mode for load balancing reads without stale data |
List | pgsql-hackers |
On Tue, Mar 29, 2016 at 2:28 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Mar 28, 2016 at 10:08 PM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> On Tue, Mar 29, 2016 at 1:56 AM, Thomas Munro
>> <thomas.munro@enterprisedb.com> wrote:
>>> On Mon, Mar 28, 2016 at 8:54 PM, Michael Paquier
>>> <michael.paquier@gmail.com> wrote:
>>>> I have been also thinking a lot about this patch, and the fact that
>>>> the WAL receiver latch is being used within the internals of
>>>> libpqwalreceiver has been bugging me a lot, because this makes the
>>>> wait phase happening within the libpqwalreceiver depend on something
>>>> that only the WAL receiver had a only control on up to now (among the
>>>> things thought: having a second latch for libpqwalreceiver, having an
>>>> event interface for libpqwalreceiver, switch libpq_receive into being
>>>> asynchronous...).
>>>
>>> Yeah, it bugs me too. Do you prefer this?
>>>
>>> int walrcv_receive(char **buffer, int *wait_fd);
>>>
>>> Return value -1 means end-of-copy as before, return value 0 means "no
>>> data available now, please call me again when *wait_fd is ready to
>>> read". Then walreceiver.c can look after the WaitLatchOrSocket call
>>> and deal with socket readiness, postmaster death, timeout and latch,
>>> and libpqwalreceiver.c doesn't know anything about all that stuff
>>> anymore, but it is now part of the interface that it must expose a
>>> file descriptor for readiness testing when it doesn't have data
>>> available.
>>>
>>> Please find attached a new patch series which does it that way.
>>
>> Oops, there is a bug in the primary disconnection case when len == 1
>> and it breaks out of the loop and wait_fd is invalid. I'll follow up
>> on that tomorrow, but I'm interested to hear your thoughts (and anyone
>> else's!) on that interface change and general approach.
>
> I definitely prefer that, that's neater! libpq_select could be
> simplified because a timeout does not matter much.

Ok, here is a new version that exits the streaming loop correctly when
endofwal becomes true. To hit that codepath you have to set up a
cascading standby with recovery_target_timeline = 'latest', and then
promote the standby it's talking to. I also got rid of the
PostmasterIsAlive() check which became superfluous.

You're right that libpq_select is now only ever called with timeout = -1
so could theoretically lose the parameter, but I decided against
cluttering this patch up by touching that for now. It seems like the
only reason it's used by libpqrcv_PQexec is something to do with
interrupts on Windows, which I'm not able to test so that was another
reason not to touch it. (BTW, isn't the select call in libpq_select
lacking an exceptfds set, and can't it therefore block forever when
there is an error condition on the socket and no timeout?)

--
Thomas Munro
http://www.enterprisedb.com
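
For readers following the interface discussion in the quoted text, here is a
minimal, self-contained sketch of the calling convention being proposed: the
receive function returns -1 at end of stream, 0 together with a file
descriptor when no data is available yet, and a positive length otherwise,
while the caller owns the wait. This is not the patch's code. The names
demo_receive and demo_stream_loop, the extra src_fd parameter, and the use
of plain poll() in place of WaitLatchOrSocket are all assumptions made so
the example can run outside PostgreSQL.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

static char demo_buf[1024];

/*
 * Hypothetical stand-in for the proposed
 *     int walrcv_receive(char **buffer, int *wait_fd);
 * src_fd exists only in this demo; it must be a non-blocking descriptor.
 * Returns >0 (bytes available in *buffer), 0 (no data now, wait on
 * *wait_fd), or -1 (end of stream).
 */
static int
demo_receive(int src_fd, char **buffer, int *wait_fd)
{
    ssize_t n;

    assert(buffer != NULL && wait_fd != NULL);

    n = read(src_fd, demo_buf, sizeof(demo_buf));
    if (n > 0)
    {
        *buffer = demo_buf;
        return (int) n;
    }
    if (n == 0)
        return -1;              /* writer closed: end of stream */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
    {
        *wait_fd = src_fd;      /* tell the caller what to wait on */
        return 0;
    }
    return -1;                  /* sketch: treat other errors as end of stream */
}

/*
 * Caller-side loop. All waiting policy (here only a timeout) lives in the
 * caller, not in the receive function. Returns the total number of bytes
 * seen before end of stream, -2 on timeout, or -3 on a poll() failure.
 */
static long
demo_stream_loop(int src_fd, int timeout_ms)
{
    long total = 0;

    for (;;)
    {
        char *buf = NULL;
        int wait_fd = -1;
        int len = demo_receive(src_fd, &buf, &wait_fd);

        if (len > 0)
        {
            total += len;       /* a real caller would process buf here */
            continue;
        }
        if (len < 0)
            return total;       /* end of stream: leave the loop */

        /* len == 0: wait until wait_fd is readable, then call again */
        struct pollfd pfd = { .fd = wait_fd, .events = POLLIN };
        int rc = poll(&pfd, 1, timeout_ms);

        if (rc == 0)
            return -2;
        if (rc < 0 && errno != EINTR)
            return -3;
    }
}
```

The point of the shape is that the end-of-stream check happens before any
use of wait_fd, so the loop never waits on a descriptor that was not set.
In the real walreceiver the wait would also have to cover the latch and
postmaster death, which poll() alone does not model.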