Re: BUG #17055: Logical replication worker crashes when applying update of row that dose not exist in target partiti
От | Tom Lane |
---|---|
Тема | Re: BUG #17055: Logical replication worker crashes when applying update of row that dose not exist in target partiti |
Дата | |
Msg-id | 2138662.1623460441@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: BUG #17055: Logical replication worker crashes when applying update of row that dose not exist in target partiti (Michael Paquier <michael@paquier.xyz>) |
Ответы |
Re: BUG #17055: Logical replication worker crashes when applying update of row that dose not exist in target partiti
|
Список | pgsql-bugs |
Michael Paquier <michael@paquier.xyz> writes: > On Fri, Jun 11, 2021 at 08:31:30PM -0400, Tom Lane wrote: >> AFAICS, none of the other callers of sub reload are doing anything >> special to wait for it to take effect. Not sure why this one >> would need more protection. If we do need a fix for that, likely >> sub reload itself had better do it. > Perhaps. I looked around a bit, and I think that we don't need to sweat about this. Certainly we've not identified any buildfarm failures related to reload not happening fast enough. I think the sequence of events is: * pg_ctl will deliver the SIGHUP signal to the postmaster before returning. * Although surely the postmaster might not process that signal instantly, we can expect that it will do so before it accepts another connection request. Since the tests cannot detect whether the reload has happened without issuing a new command (which, in all these tests, involves launching psql or pgbench or pg_basebackup or what-have-you, which needs a new connection), the reload will appear to have happened so far as the postmaster itself is concerned, and therefore also in any new backend it spawns. * This does seem to leave in question whether a pre-existing background process such as a logrep worker will get the word fast enough. However, I think the same principle applies there. The postmaster will have turned around and signaled the worker process while it processes the SIGHUP for itself. While that again doesn't ensure instantaneous update of the worker's state, we can expect that it will notice the SIGHUP before noticing any newly-arrived replication data, which is Good Enough. This argument does assume that we know the worker is idle when we issue the reload ... but we already waited for it to be idle. Perhaps the buildfarm will prove me wrong, but I think we don't have to do anything. And if we did want to do something, what would it be? There is no feedback mechanism to let us observe when the worker processed the signal. >> This coding is also stolen verbatim from other test scripts. >> What is your concern about it exactly? > My main concern are future changes in this test suite that could cause > more queries to generate logs that overlap with this sequence of > tests, falsifying what this test is checking. Oh, I see, you wonder if there could already be entries like these in the file. But since we only turned the log level up for the duration of the specific test, it doesn't seem like a problem. (I did think of this to the extent of making sure the second test group in 013_partition.pl looks for different messages than the first group does.) regards, tom lane
В списке pgsql-bugs по дате отправления: