Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker
От | Tom Lane |
---|---|
Тема | Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker |
Дата | |
Msg-id | 26015.1574441403@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Ответы |
Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker
|
Список | pgsql-bugs |
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > On Fri, Nov 22, 2019 at 01:54:01AM +0100, Ondřej Jirman wrote: >> On Thu, Nov 21, 2019 at 07:45:34PM -0500, Tom Lane wrote: >>> =?utf-8?Q?Ond=C5=99ej?= Jirman <ienieghapheoghaiwida@xff.cz> writes: >>>> newtup.changed >>>> {true, true, false, true, true, true, true, true, false <repeats 1656 times>} >>> So column 3 is not getting replaced. > Can you show us the attribute list as defined in the system, including > e.g. dropped columns? That is, something like > SELECT attnum, attname, atttypid FROM pg_attribute > WHERE attrelid = 'public.videos'::regclass; > both from the published and subscriber. After digging around a bit more in the logrep code, it seems like we'd only have a situation with newtup.changed[i] = false if the publisher had hit this bit in logicalrep_write_tuple: else if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(values[i])) { pq_sendbyte(out, 'u'); /* unchanged toast column */ continue; } So this seems to indicate that Ondřej's case involves an update on a row that contained a toasted-out-of-line column that did not get updated. That's different from the trigger conditions that I postulated, but the end result is the same: slot_modify_cstrings would have a case where it's supposed to retain the old data for a pass-by-ref column, and it'd fail to do that correctly. I also discovered that before v12, the calling code looked like this: ExecStoreTuple(localslot->tts_tuple, remoteslot, InvalidBuffer, false); slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed); so that at that time, "remoteslot" indeed did not have its own local copy of the tuple, and thus the code failed to fail. Perhaps this was intentional on the part of whoever wrote this, but it's still an unacceptable abuse of the API if you ask me. I added some tests involving dropped columns to what I had last night and pushed it. However, looking at this now, I see that we really still don't have enough coverage --- the code path I showed above is not hit by the regression tests, according to the code coverage bot. I don't think it's really acceptable for such a corner case bit of the logrep protocol to not get tested :-( regards, tom lane
В списке pgsql-bugs по дате отправления: