Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker
От | Ondřej Jirman |
---|---|
Тема | Re: BUG #16129: Segfault in tts_virtual_materialize in logicalreplication worker |
Дата | |
Msg-id | 20191122173204.hevpm4hgpk2tiyl6@core.my.home обсуждение исходный текст |
Ответ на | Re: BUG #16129: Segfault in tts_virtual_materialize in logical replication worker (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-bugs |
On Fri, Nov 22, 2019 at 11:50:03AM -0500, Tom Lane wrote: > Tomas Vondra <tomas.vondra@2ndquadrant.com> writes: > > On Fri, Nov 22, 2019 at 01:54:01AM +0100, Ondřej Jirman wrote: > >> On Thu, Nov 21, 2019 at 07:45:34PM -0500, Tom Lane wrote: > >>> =?utf-8?Q?Ond=C5=99ej?= Jirman <ienieghapheoghaiwida@xff.cz> writes: > >>>> newtup.changed > >>>> {true, true, false, true, true, true, true, true, false <repeats 1656 times>} > > >>> So column 3 is not getting replaced. > > > Can you show us the attribute list as defined in the system, including > > e.g. dropped columns? That is, something like > > SELECT attnum, attname, atttypid FROM pg_attribute > > WHERE attrelid = 'public.videos'::regclass; > > both from the published and subscriber. > > After digging around a bit more in the logrep code, it seems like we'd > only have a situation with newtup.changed[i] = false if the publisher > had hit this bit in logicalrep_write_tuple: > > else if (att->attlen == -1 && VARATT_IS_EXTERNAL_ONDISK(values[i])) > { > pq_sendbyte(out, 'u'); /* unchanged toast column */ > continue; > } > > So this seems to indicate that Ondřej's case involves an update on a row > that contained a toasted-out-of-line column that did not get updated. > That's different from the trigger conditions that I postulated, but the > end result is the same: slot_modify_cstrings would have a case where > it's supposed to retain the old data for a pass-by-ref column, and it'd > fail to do that correctly. Yes, it was UPDATE videos SET played = TRUE; on a row that had 38kB BYTEA value in cover_image. Thank you both, Tom and Tomas, for your help. regards, o. > I also discovered that before v12, the calling code looked like this: > > ExecStoreTuple(localslot->tts_tuple, remoteslot, InvalidBuffer, false); > slot_modify_cstrings(remoteslot, rel, newtup.values, newtup.changed); > > so that at that time, "remoteslot" indeed did not have its own local > copy of the tuple, and thus the code failed to fail. Perhaps this was > intentional on the part of whoever wrote this, but it's still an > unacceptable abuse of the API if you ask me. > > I added some tests involving dropped columns to what I had last night > and pushed it. However, looking at this now, I see that we really > still don't have enough coverage --- the code path I showed above > is not hit by the regression tests, according to the code coverage > bot. I don't think it's really acceptable for such a corner case > bit of the logrep protocol to not get tested :-( > > regards, tom lane
В списке pgsql-bugs по дате отправления: