Re: New WAL code dumps core trivially on replay of bad data
От | Heikki Linnakangas |
---|---|
Тема | Re: New WAL code dumps core trivially on replay of bad data |
Дата | |
Msg-id | 5032531F.9090007@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: New WAL code dumps core trivially on replay of bad data (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: New WAL code dumps core trivially on replay of bad data
|
Список | pgsql-hackers |
On 20.08.2012 17:04, Tom Lane wrote: > Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> writes: >> On 18.08.2012 08:52, Amit kapila wrote: >>> I think that missing check of total length has caused this problem. However now this check will be different. > >> That check still exists, in ValidXLogRecordHeader(). However, we now >> allocate the buffer for the whole record before that check, based on >> xl_tot_len, if the record header is split across pages. The theory in >> allocating the buffer is that a bogus xl_tot_len field will cause the >> malloc() to fail, returning NULL, and we treat that the same as a broken >> header. > > Uh, no, you misread it. xl_tot_len is *zero* in this example. The > problem is that RecordIsValid believes xl_len (and backup block size) > even when it exceeds xl_tot_len. Ah yes, I see that now. I think all we need then is a check for xl_tot_len >= SizeOfXLogRecord. >> I think we need to delay the allocation of the record buffer. We need to >> read and validate the whole record header first, like we did before, >> before we trust xl_tot_len enough to call malloc() with it. I'll take a >> shot at doing that. > > I don't believe this theory at all. Overcommit applies to writing on > pages that were formerly shared with the parent process --- it should > not have anything to do with malloc'ing new space. But anyway, this > is not what happened in my example. I was thinking that we might read gigabytes worth of bogus WAL into the memory buffer, if xl_tot_len is bogus and large, e.g 0xffffffff. But now that I look closer, the xlog record is validated after reading the first continuation page, so we should catch a bogus xl_tot_len value at that point. And there is a cross-check with xl_rem_len on every continuation page, too. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: