Re: New WAL code dumps core trivially on replay of bad data
From | Tom Lane |
---|---|
Subject | Re: New WAL code dumps core trivially on replay of bad data |
Date | |
Msg-id | 28954.1345471492@sss.pgh.pa.us |
In reply to | Re: New WAL code dumps core trivially on replay of bad data (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses | Re: New WAL code dumps core trivially on replay of bad data, Re: New WAL code dumps core trivially on replay of bad data |
List | pgsql-hackers |
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> On 18.08.2012 08:52, Amit kapila wrote:
>> I think that missing check of total length has caused this problem.
>> However now this check will be different.

> That check still exists, in ValidXLogRecordHeader(). However, we now
> allocate the buffer for the whole record before that check, based on
> xl_tot_len, if the record header is split across pages. The theory in
> allocating the buffer is that a bogus xl_tot_len field will cause the
> malloc() to fail, returning NULL, and we treat that the same as a broken
> header.

Uh, no, you misread it. xl_tot_len is *zero* in this example. The
problem is that RecordIsValid believes xl_len (and backup block size)
even when it exceeds xl_tot_len.

> I think we need to delay the allocation of the record buffer. We need to
> read and validate the whole record header first, like we did before,
> before we trust xl_tot_len enough to call malloc() with it. I'll take a
> shot at doing that.

I don't believe this theory at all. Overcommit applies to writing on
pages that were formerly shared with the parent process --- it should
not have anything to do with malloc'ing new space. But anyway, this is
not what happened in my example.

			regards, tom lane