Thread: AW: AW: AW: AW: WAL-based allocation of XIDs is insecure

AW: AW: AW: AW: WAL-based allocation of XIDs is insecure

From
Zeugswetter Andreas SB
Date:
> After thinking about this a little, I believe I see why Vadim did it
> the way he did.  Suppose we tried to make the code sequence be
> 
>     obtain write lock on buffer;
>     XLogOriginalPage(buffer);   // copy page to xlog if first since ckpt
>     modify buffer;
>     XLogInsert(xlog entry for modification);
>     mark buffer dirty and release write lock;
> 
> so that the saving of the original page is a separate xlog entry from
> the modification data.  Looks easy, and it'd sure simplify XLogInsert
> a lot.  The only problem is it's wrong.  What if a checkpoint occurs
> between the two XLOG records?
> 
> The decision whether to log the whole buffer has to be atomic with the
> actual entry of the xlog record.  Unless we want to hold the xlog insert
> lock for the entire time that we're (eg) splitting a btree page, that
> means we log the buffer after the modification work is done, not before.

Yes, I see. I can't currently come up with a workaround either. Hmm ..
Duplicating the buffer is probably not a workable solution.
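
To make the race in the quoted text concrete, here is a minimal toy model in Python. It is only a sketch: `Wal`, `Page`, `redo_lsn`, `two_step`, and `atomic` are hypothetical names invented for illustration and are not PostgreSQL's actual structures or functions. It assumes recovery replays only records written after the latest checkpoint, and that a single function call stands in for "holding the xlog insert lock".

```python
# Toy model only: every name here is an illustrative assumption,
# not PostgreSQL's real WAL code.
from dataclasses import dataclass, field


@dataclass
class Page:
    data: str = "old"
    lsn: int = 0          # position of the last log record that touched the page


@dataclass
class Wal:
    records: list = field(default_factory=list)
    redo_lsn: int = 0     # log position at which the latest checkpoint was taken

    def append(self, record):
        self.records.append(record)
        return len(self.records)      # 1-based position of the new record

    def checkpoint(self):
        # Recovery would replay only the records after this position.
        self.redo_lsn = len(self.records)

    def needs_full_image(self, page):
        # True if the page has not been logged since the latest checkpoint.
        return page.lsn <= self.redo_lsn


def two_step(wal, page, new_data, checkpoint_between=False):
    # Step 1: save the original page as a separate record.
    if wal.needs_full_image(page):
        page.lsn = wal.append(("image", page.data))
    if checkpoint_between:
        wal.checkpoint()              # the race: checkpoint lands between the records
    # Step 2: log the modification as a second record.
    page.data = new_data
    page.lsn = wal.append(("change", new_data))


def atomic(wal, page, new_data):
    # Modify first, then decide and insert in one step, so no checkpoint
    # can slip in between the decision and the record.
    page.data = new_data
    image = page.data if wal.needs_full_image(page) else None
    page.lsn = wal.append(("change", new_data, image))


def image_after_checkpoint(wal):
    # Does any record that recovery would replay carry a full page image?
    replayed = wal.records[wal.redo_lsn:]
    return any(r[0] == "image" or (len(r) == 3 and r[2] is not None)
               for r in replayed)


wal, page = Wal(), Page()
two_step(wal, page, "new", checkpoint_between=True)
print("two-step, checkpoint in between:", image_after_checkpoint(wal))

wal, page = Wal(), Page()
atomic(wal, page, "new")
print("atomic decision and insert:", image_after_checkpoint(wal))
```

In the two-step variant the page image ends up before the checkpoint, so the records replayed after a crash contain only the change and no full page. In the atomic variant the image always travels with the first record after a checkpoint.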

I do not however see how the current solution fixes the original problem,
that we don't have a rollback for index modifications.
The index would potentially point to an empty heaptuple slot.
When this slot, because it is marked empty, is reused after startup, the index
points to the wrong record.
Unless of course startup rollforward visits all heap pages pointed at
by index xlog records and inserts a tuple into heap marked deleted.

Additionally I do not see how this all works for userland index types.

In short I do not think that the current implementation of "physical log" does
what it was intended to do :-(

Andreas


Re: AW: AW: AW: AW: WAL-based allocation of XIDs is insecure

From
Tom Lane
Date:
Zeugswetter Andreas SB  <ZeugswetterA@wien.spardat.at> writes:
> I do not however see how the current solution fixes the original problem,
> that we don't have a rollback for index modifications.
> The index would potentially point to an empty heaptuple slot.

How?  There will be an XLOG entry inserting the heap tuple before the
XLOG entry that updates the index.  Rollforward will redo both.  The
heap tuple might not get committed, but it'll be there.
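
The ordering argument above can be illustrated with a small replay sketch in Python. This is a toy model under stated assumptions: the record shapes (`heap_insert`, `index_insert`, `commit`) and the `redo` function are hypothetical and exist only for illustration. It assumes that the heap record is always written to the log before the index record that refers to it, and that rollforward applies records strictly in log order.

```python
# Toy model only: record layouts and names are illustrative assumptions.
def redo(log):
    heap = {}          # slot number -> (xid, payload)
    index = {}         # key -> slot number
    committed = set()  # transaction ids with a commit record
    for record in log:
        kind, xid, *rest = record
        if kind == "heap_insert":
            slot, payload = rest
            heap[slot] = (xid, payload)
        elif kind == "index_insert":
            key, slot = rest
            index[key] = slot
        elif kind == "commit":
            committed.add(xid)
    return heap, index, committed


# The heap record precedes the index record; the crash happens before
# any commit record for transaction 7 is written.
log = [
    ("heap_insert", 7, 1, "row A"),
    ("index_insert", 7, "A", 1),
]

heap, index, committed = redo(log)
dangling = [key for key, slot in index.items() if slot not in heap]
print("dangling index entries:", dangling)
print("transaction 7 committed:", 7 in committed)
```

Under these assumptions, whichever prefix of the log survives the crash, an index entry is replayed only after the heap tuple it points to, so the slot is occupied (by an uncommitted tuple) rather than empty.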

> Additionally I do not see how this all works for userland index types.

None of it works for index types that don't do XLOG entries (which I
think may currently be true for everything except btree :-( ...).  I
don't see how that changes if we alter the way this bit is done.
        regards, tom lane