Re: Reviewing freeze map code

From: Andres Freund
Subject: Re: Reviewing freeze map code
Date:
Msg-id: 20160623224224.ubztdvhfx6bwkphu@alap3.anarazel.de
In reply to: Re: Reviewing freeze map code  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Reviewing freeze map code  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-hackers
On 2016-06-21 16:32:03 -0400, Robert Haas wrote:
> On Tue, Jun 21, 2016 at 3:46 PM, Andres Freund <andres@anarazel.de> wrote:
> > On 2016-06-21 15:38:25 -0400, Robert Haas wrote:
> >> On Tue, Jun 21, 2016 at 1:49 PM, Andres Freund <andres@anarazel.de> wrote:
> >> >> I'm also a bit dubious that LockAcquire is safe to call in general
> >> >> with interrupts held.
> >> >
> >> > Looks like we could just acquire the tuple-lock *before* doing the
> >> > toast_insert_or_update/RelationGetBufferForTuple, but after releasing
> >> > the buffer lock. That'd allow us to do avoid doing the nested locking,
> >> > should make the recovery just a goto l2;, ...
> >>
> >> Why isn't that racey?  Somebody else can grab the tuple lock after we
> >> release the buffer content lock and before we acquire the tuple lock.
> >
> > Sure, but by the time the tuple lock is released, they'd have updated
> > xmax. So once we acquired that we can just do
> >                 if (xmax_infomask_changed(oldtup.t_data->t_infomask,
> >                                                                   infomask) ||
> >                         !TransactionIdEquals(HeapTupleHeaderGetRawXmax(oldtup.t_data),
> >                                                                  xwait))
> >                         goto l2;
> > which is fine, because we've not yet done the toasting.
> 
> I see.
> 
> > I'm not sure whether this approach is better than deleting potentially
> > toasted data though. It's probably faster, but will likely touch more
> > places in the code, and eat up an infomask bit (infomask & HEAP_MOVED
> > == HEAP_MOVED in my prototype).
> 
> Ugh.  That's not very desirable at all.

I'm looking into three approaches right now:

1) Flag approach from above
2) Undo toasting on concurrent activity, retry
3) Use WAL logging for the already_marked = true case.


1) primarily suffers from a significant amount of complexity. I still
have a bug in there that sometimes triggers "attempted to update
invisible tuple" ERRORs.  Otherwise it seems to perform decently -
even on workloads with many backends hitting the same tuple, the
retry rate is low.

2) Seems to work too, but due to the amount of time the tuple is left
unlocked, the retry rate can be really high. As we perform a significant
amount of work (toast insertion & index manipulation, or extending a
file) while the tuple is not locked, it's quite likely that another
session tries to modify the tuple in between.  I think it's possible to
essentially livelock.

3) This approach so far seems the best. It's possible to reuse the
xl_heap_lock record (in an AFAICS backwards-compatible manner), and in
most cases the overhead isn't that large.  It's of course annoying to
emit more WAL, but it's not that big an overhead compared to extending a
file, or to toasting.  It's also by far the simplest fix.


Comments?


