Re: Reviewing freeze map code

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Reviewing freeze map code
Дата
Msg-id 20160620193302.s4dbul6japgar5eg@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: Reviewing freeze map code  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Reviewing freeze map code  (Robert Haas <robertmhaas@gmail.com>)
Re: Reviewing freeze map code  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On 2016-06-15 08:56:52 -0400, Robert Haas wrote:
> Yikes: heap_update() sets the tuple's XMAX, CMAX, infomask, infomask2,
> and CTID without logging anything or clearing the all-visible flag and
> then releases the lock on the heap page to go do some more work that
> might even ERROR out.  Only if that other work all goes OK do we
> relock the page and perform the WAL-logged actions.
> 
> That doesn't seem like a good idea even in existing releases, because
> you've taken a tuple on an all-visible page and made it not
> all-visible, and you've made a page modification that is not
> necessarily atomic without logging it.

Right, that's broken.


> I'm not sure what to do about this: this part of the heap_update()
> logic has been like this forever, and I assume that if it were easy to
> refactor this away, somebody would have done it by now.

Well, I think generally nobody seriously looked at actually refactoring
heap_update(), even though that'd be a good idea.  But in this instance,
the problem seems relatively fundamental:

We need to lock the origin page, to do visibility checks, etc. Then we
need to figure out the target page. Even disregarding toasting - which
we could be doing earlier with some refactoring - we're going to have to
release the page level lock, to lock them in ascending order. Otherwise
we'll risk kinda likely deadlocks.  We also certainly don't want to nest
the lwlocks for the toast stuff, inside a content lock for the old
tupe's page.

So far the best idea I have - and it's really not a good one - is to
invent a new hint-bit that tells concurrent updates to acquire a
heavyweight tuple lock, while releasing the page-level lock. If that
hint bit does not require any other modifications - and we don't need an
xid in xmax for this use case - that'll avoid doing all the other
`already_marked` stuff early, which should address the correctness
issue.  It's kinda invasive though, and probably has performance
implications.

Does anybody have a better idea?

Regards,

Andres



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "David G. Johnston"
Дата:
Сообщение: Re: parallel.c is not marked as test covered
Следующее
От: "David G. Johnston"
Дата:
Сообщение: Re: 10.0