Re: sinval synchronization considered harmful

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: sinval synchronization considered harmful
Дата	26 июля 2011 г. 18:05:26
Msg-id	16876.1311714315@sss.pgh.pa.us обсуждение исходный текст
Ответ на	Re: sinval synchronization considered harmful (Noah Misch <noah@2ndQuadrant.com>)
Ответы	Re: sinval synchronization considered harmful
Список	pgsql-hackers

Дерево обсуждения

Noah Misch <noah@2ndQuadrant.com> writes:
> On Tue, Jul 26, 2011 at 03:40:32PM -0400, Tom Lane wrote:
>> After some further reflection I believe this patch actually is pretty
>> safe, although Noah's explanation of why seems a bit confused.

> Here's the way it can fail:

> 1. Backend enters SIGetDataEntries() with main memory bearing stateP->resetState
> = false, stateP->nextMsgNum = 500, segP->maxMsgNum = 505.  The CPU has those
> latest stateP values in cache, but segP->maxMsgNum is *not* in cache.
> 2. Backend stalls for <long time>.  Meanwhile, other backends issue
> MSGNUMWRAPAROUND - 5 invalidation messages.  Main memory bears
> stateP->resetState = true, stateP->nextMsgNum = 500 - MSGNUMWRAPAROUND,
> segP->maxMsgNum = 500.
> 3. Backend wakes up, uses its cached stateP values and reads segP->maxMsgNum =
> 500 from main memory.  The patch's test finds no need to reset or process
> invalidation messages.

[ squint... ]  Hmm, you're right.  The case where this would break
things is if (some of) the five unprocessed messages relate to some
object we've just locked.  But the initial state you describe would be
valid right after obtaining such a lock.

> That's the theoretical risk I wished to illustrate.  Though this appears
> possible on an abstract x86_64 system, I think it's unrealistic to suppose that
> a dirty cache line could persist *throughout* the issuance of more than 10^9
> invalidation messages on a concrete implementation.

Dirty cache line, maybe not, but what if the assembly code commands the
CPU to load those variables into CPU registers before doing the
comparison?  If they're loaded with maxMsgNum coming in last (or at
least after resetState), I think you can have the problem without any
assumptions about cache line behavior at all.  You just need the process
to lose the CPU at the right time.

If we marked the pointers volatile, we could probably ensure that the
assembly code tests resetState last, but that's not sufficient to avoid
the stale-cache-line risk.
        regards, tom lane

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: sinval synchronization considered harmful