Discussion: the s_lock_stuck on perform_spin_delay

the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi,

from src/backend/storage/lmgr/README:

"""
Spinlocks.  These are intended for *very* short-term locks.  If a lock
is to be held more than a few dozen instructions, or across any sort of
kernel call (or even a call to a nontrivial subroutine), don't use a
spinlock. Spinlocks are primarily used as infrastructure for lightweight
locks.
"""

I totally agree with this, and IIUC spinlocks are usually used together with
the following functions:

#define init_local_spin_delay(status) ..
void perform_spin_delay(SpinDelayStatus *status);
void finish_spin_delay(SpinDelayStatus *status);
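
A typical caller looks roughly like this (just a sketch of the usual
pattern; the loop in s_lock() is essentially the same shape, and 'lock'
stands for some volatile slock_t pointer):

    SpinDelayStatus delayStatus;

    init_local_spin_delay(&delayStatus);

    /* spin until the test-and-set succeeds, backing off as we go */
    while (TAS_SPIN(lock))
        perform_spin_delay(&delayStatus);

    finish_spin_delay(&delayStatus);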

perform_spin_delay() contains the following code:

void
perform_spin_delay(SpinDelayStatus *status)

    /* Block the process every spins_per_delay tries */
    if (++(status->spins) >= spins_per_delay)
    {
        if (++(status->delays) > NUM_DELAYS)
            s_lock_stuck(status->file, status->line, status->func);

s_lock_stuck() will PANIC the entire system.

My question is: if someone breaks the rule by mistake (everyone makes
mistakes), should we PANIC in a production environment? IMO it could be a
WARNING in a production environment, and keep the stuck-lock PANIC only
when built with 'ifdef USE_ASSERT_CHECKING'.

People may think a spinlock would consume too much CPU, but that is not true
in the scenario discussed here, since perform_spin_delay() calls pg_usleep():
MIN_DELAY_USEC is 0.001s and MAX_DELAY_USEC is 1 second.

I actually noticed this issue because of the patch "Cache relation
sizes?" from Thomas Munro [1], where the latest patch [2] still has the
following code:
+        sr = smgr_alloc_sr();  <-- HERE a spinlock is held
+
+        /* Upgrade to exclusive lock so we can create a mapping. */
+        LWLockAcquire(mapping_lock, LW_EXCLUSIVE); <-- HERE a complex
  operation is needed; it may take a long time.

Our internal testing system found more misuses in our own PG version.

If an experienced engineer like Thomas can make this mistake, and the patch
was reviewed by three people yet the bug is still there, it is not enough to
just say "don't do it".

The attached code shows the prototype I have in mind. Any feedback is welcome.

[1]
https://www.postgresql.org/message-id/CA%2BhUKGJg%2BgqCs0dgo94L%3D1J9pDp5hKkotji9A05k2nhYQhF4%2Bw%40mail.gmail.com
[2]
https://www.postgresql.org/message-id/attachment/123659/v5-0001-WIP-Track-relation-sizes-in-shared-memory.patch  

-- 
Best Regards
Andy Fan


Attachments

Re: the s_lock_stuck on perform_spin_delay

From
Matthias van de Meent
Date:
On Thu, 4 Jan 2024 at 08:09, Andy Fan <zhihuifan1213@163.com> wrote:
>
> My question is if someone doesn't obey the rule by mistake (everyone
> can make mistake), shall we PANIC on a production environment? IMO I
> think it can be a WARNING on a production environment and be a stuck
> when 'ifdef USE_ASSERT_CHECKING'.
> [...]
> I think a experienced engineer like Thomas can make this mistake and the
> patch was reviewed by 3 peoples, the bug is still there. It is not easy
> to say just don't do it.
>
> the attached code show the prototype in my mind. Any feedback is welcome.

While I understand your point and could maybe agree with the change
itself (a crash isn't great), I don't think it's an appropriate fix
for the problem of holding a spinlock while waiting for a LwLock, as
spin.h specifically mentions the following (and you quoted the same):

"""
Keep in mind the coding rule that spinlocks must not be held for more
than a few instructions.
"""

I suspect that we'd be better off with stronger protections against
waiting for LwLocks while we hold any spin lock. More specifically,
I'm thinking about something like tracking how many spin locks we
hold, and Assert()-ing that we don't hold any such locks when we start
to wait for an LwLock or run CHECK_FOR_INTERRUPTS-related code (with
potential manual contextual overrides in specific areas of code where
specific care has been taken to make it safe to hold spin locks while
doing those operations - although I consider their existence unlikely
I can't rule them out as I've yet to go through all lock-touching
code). This would probably work in a similar manner as
START_CRIT_SECTION/END_CRIT_SECTION.
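
Very roughly, I am imagining bookkeeping along these lines (all names here
are invented for illustration, not a concrete API proposal):

    #ifdef USE_ASSERT_CHECKING
    extern int  SpinLocksHeld;

    #define REGISTER_SPIN_LOCK()    (SpinLocksHeld++)
    #define UNREGISTER_SPIN_LOCK() \
        do { Assert(SpinLocksHeld > 0); SpinLocksHeld--; } while (0)
    #define ASSERT_NO_SPIN_LOCKS_HELD() Assert(SpinLocksHeld == 0)
    #else
    #define REGISTER_SPIN_LOCK()        ((void) 0)
    #define UNREGISTER_SPIN_LOCK()      ((void) 0)
    #define ASSERT_NO_SPIN_LOCKS_HELD() ((void) 0)
    #endif

with ASSERT_NO_SPIN_LOCKS_HELD() placed at the start of LWLockAcquire() and
in the CHECK_FOR_INTERRUPTS() path.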

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)



Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Thu, Jan 4, 2024 at 2:09 AM Andy Fan <zhihuifan1213@163.com> wrote:
> My question is if someone doesn't obey the rule by mistake (everyone
> can make mistake), shall we PANIC on a production environment? IMO I
> think it can be a WARNING on a production environment and be a stuck
> when 'ifdef USE_ASSERT_CHECKING'.
>
> People may think spin lock may consume too much CPU, but it is not true
> in the discussed scene since perform_spin_delay have pg_usleep in it,
> and the MAX_DELAY_USEC is 1 second and MIN_DELAY_USEC is 0.001s.
>
> I notice this issue actually because of the patch "Cache relation
> sizes?" from Thomas Munro [1], where the latest patch[2] still have the
> following code.
> +               sr = smgr_alloc_sr();  <-- HERE a spin lock is hold
> +
> +               /* Upgrade to exclusive lock so we can create a mapping. */
> +               LWLockAcquire(mapping_lock, LW_EXCLUSIVE); <-- HERE a complex
>   operation is needed. it may take a long time.

I'm not sure that the approach this patch takes is correct in detail,
but I kind of agree with you about the overall point. I mean, the idea
of the PANIC is to avoid having the system just sit there in a state
from which it will never recover ... but it can also have the effect
of killing a system that wasn't really dead. I'm not sure what the
best thing to do here is, but it's worth talking about, IMHO.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi Matthias and Robert,

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

> On Thu, 4 Jan 2024 at 08:09, Andy Fan <zhihuifan1213@163.com> wrote:
>>
>> My question is if someone doesn't obey the rule by mistake (everyone
>> can make mistake), shall we PANIC on a production environment? IMO I
>> think it can be a WARNING on a production environment and be a stuck
>> when 'ifdef USE_ASSERT_CHECKING'.
>> [...]
>> I think a experienced engineer like Thomas can make this mistake and the
>> patch was reviewed by 3 peoples, the bug is still there. It is not easy
>> to say just don't do it.
>>
>> the attached code show the prototype in my mind. Any feedback is welcome.
>
> While I understand your point and could maybe agree with the change
> itself (a crash isn't great),

It's great that both of you agree that the crash is not great. 

> I don't think it's an appropriate fix
> for the problem of holding a spinlock while waiting for a LwLock, as
> spin.h specifically mentions the following (and you quoted the same):
>
> """
> Keep in mind the coding rule that spinlocks must not be held for more
> than a few instructions.
> """

Yes, I agree that the known cases of [LW]LockAcquire after holding a
spinlock should be fixed at the first opportunity rather than papered over
by my previous patch. My previous patch just takes care of the *unknown*
cases (and I cc'd Thomas in the hope that he knows about the bug). I also
agree that the [LW]LockAcquire case should be detected more effectively,
as you suggested below. So here is v2, and commit 2 addresses this
suggestion.

>
> I suspect that we'd be better off with stronger protections against
> waiting for LwLocks while we hold any spin lock. More specifically,
> I'm thinking about something like tracking how many spin locks we
> hold, and Assert()-ing that we don't hold any such locks when we start
> to wait for an LwLock or run CHECK_FOR_INTERRUPTS-related code (with
> potential manual contextual overrides in specific areas of code where
> specific care has been taken to make it safe to hold spin locks while
> doing those operations - although I consider their existence unlikely
> I can't rule them out as I've yet to go through all lock-touching
> code). This would probably work in a similar manner as
> START_CRIT_SECTION/END_CRIT_SECTION.
>
> Kind regards,
>
> Matthias van de Meent
> Neon (https://neon.tech)


-- 
Best Regards
Andy Fan


Attachments

Re: the s_lock_stuck on perform_spin_delay

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I'm not sure that the approach this patch takes is correct in detail,
> but I kind of agree with you about the overall point. I mean, the idea
> of the PANIC is to avoid having the system just sit there in a state
> from which it will never recover ... but it can also have the effect
> of killing a system that wasn't really dead. I'm not sure what the
> best thing to do here is, but it's worth talking about, IMHO.

I'm not a fan of adding overhead to such a performance-critical
thing as spinlocks in order to catch coding errors that are easily
detectable statically.  IMV the correct usage of spinlocks is that
they should only be held across *short, straight line* code segments.
We should be making an effort to ban coding patterns like
"return with spinlock still held", because they're just too prone
to errors similar to this one.  Note that trying to take another
lock is far from the only bad thing that can happen if you're
not very conservative about what code can execute with a spinlock
held.
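
For illustration, the sort of pattern to ban (hypothetical code; MyShared
is an invented struct with an slock_t mutex and an int count):

    /* BAD: an early return leaves the spinlock held */
    static bool
    consume_token_bad(MyShared *shared)
    {
        SpinLockAcquire(&shared->mutex);
        if (shared->count == 0)
            return false;           /* oops: spinlock never released */
        shared->count--;
        SpinLockRelease(&shared->mutex);
        return true;
    }

    /* OK: one acquire/release pair, no control flow escaping in between */
    static bool
    consume_token_ok(MyShared *shared)
    {
        bool        ok;

        SpinLockAcquire(&shared->mutex);
        ok = (shared->count > 0);
        if (ok)
            shared->count--;
        SpinLockRelease(&shared->mutex);
        return ok;
    }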

            regards, tom lane



Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Thu, Jan 4, 2024 at 10:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not a fan of adding overhead to such a performance-critical
> thing as spinlocks in order to catch coding errors that are easily
> detectable statically.  IMV the correct usage of spinlocks is that
> they should only be held across *short, straight line* code segments.
> We should be making an effort to ban coding patterns like
> "return with spinlock still held", because they're just too prone
> to errors similar to this one.  Note that trying to take another
> lock is far from the only bad thing that can happen if you're
> not very conservative about what code can execute with a spinlock
> held.

I agree that we don't want to add overhead, and also about how
spinlocks should be used, but I dispute "easily detectable
statically." I mean, if you or I look at some code that uses a
spinlock, we'll know whether the pattern that you mention is being
followed or not, modulo differences of opinion in debatable cases. But
you and I cannot be there to look at all the code all the time. If we
had a static checking tool that was run as part of every build, or in
the buildfarm, or by cfbot, or somewhere else that raised the alarm if
this rule was violated, then we could claim to be effectively
enforcing this rule. But with 20-30 active committers and ~100 active
developers at any given point in time, any system that relies on every
relevant person knowing all the rules and remembering to enforce them
on every commit is bound to be less than 100% effective. Some people
won't know what the rule is, some people will decide that their
particular situation is Very Special, some people will just forget to
check for violations, and some people will check for violations but
miss something.

I think the question we should be asking here is what the purpose of
the PANIC is. I can think of two possible purposes. It could be either
(a) an attempt to prevent real-world harm by turning database hangs
into database panics, so that at least the system will restart and get
moving again instead of sitting there stuck for all eternity or (b) an
attempt to punish people for writing bad code by turning coding rule
violations into panics on production systems. If it's (a), that's
defensible, though we can still ask whether it does more harm than
good. If it's (b), that's not a good way of handling that problem,
because (b1) it affects production builds and not just development
builds, (b2) many coding rule violations are vanishingly unlikely to
trigger that PANIC in practice, and (b3) if the PANIC does fire, it
gives you basically zero help in figuring out where the actual problem
is. The PostgreSQL code base is way too big for "ERROR: you screwed
up" to be an acceptable diagnostic.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Jan 4, 2024 at 10:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> We should be making an effort to ban coding patterns like
>> "return with spinlock still held", because they're just too prone
>> to errors similar to this one.

> I agree that we don't want to add overhead, and also about how
> spinlocks should be used, but I dispute "easily detectable
> statically." I mean, if you or I look at some code that uses a
> spinlock, we'll know whether the pattern that you mention is being
> followed or not, modulo differences of opinion in debatable cases. But
> you and I cannot be there to look at all the code all the time. If we
> had a static checking tool that was run as part of every build, or in
> the buildfarm, or by cfbot, or somewhere else that raised the alarm if
> this rule was violated, then we could claim to be effectively
> enforcing this rule.

I was indeed suggesting that maybe we could find a way to detect
such things automatically.  While I've not been paying close
attention, I recall there's been some discussions of using LLVM/clang
infrastructure for customized static analysis, so maybe it'd be
possible without an undue amount of effort.

> I think the question we should be asking here is what the purpose of
> the PANIC is. I can think of two possible purposes. It could be either
> (a) an attempt to prevent real-world harm by turning database hangs
> into database panics, so that at least the system will restart and get
> moving again instead of sitting there stuck for all eternity or (b) an
> attempt to punish people for writing bad code by turning coding rule
> violations into panics on production systems.

I believe it's (a).  No matter what the reason for a stuck spinlock
is, the only reliable method of getting out of the problem is to
blow things up and start over.  The patch proposed at the top of this
thread would leave the system unable to recover on its own, with the
only recourse being for the DBA to manually force a crash/restart ...
once she figured out that that was necessary, which might take a long
while if the only external evidence is an occasional WARNING that
might not even be making it to the postmaster log.  How's that better?

> ... (b3) if the PANIC does fire, it
> gives you basically zero help in figuring out where the actual problem
> is. The PostgreSQL code base is way too big for "ERROR: you screwed
> up" to be an acceptable diagnostic.

Ideally I agree with the latter, but that doesn't mean that doing
better is easy or even possible.  (The proposed patch certainly does
nothing to help diagnose such issues.)  As for the former point,
panicking here at least offers the chance of getting a stack trace,
which might help a developer find the problem.

            regards, tom lane



Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Thu, Jan 4, 2024 at 11:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I believe it's (a).  No matter what the reason for a stuck spinlock
> is, the only reliable method of getting out of the problem is to
> blow things up and start over.  The patch proposed at the top of this
> thread would leave the system unable to recover on its own, with the
> only recourse being for the DBA to manually force a crash/restart ...
> once she figured out that that was necessary, which might take a long
> while if the only external evidence is an occasional WARNING that
> might not even be making it to the postmaster log.  How's that better?

It's a fair question. I think you're correct if we assume that
everyone's following the coding rule ... at least assuming that the
target system isn't too insanely slow, and I've seen some pretty
crazily overloaded machines. But if the coding rule is not being
followed, then "the only reliable method of getting out of the problem
is to blow things up and start over" becomes a dubious conclusion.

Also, I wonder if many or even all uses of spinlocks ought to be
replaced with either LWLocks or atomics. An LWLock might be slightly
slower when contention is low, but it scales better when contention is
high, displays a wait event so that you can see that you have
contention if you do, and doesn't PANIC the system if the contention
gets too bad. And an atomic will just be faster, in the cases where
it's adequate.

The trouble with trying to do a replacement is that some of the
spinlock-using code is ancient and quite hairy. info_lck in particular
looks like a hot mess -- it's used in complex ways and in performance
critical paths, with terrifying comments like this:

 * To read XLogCtl->LogwrtResult, you must hold either info_lck or
 * WALWriteLock.  To update it, you need to hold both locks.  The point of
 * this arrangement is that the value can be examined by code that already
 * holds WALWriteLock without needing to grab info_lck as well.  In addition
 * to the shared variable, each backend has a private copy of LogwrtResult,
 * which is updated when convenient.
 *
 * The request bookkeeping is simpler: there is a shared XLogCtl->LogwrtRqst
 * (protected by info_lck), but we don't need to cache any copies of it.
 *
 * info_lck is only held long enough to read/update the protected variables,
 * so it's a plain spinlock.  The other locks are held longer (potentially
 * over I/O operations), so we use LWLocks for them.  These locks are:

But info_lck was introduced in 1999 and this scheme was introduced in
2012, and a lot has changed since then. Whatever benchmarking was done
to validate this locking regime is probably obsolete at this point.
Back then, LWLocks were built on top of spinlocks, and were, I
believe, a lot slower than they are now. Plus CPU performance
characteristics have changed a lot. So who really knows if the way
we're doing things here makes any sense at all these days? But one
doesn't want to do a naive replacement and pessimize things, either.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andres Freund
Date:
Hi,

On 2024-01-04 12:04:07 -0500, Robert Haas wrote:
> On Thu, Jan 4, 2024 at 11:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I believe it's (a).  No matter what the reason for a stuck spinlock
> > is, the only reliable method of getting out of the problem is to
> > blow things up and start over.  The patch proposed at the top of this
> > thread would leave the system unable to recover on its own, with the
> > only recourse being for the DBA to manually force a crash/restart ...
> > once she figured out that that was necessary, which might take a long
> > while if the only external evidence is an occasional WARNING that
> > might not even be making it to the postmaster log.  How's that better?
>
> It's a fair question. I think you're correct if we assume that
> everyone's following the coding rule ... at least assuming that the
> target system isn't too insanely slow, and I've seen some pretty
> crazily overloaded machines. But if the coding rule is not being
> followed, then "the only reliable method of getting out of the problem
> is to blow things up and start over" becomes a dubious conclusion.

If the coding rule isn't being followed, a crash restart is the least of one's
problems...  But that doesn't mean we shouldn't add infrastructure to make it
easier to detect violations of the spinlock rules - we've had lots of buglets
around this over the years ourselves, so we hardly can expect extension
authors to get this right. Particularly because we don't even document the
rules well, afair.

I think we should add cassert-only infrastructure tracking whether we
currently hold spinlocks, are in a signal handler and perhaps a few other
states. That'd allow us to add assertions like:

- no memory allocations when holding spinlocks or in signal handlers
- no lwlocks while holding spinlocks
- no lwlocks or spinlocks while in signal handlers
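
E.g. with some cassert-only state roughly along these lines (names are
invented for illustration, not a concrete proposal):

    #ifdef USE_ASSERT_CHECKING
    extern int  SpinLocksHeld;      /* maintained by the spinlock macros */
    extern bool InSignalHandler;    /* set/cleared on handler entry/exit */

    /* called at the top of palloc() and friends */
    static inline void
    AssertAllocOK(void)
    {
        Assert(SpinLocksHeld == 0);
        Assert(!InSignalHandler);
    }

    /* called at the top of LWLockAcquire() */
    static inline void
    AssertLWLockAcquireOK(void)
    {
        Assert(SpinLocksHeld == 0);
        Assert(!InSignalHandler);
    }

    /* called at the top of SpinLockAcquire() */
    static inline void
    AssertSpinLockAcquireOK(void)
    {
        Assert(!InSignalHandler);
    }
    #endif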



> Also, I wonder if many or even all uses of spinlocks uses ought to be
> replaced with either LWLocks or atomics. An LWLock might be slightly
> slower when contention is low, but it scales better when contention is
> high, displays a wait event so that you can see that you have
> contention if you do, and doesn't PANIC the system if the contention
> gets too bad. And an atomic will just be faster, in the cases where
> it's adequate.

I tried to replace all - unfortunately the results were not great. The problem
isn't primarily the lack of spinning (although it might be worth adding that
to lwlocks) or the cost of error recovery, the problem is that a reader-writer
lock are inherently more expensive than simpler locks that don't have multiple
levels.

One example of such increased overhead is that on x86 an lwlock unlock has to
be an atomic operation (to maintain the lock count), whereas a spinlock
unlock can just be a write + compiler barrier. Unfortunately the added atomic
operation turns out to matter in some performance critical cases like the
insertpos_lck.
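
Concretely, the difference is roughly this (an x86 sketch using GCC
builtins, not the actual s_lock.h/lwlock.c code):

    /* spinlock release: on x86 a compiler barrier plus a plain store is
     * enough, because stores are not reordered with earlier operations */
    static inline void
    spin_unlock(volatile int *lock)
    {
        __asm__ __volatile__("" : : : "memory");
        *lock = 0;
    }

    /* lwlock-style release: the shared state word (lock count, mode bits,
     * wait flags) must be updated atomically, i.e. a locked RMW operation */
    static inline void
    rwlock_unlock(volatile unsigned int *state)
    {
        __sync_fetch_and_sub(state, 1);
    }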

I think we ought to split lwlocks into reader/writer and simpler mutex. The
simpler mutex still will be slower than s_lock in some relevant cases,
e.g. due to the error recovery handling, but it'd be "local" overhead, rather
than something affecting scalability.



FWIW, these days spinlocks do report a wait event when in perform_spin_delay()
- albeit without detail about which lock is being held.

Greetings,

Andres Freund



Re: the s_lock_stuck on perform_spin_delay

From
Andres Freund
Date:
Hi,

On 2024-01-04 14:59:06 +0800, Andy Fan wrote:
> My question is if someone doesn't obey the rule by mistake (everyone
> can make mistake), shall we PANIC on a production environment? IMO I
> think it can be a WARNING on a production environment and be a stuck
> when 'ifdef USE_ASSERT_CHECKING'.
>
> [...]
>
> I notice this issue actually because of the patch "Cache relation
> sizes?" from Thomas Munro [1], where the latest patch[2] still have the
> following code.
> +        sr = smgr_alloc_sr();  <-- HERE a spin lock is hold
> +
> +        /* Upgrade to exclusive lock so we can create a mapping. */
> +        LWLockAcquire(mapping_lock, LW_EXCLUSIVE); <-- HERE a complex
>   operation is needed. it may take a long time.
>
> Our internal testing system found more misuses on our own PG version.

> I think a experienced engineer like Thomas can make this mistake and the
> patch was reviewed by 3 peoples, the bug is still there. It is not easy
> to say just don't do it.

I don't follow this argument - just ignoring the problem, which emitting a
WARNING basically is, doesn't reduce the impact of the bug, it *increases* the
impact, because now the system will not recover from the bug without explicit
operator intervention.  During that time the system might be essentially
unresponsive, because all backends end up contending for some spinlock, which
makes investigating such issues very hard.


I think we should add infrastructure to detect bugs like this during
development, but not PANICing when this happens in production seems completely
non-viable.

Greetings,

Andres Freund



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi,

Andres Freund <andres@anarazel.de> writes:

>
> On 2024-01-04 14:59:06 +0800, Andy Fan wrote:
>> My question is if someone doesn't obey the rule by mistake (everyone
>> can make mistake), shall we PANIC on a production environment? IMO I
>> think it can be a WARNING on a production environment and be a stuck
>> when 'ifdef USE_ASSERT_CHECKING'.
>>
>> [...]
>>
>> I notice this issue actually because of the patch "Cache relation
>> sizes?" from Thomas Munro [1], where the latest patch[2] still have the
>> following code.
>> +        sr = smgr_alloc_sr();  <-- HERE a spin lock is hold
>> +
>> +        /* Upgrade to exclusive lock so we can create a mapping. */
>> +        LWLockAcquire(mapping_lock, LW_EXCLUSIVE); <-- HERE a complex
>>   operation is needed. it may take a long time.
>>
>> Our internal testing system found more misuses on our own PG version.
>
>> I think a experienced engineer like Thomas can make this mistake and the
>> patch was reviewed by 3 peoples, the bug is still there. It is not easy
>> to say just don't do it.
>
> I don't follow this argument - just ignoring the problem,

I agree with you, but I feel you ignored my post at [1], where I said that
the known issue should be fixed at the first opportunity.

> which emitting a
> WARNING basically is, doesn't reduce the impact of the bug, it *increases* the
> impact, because now the system will not recover from the bug without explicit
> operator intervention.  During that time the system might be essentially
> unresponsive, because all backends end up contending for some spinlock, which
> makes investigating such issues very hard.

Actually, they are doing pg_usleep() most of the time.

Besides what Robert said, one more reason to question the PANIC is that a
PANIC can't always make the system recover faster, because: a) on most
systems, a PANIC produces a core dump, which takes time and space; b) after
the restart, all the caches like relcache, plancache, and fd cache need to
be rebuilt; c) customers need to handle the failure well, or else they will
be hurt *more often*. All of these cause slowness as well.

>
> I think we should add infrastructure to detect bugs like this during
> development,

Commit 2 in [1] does something like this. As for the details, I missed the
check for the memory allocation case you suggested in [2], but I checked
heavyweight locks as well; the others should be the same, IIUC.

> but not PANICing when this happens in production seems completely
> non-viable.
>

I'm not sure what *this* exactly means. If it means the bug in Thomas's
patch, I absolutely agree with you (since it is a known bug and it should be
fixed). If it means the general *unknown* case, that's what we talked about
above.

I also agree that an LLVM-based static checker would ideally be pretty
good; it just requires a bigger knowledge base and ongoing maintenance
effort. I am willing to have a try shortly.

[1] https://www.postgresql.org/message-id/871qaxp3ly.fsf%40163.com
[2]
https://www.postgresql.org/message-id/20240104225403.dgmbbfffmm3srpgq%40awork3.anarazel.de

-- 
Best Regards
Andy Fan




Re: the s_lock_stuck on perform_spin_delay

From
Andres Freund
Date:
Hi,

On 2024-01-04 17:03:18 -0600, Jim Nasby wrote:
> On 1/4/24 10:33 AM, Tom Lane wrote:
> 
>     Robert Haas <robertmhaas@gmail.com> writes:
> 
>         On Thu, Jan 4, 2024 at 10:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> 
>             We should be making an effort to ban coding patterns like
>             "return with spinlock still held", because they're just too prone
>             to errors similar to this one.
> 
>         I agree that we don't want to add overhead, and also about how
>         spinlocks should be used, but I dispute "easily detectable
>         statically." I mean, if you or I look at some code that uses a
>         spinlock, we'll know whether the pattern that you mention is being
>         followed or not, modulo differences of opinion in debatable cases. But
>         you and I cannot be there to look at all the code all the time. If we
>         had a static checking tool that was run as part of every build, or in
>         the buildfarm, or by cfbot, or somewhere else that raised the alarm if
>         this rule was violated, then we could claim to be effectively
>         enforcing this rule.
> 
>     I was indeed suggesting that maybe we could find a way to detect
>     such things automatically.  While I've not been paying close
>     attention, I recall there's been some discussions of using LLVM/clang
>     infrastructure for customized static analysis, so maybe it'd be
>     possible without an undue amount of effort.

I played around with this a while back. One reference with a link to a
playground to experiment with attributes:
https://www.postgresql.org/message-id/20200616233105.sm5bvodo6unigno7%40alap3.anarazel.de

Unfortunately clang's thread safety analysis doesn't handle conditionally
acquired locks, which made it far less promising than I initially thought.

I think there might be some other approaches, but they will all suffer from
not understanding "danger" encountered indirectly, via function calls doing
dangerous things. Which we would like to exclude, but I don't think that's
trivial either.


> FWIW, the lackey[1] tool in Valgrind is able to do some kinds of instruction
> counting, so it might be possible to measure how many instructions are actually
> being executed while holding a spinlock. Might be easier than code analysis.

I don't think that's particularly promising. Lackey is *slow*. And it requires
actually reaching problematic states. Consider e.g. the case that was reported
upthread, an lwlock acquired within a spinlock protected section - most of the
time that's not going to result in a lot of cycles, because the lwlock is
free.

Greetings,

Andres Freund



Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Thu, Jan 4, 2024 at 6:06 PM Andres Freund <andres@anarazel.de> wrote:
> I think we should add infrastructure to detect bugs like this during
> development, but not PANICing when this happens in production seems completely
> non-viable.

I mean +1 for the infrastructure, but "completely non-viable"? Why?

I've only very rarely seen this PANIC occur, and in the few cases
where I've seen it, it was entirely unclear that the problem was due
to a bug where somebody failed to release a spinlock. It seemed more
likely that the machine was just not really functioning, and the PANIC
was a symptom of processes not getting scheduled rather than a PG bug.
And every time I tell a user that they might need to use a debugger
to, say, set VacuumCostActive = false, or to get a backtrace, or any
other reason, I have to tell them to make sure to detach the debugger
in under 60 seconds, because in the unlikely event that they attach
while the process is holding a spinlock, failure to detach in under 60
seconds will take their production system down for no reason. Now, if
you're about to say that people shouldn't need to use a debugger on
their production instance, I entirely agree ... but in the world I
inhabit, that's often the only way to solve a customer problem, and it
probably will continue to be until we have much better ways of getting
backtraces without using a debugger than is currently the case.

Have you seen real cases where this PANIC prevents a hangup? If yes,
that PANIC traced back to a bug in PostgreSQL? And why didn't the user
just keep hitting the same bug over and PANICing in an endless loop?

I feel like this is one of those things that has just been this way
forever and we don't question it because it's become an article of
faith that it's something we have to have. But I have a very hard time
explaining why it's even a net positive, let alone the unquestionable
good that you seem to think.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andres Freund
Date:
Hi,

On 2024-01-05 08:51:53 -0500, Robert Haas wrote:
> On Thu, Jan 4, 2024 at 6:06 PM Andres Freund <andres@anarazel.de> wrote:
> > I think we should add infrastructure to detect bugs like this during
> > development, but not PANICing when this happens in production seems completely
> > non-viable.
> 
> I mean +1 for the infrastructure, but "completely non-viable"? Why?
> 
> I've only very rarely seen this PANIC occur, and in the few cases
> where I've seen it, it was entirely unclear that the problem was due
> to a bug where somebody failed to release a spinlock.

I see it fairly regularly. Including finding several related bugs that lead to
stuck systems last year (signal handlers are a menace).


> It seemed more likely that the machine was just not really functioning, and
> the PANIC was a symptom of processes not getting scheduled rather than a PG
> bug.

If processes don't get scheduled for that long a crash-restart doesn't seem
that bad anymore :)


> And every time I tell a user that they might need to use a debugger to, say,
> set VacuumCostActive = false, or to get a backtrace, or any other reason, I
> have to tell them to make sure to detach the debugger in under 60 seconds,
> because in the unlikely event that they attach while the process is holding
> a spinlock, failure to detach in under 60 seconds will take their production
> system down for no reason.

Hm - isn't the stuck lock timeout more like 900s (MAX_DELAY_USEC * NUM_DELAYS
= 1000s, but we start at a lower delay)?  One issue with the code as-is is
that interrupted sleeps count towards the timeout, despite possibly
sleeping much shorter. We should probably fix that, and also report the time
the lock was stuck for in s_lock_stuck().
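
Something along these lines might address that (start_time and
STUCK_TIMEOUT_MS are hypothetical additions to SpinDelayStatus, assumed to
be zero-initialized in init_spin_delay(); the instr_time macros are the
existing portable timer API):

    if (++(status->spins) >= spins_per_delay)
    {
        instr_time  now;

        INSTR_TIME_SET_CURRENT(now);
        if (INSTR_TIME_IS_ZERO(status->start_time))
            status->start_time = now;   /* first delay: remember when */
        else
        {
            INSTR_TIME_SUBTRACT(now, status->start_time);
            if (INSTR_TIME_GET_MILLISEC(now) >= STUCK_TIMEOUT_MS)
                s_lock_stuck(status->file, status->line, status->func);
        }

        pg_usleep(status->cur_delay);
        /* ... existing back-off logic ... */
    }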


> Now, if you're about to say that people shouldn't need to use a debugger on
> their production instance, I entirely agree ... but in the world I inhabit,
> that's often the only way to solve a customer problem, and it probably will
> continue to be until we have much better ways of getting backtraces without
> using a debugger than is currently the case.
> 
> Have you seen real cases where this PANIC prevents a hangup? If yes,
> that PANIC traced back to a bug in PostgreSQL? And why didn't the user
> just keep hitting the same bug over and PANICing in an endless loop?

Many, as hinted above. Some bugs in postgres, more bugs in extensions. IME
these bugs aren't hit commonly, so a crash-restart at least allows to hobble
along. The big issue with not crash-restarting is that often the system ends
up inaccessible, which makes it very hard to investigate the issue.


> I feel like this is one of those things that has just been this way
> forever and we don't question it because it's become an article of
> faith that it's something we have to have. But I have a very hard time
> explaining why it's even a net positive, let alone the unquestionable
> good that you seem to think.

I don't think it's an unquestionable good, I just think the alternative of
just endlessly spinning is way worse.

Greetings,

Andres Freund



Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Fri, Jan 5, 2024 at 2:11 PM Andres Freund <andres@anarazel.de> wrote:
> I see it fairly regularly. Including finding several related bugs that lead to
> stuck systems last year (signal handlers are a menace).

In that case, I think this proposal is dead. I can't personally
testify to this code being a force for good, but it sounds like you
can. So be it!

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andres Freund
Date:
Hi,

On 2024-01-05 10:20:39 +0800, Andy Fan wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2024-01-04 14:59:06 +0800, Andy Fan wrote:
> >> My question is if someone doesn't obey the rule by mistake (everyone
> >> can make mistake), shall we PANIC on a production environment? IMO I
> >> think it can be a WARNING on a production environment and be a stuck
> >> when 'ifdef USE_ASSERT_CHECKING'.
> >>
> >> [...]
> >>
> >> I notice this issue actually because of the patch "Cache relation
> >> sizes?" from Thomas Munro [1], where the latest patch[2] still have the
> >> following code.
> >> +        sr = smgr_alloc_sr();  <-- HERE a spin lock is hold
> >> +
> >> +        /* Upgrade to exclusive lock so we can create a mapping. */
> >> +        LWLockAcquire(mapping_lock, LW_EXCLUSIVE); <-- HERE a complex
> >>   operation is needed. it may take a long time.
> >>
> >> Our internal testing system found more misuses on our own PG version.
> >
> >> I think a experienced engineer like Thomas can make this mistake and the
> >> patch was reviewed by 3 peoples, the bug is still there. It is not easy
> >> to say just don't do it.
> >
> > I don't follow this argument - just ignoring the problem,
>
> I agree with you but I'm feeling you ignored my post at [1], where I
> said for the known issue, it should be fixed at the first chance.

With "ignoring the problem" I was referencing emitting a WARNING instead of
crash-restart.

IME stuck spinlocks are caused by issues like not releasing a spinlock,
possibly due to returning early due to an error or such, having lock-nesting
issues leading to deadlocks, acquiring spinlocks or lwlocks in signal
handlers, blocking in signal handlers while holding a spinlock outside of the
signal handlers and many variations of those. The common theme of these causes
is that they don't resolve after some time. The only way out of the situation
is to crash-restart, either "automatically" or by operator intervention.


> > which emitting a
> > WARNING basically is, doesn't reduce the impact of the bug, it *increases* the
> > impact, because now the system will not recover from the bug without explicit
> > operator intervention.  During that time the system might be essentially
> > unresponsive, because all backends end up contending for some spinlock, which
> > makes investigating such issues very hard.
>
> Acutally they are doing pg_usleep at the most time.

Sure - but that changes nothing about the problem. The concern isn't CPU
usage, the concern is that there's often no possible forward progress. To take
a recent-ish production issue I looked at, a buggy signal handler led to a
backend sleeping while holding a spinlock. Soon after the entire system got
stuck, because they also acquired the spinlock. The person initially
investigating the issue at first contacted me because they couldn't even log
into the system, because connection establishment also acquired the spinlock
(I'm not sure anymore which spinlock it was, possibly xlog.c's info_lck?).


> > but not PANICing when this happens in production seems completely
> > non-viable.
> >
>
> Not sure what does *this* exactly means. If it means the bug in Thomas's
> patch, I absoluately agree with you(since it is a known bug and it
> should be fixed). If it means the general *unknown* case, it's something
> we talked above.

I mean that I think that not PANICing anymore would be a seriously bad idea
and cause way more problems than the PANIC.

Greetings,

Andres Freund



Re: the s_lock_stuck on perform_spin_delay

From
Andres Freund
Date:
Hi,

On 2024-01-05 14:19:23 -0500, Robert Haas wrote:
> On Fri, Jan 5, 2024 at 2:11 PM Andres Freund <andres@anarazel.de> wrote:
> > I see it fairly regularly. Including finding several related bugs that lead to
> > stuck systems last year (signal handlers are a menace).
> 
> In that case, I think this proposal is dead. I can't personally
> testify to this code being a force for good, but it sounds like you
> can. So be it!

I think the proposal to make it a WARNING shouldn't go anywhere, but I think
there are several improvements that could come out of this discussion:

- assertion checks against doing dangerous stuff
- compile time help for detecting bad stuff without hitting it at runtime
- make the stuck lock message more informative, e.g. by including the length
  of time the lock was stuck for
- make sure that interrupts can't trigger the stuck lock much quicker, which
  afaict can happen today

Greetings,

Andres Freund



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi,


> On 2024-01-05 14:19:23 -0500, Robert Haas wrote:
>> On Fri, Jan 5, 2024 at 2:11 PM Andres Freund <andres@anarazel.de> wrote:
>> > I see it fairly regularly. Including finding several related bugs that lead to
>> > stuck systems last year (signal handlers are a menace).
>>
>> In that case, I think this proposal is dead. I can't personally
>> testify to this code being a force for good, but it sounds like you
>> can. So be it!
>
> I think the proposal to make it a WARNING shouldn't go anywhere,

OK, I give up the WARNING method as well.

> but I think
> there are several improvements that could come out of this discussion:
>
> - assertion checks against doing dangerous stuff
> - make the stuck lock message more informative, e.g. by including the length
>   of time the lock was stuck for

Could you check the attached patches to see if they are something like what
you have in mind?

commit e264da3050285cffd4885637ee97b2326d2f3938 SHOULD **NOT** BE COMMITTED.
Author: yizhi.fzh <yizhi.fzh@alibaba-inc.com>
Date:   Sun Jan 7 15:06:14 2024 +0800

    simple code to prove previously commit works.

commit 80cf987d1abe2cdae195bd5eea520e28142885b4
Author: yizhi.fzh <yizhi.fzh@alibaba-inc.com>
Date:   Thu Jan 4 22:19:50 2024 +0800

    Detect more misuse of spin lock automatically

    Spinlocks are intended for *very* short-term locks, but it is possible
    to misuse them in many cases, e.g. acquiring another LWLock or regular
    lock, or allocating memory, while holding one. With this patch, all such
    cases will be detected automatically in an assert-enabled build.

    Handling signals should be avoided while holding a spin lock, because it
    is nearly impossible to release the spin lock correctly if that happens.


Luckily, after applying the patch there is no failure when running 'make
check-world'.

> - make sure that interrupts can't trigger the stuck lock much quicker, which
>   afaict can happen today

I can't follow this; would you mind explaining it a bit more?

--
Best Regards
Andy Fan


Attachments

Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Дата:
Hi!

>
> I think we should add cassert-only infrastructure tracking whether we
> currently hold spinlocks, are in a signal handler and perhaps a few other
> states. That'd allow us to add assertions like:
..
> - no lwlocks or ... while in signal handlers

I *wished* LWLocks would *not* be held while in signal handlers, since that
suggested a direction for a low-frequency internal bug where a backend
acquires an LWLock it has already acquired before. However, after reading
more of the documentation and code, I am thinking this should not be a
problem.

per: src/backend/storage/lmgr/README

"""
LWLock manager will automatically release held LWLocks during elog()
recovery, so it is safe to raise an error while holding LWLocks.
"""

The code shows that after we acquire an LWLock, that information is added
to a global variable named held_lwlocks, and whenever we want to release
all of them, we can just call LWLockReleaseAll. LWLockReleaseAll is called
in AbortTransaction, AbortSubTransaction, ProcKill, AuxiliaryProcKill and
so on; the code matches what the README says. So suppose we have code like
this:

LWLockAcquire(...);
CHECK_FOR_INTERRUPTS();
LWLockRelease();

Even if we get an ERROR/FATAL in CHECK_FOR_INTERRUPTS, I think the LWLock is
supposed to be released because of the above. Am I missing anything?

-- 
Best Regards
Andy Fan




Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Sun, Jan 7, 2024 at 9:52 PM Andy Fan <zhihuifan1213@163.com> wrote:
> > I think we should add cassert-only infrastructure tracking whether we
> > currently hold spinlocks, are in a signal handler and perhaps a few other
> > states. That'd allow us to add assertions like:
> ..
> > - no lwlocks or ... while in signal handlers
>
> I *wish* lwlocks should *not* be held while in signal handlers since it
> inspired me for a direction of a low-frequency internal bug where a
> backend acuqire a LWLock when it has acuqired it before. However when I
> read more document and code, I am thinking this should not be a
> problem.

It's not safe to acquire an LWLock in a signal handler unless we know
that the code that the signal handler is interrupting can't already be
doing that. Consider this code from LWLockAcquire:

        /* Add lock to list of locks held by this backend */
        held_lwlocks[num_held_lwlocks].lock = lock;
        held_lwlocks[num_held_lwlocks++].mode = mode;

Imagine that we're executing this code and we get to the point where
we've set held_lwlocks[num_held_lwlocks].lock = lock and
held_lwlock[num_held_lwlocks].mode = mode, but we haven't yet
incremented num_held_lwlocks. Then a signal arrives and we jump into
the signal handler, which also calls LWLockAcquire(). Hopefully you
can see that the new lock and mode will be written over the old one,
and now held_lwlocks[] is corrupted. Oops.

Because the PostgreSQL code relies extensively on global variables, it
has problems like this all over the place. Another example is the
error-reporting infrastructure. ereport(yadda yadda...) fills out a
bunch of global variables before really going and doing anything. If a
signal handler were to fire and start another ereport(...), chaos
would ensue, for similar reasons as here.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi,
Robert Haas <robertmhaas@gmail.com> writes:

> On Sun, Jan 7, 2024 at 9:52 PM Andy Fan <zhihuifan1213@163.com> wrote:
>> > I think we should add cassert-only infrastructure tracking whether we
>> > currently hold spinlocks, are in a signal handler and perhaps a few other
>> > states. That'd allow us to add assertions like:
>> ..
>> > - no lwlocks or ... while in signal handlers
>>
>> I *wish* lwlocks should *not* be held while in signal handlers since it
>> inspired me for a direction of a low-frequency internal bug where a
>> backend acuqire a LWLock when it has acuqired it before. However when I
>> read more document and code, I am thinking this should not be a
>> problem.
>
> It's not safe to acquire an LWLock in a signal handler unless we know
> that the code that the signal handler is interrupting can't already be
> doing that. Consider this code from LWLockAcquire:

Thanks for the explanation! I can follow the situation you describe here;
I then realized I had asked a bad question because I didn't clarify which
"signal handlers" I was referring to, sorry about that!

In your statement, I guess you are talking about signal handlers in the
Linux sense. However, I *assumed* such handlers do pretty simple stuff,
like setting 'GlobalVariable = true'. If my assumption is right, I think
that case does not need to be taken care of. For example:

spin_or_lwlock_acquire();
... (a Linux signal handler may be invoked here, no matter what the ... code is)
spin_or_lwlock_release();

Since Linux signal handlers are pretty simple, control comes back to
'spin_or_lwlock_release()' anyway. (However, my assumption may be wrong;
thanks for highlighting this, it is helpful for me in debugging my internal
bug!)

The "signal handler" I was referring to is CHECK_FOR_INTERRUPTS. Based on
this, spin locks and LWLocks act pretty differently.

spin_lock_acquire();
CHECK_FOR_INTERRUPTS();
spin_lock_release();

Since CHECK_FOR_INTERRUPTS usually goes through the ERROR system, which
makes it hard to get back to 'spin_lock_release()', the spin lock leaks! So
CHECK_FOR_INTERRUPTS is the place where my patch Asserts that no *spin
lock* is held, and I understood Andres to be talking about the same thing.
(Of course I could add the "Assert no spin lock is held" check to every
Linux signal handler as well.)
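
i.e. something along these lines (SpinLockCount is the cassert-only counter
from my patch; the surrounding macro just mirrors the usual non-Windows
definition):

    #define CHECK_FOR_INTERRUPTS() \
    do { \
        Assert(SpinLockCount == 0); \
        if (InterruptPending) \
            ProcessInterrupts(); \
    } while (0)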

Based on the above, I asked my question in my previous post, where I am not
sure whether we should do the same ('Assert that no LWLock is held') for
*LWLocks* in CHECK_FOR_INTERRUPTS, since LWLocks can be released no matter
where CHECK_FOR_INTERRUPTS jumps to.

--
Best Regards
Andy Fan




Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Mon, Jan 8, 2024 at 9:40 PM Andy Fan <zhihuifan1213@163.com> wrote:
> The singler handler I was refering to is 'CHECK_FOR_INTERRUPTS', Based
> on this, spin_lock and lwlock are acted pretty differently.

CHECK_FOR_INTERRUPTS() is not a signal handler, and it's OK to acquire
and release spin locks or lwlocks there. We have had (and I think
still do have) cases where signal handlers do non-trivial work,
resulting in serious problems in some cases. A bunch of that stuff has
been rewritten to just set a flag and then let the calling code sort
it out, but not everything has been rewritten that way (I think) and
there's always a danger of future hackers introducing new problem
cases.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi,

Robert Haas <robertmhaas@gmail.com> writes:

> On Mon, Jan 8, 2024 at 9:40 PM Andy Fan <zhihuifan1213@163.com> wrote:
>> The singler handler I was refering to is 'CHECK_FOR_INTERRUPTS', Based
>> on this, spin_lock and lwlock are acted pretty differently.
>
> CHECK_FOR_INTERRUPTS() is not a signal handler,

Hmm, I knew this, but ... I think we don't actually have a big difference
in mind.

Since everyone agreed that we should add some infrastructure to detect
misuse of spinlocks, I want to know whether Andres or you plan to do some
code review. I don't expect this to happen very soon; I just want to make
sure we don't end up in a situation where each of you thinks the other will
do it, and in fact nobody does. A commitfest entry [1] has been added for
this.

There is test code in [2] showing the bad practice that is detected by this
patch.

[1] https://commitfest.postgresql.org/47/4768/
[2] https://www.postgresql.org/message-id/87le91obp7.fsf%40163.com.
--
Best Regards
Andy Fan


Attachments

Re: the s_lock_stuck on perform_spin_delay

From
Matthias van de Meent
Date:
On Wed, 10 Jan 2024 at 02:44, Andy Fan <zhihuifan1213@163.com> wrote:
> Hi,
>
> I want to know if Andres or you have plan
> to do some code review. I don't expect this would happen very soon, just
> want to make sure this will not happen that both of you think the other
> one will do, but actually none of them does it in fact. a commit fest
> [1] has been added for this.


> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -5419,6 +5419,7 @@ LockBufHdr(BufferDesc *desc)
>         perform_spin_delay(&delayStatus);
>     }
>     finish_spin_delay(&delayStatus);
> +    START_SPIN_LOCK();
>     return old_buf_state | BM_LOCKED;
> }

I think that we need to 'arm' the checks just before we lock the spin
lock, and 'disarm' the checks just after we unlock the spin lock,
rather than after and before, respectively. That way, we won't have a
chance of false negatives: with your current patch it is possible that
an interrupt fires between the acquisition of the lock and the code in
START_SPIN_LOCK() marking the thread as holding a spin lock, which
would cause any check in that signal handler to incorrectly read that
we don't hold any spin locks.
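
In other words, roughly (using START_SPIN_LOCK() from your patch and a
hypothetical END_SPIN_LOCK() counterpart, for some shared struct with an
slock_t mutex field):

    START_SPIN_LOCK();              /* arm before the lock is visible as taken */
    SpinLockAcquire(&ptr->mutex);
    /* ... short critical section ... */
    SpinLockRelease(&ptr->mutex);
    END_SPIN_LOCK();                /* disarm only after the release */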

> +++ b/src/backend/storage/lmgr/lock.c
> @@ -776,6 +776,8 @@ LockAcquireExtended(const LOCKTAG *locktag,
>     bool        found_conflict;
>     bool        log_lock = false;
>
> +    Assert(SpinLockCount == 0);
> +

I'm not 100% sure on the policy of this, but theoretically you could
use LockAcquireExtended(dontWait=true) while holding a spin lock, as
that would not have an unknown duration. Then again, this function
also does elog/ereport, which would cause issues, still, so this code
may be the better option.

> +    elog(PANIC, "stuck spinlock detected at %s, %s:%d after waiting for %u ms",
> +         func, file, line, delay_ms);

pg_usleep doesn't actually guarantee that we'll wait for exactly that
duration; depending on signals received while spinning and/or OS
scheduling decisions it may be off by orders of magnitude.

> +++ b/src/common/scram-common.c

This is unrelated to the main patchset.

> +++ b/src/include/storage/spin.h

Minor: I think these changes could better be included in miscadmin, or
at least the definition for SpinLockCount should be moved there: The
spin lock system itself shouldn't be needed in places where we need to
make sure that we don't hold any spinlocks, and miscadmin.h already
holds things related to "System interrupt and critical section
handling", which seems quite related.

Kind regards,

Matthias van de Meent



Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
Thanks for jumping in with a review, Matthias!

On Wed, Jan 10, 2024 at 8:03 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> I'm not 100% sure on the policy of this, but theoretically you could
> use LockAquireExtended(dontWait=true) while holding a spin lock, as
> that would not have an unknown duration. Then again, this function
> also does elog/ereport, which would cause issues, still, so this code
> may be the better option.

This is definitely not allowable, and anybody who is thinking about
doing it should replace the spinlock with an LWLock.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi Matthias,

Thanks for the review!

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:

> On Wed, 10 Jan 2024 at 02:44, Andy Fan <zhihuifan1213@163.com> wrote:
>> Hi,
>>
>> I want to know if Andres or you have plan
>> to do some code review. I don't expect this would happen very soon, just
>> want to make sure this will not happen that both of you think the other
>> one will do, but actually none of them does it in fact. a commit fest
>> [1] has been added for this.
>
>
>> +++ b/src/backend/storage/buffer/bufmgr.c
>> @@ -5419,6 +5419,7 @@ LockBufHdr(BufferDesc *desc)
>>         perform_spin_delay(&delayStatus);
>>     }
>>     finish_spin_delay(&delayStatus);
>> +    START_SPIN_LOCK();
>>     return old_buf_state | BM_LOCKED;
>> }
>
> I think that we need to 'arm' the checks just before we lock the spin
> lock, and 'disarm' the checks just after we unlock the spin lock,
> rather than after and before, respectively. That way, we won't have a
> chance of false negatives: with your current patch it is possible that
> an interrupt fires between the acquisition of the lock and the code in
> START_SPIN_LOCK() marking the thread as holding a spin lock, which
> would cause any check in that signal handler to incorrectly read that
> we don't hold any spin locks.

That's a good idea. Fixed in v2.

>
>> +++ b/src/backend/storage/lmgr/lock.c
>> @@ -776,6 +776,8 @@ LockAcquireExtended(const LOCKTAG *locktag,
>>     bool        found_conflict;
>>     bool        log_lock = false;
>>
>> +    Assert(SpinLockCount == 0);
>> +
>
> I'm not 100% sure on the policy of this, but theoretically you could
> use LockAquireExtended(dontWait=true) while holding a spin lock, as
> that would not have an unknown duration. Then again, this function
> also does elog/ereport, which would cause issues, still, so this code
> may be the better option.

I read this statement as "keep the current patch as it is", since "not
waiting" doesn't mean only a few dozen instructions in this case. Please
correct me if anything is wrong.

>
>> +    elog(PANIC, "stuck spinlock detected at %s, %s:%d after waiting for %u ms",
>> +         func, file, line, delay_ms);
>
> pg_usleep doesn't actually guarantee that we'll wait for exactly that
> duration; depending on signals received while spinning and/or OS
> scheduling decisions it may be off by orders of magnitude.

True, but I did this for two reasons: a) the other solution needs to call
the 'time' syscall twice, and I didn't want to pay that run-time cost; b)
the possibility of getting a signal during pg_usleep should be low, and
even if it happens, since the message is just for humans we don't need an
absolutely accurate number. What do you think?

>
>> +++ b/src/common/scram-common.c
>
> This is unrelated to the main patchset

Fixed in v2.

>
>> +++ b/src/include/storage/spin.h
>
> Minor: I think these changes could better be included in miscadmin, or
> at least the definition for SpinLockCount should be moved there: The
> spin lock system itself shouldn't be needed in places where we need to
> make sure that we don't hold any spinlocks, and miscadmin.h already
> holds things related to "System interrupt and critical section
> handling", which seems quite related.

Fixed in v2.

-- 
Best Regards
Andy Fan


Attachments

Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Robert Haas <robertmhaas@gmail.com> writes:

> Thanks for jumping in with a review, Matthias!

FWIW, Matthias was also the first one to make this proposal in this thread,
so thanks for that as well!

-- 
Best Regards
Andy Fan




Re: the s_lock_stuck on perform_spin_delay

От
Robert Haas
Дата:
On Wed, Jan 10, 2024 at 10:17 PM Andy Fan <zhihuifan1213@163.com> wrote:
> fixed in v2.

Timing the spinlock wait seems like a separate patch from the new sanity checks.

I suspect that the new sanity checks should only be armed in
assert-enabled builds.

I'm doubtful that this method of tracking the wait time will be
accurate. And I don't know that we can make it accurate with
reasonable overhead. But I don't think we can assume that the time we
tried to wait for and the time that we were actually descheduled are
the same.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Robert Haas <robertmhaas@gmail.com> writes:

> On Wed, Jan 10, 2024 at 10:17 PM Andy Fan <zhihuifan1213@163.com> wrote:
>> fixed in v2.
>
> Timing the spinlock wait seems like a separate patch from the new sanity checks.

Yes, a separate patch would be better, so I removed it from v4.

> I suspect that the new sanity checks should only be armed in
> assert-enabled builds.

There are two changes in v4: a) make sure the checking code is armed only in
assert-enabled builds; previously some counter++ happened even in
non-assert-enabled builds. b) Record the location where the spin lock was
acquired, so that whenever the Assert fails we know which spin lock it was.
In our internal testing, that helps a lot.

--
Best Regards
Andy Fan

Attachments

Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi Matthias / Robert:

Do you still have interest in making some progress on this topic?

> Robert Haas <robertmhaas@gmail.com> writes:
>
>> On Wed, Jan 10, 2024 at 10:17 PM Andy Fan <zhihuifan1213@163.com> wrote:
>>> fixed in v2.
>>
>> Timing the spinlock wait seems like a separate patch from the new sanity checks.
>
> Yes, a separate patch would be better, so removed it from v4.
>
>> I suspect that the new sanity checks should only be armed in
>> assert-enabled builds.
>
> There are 2 changes in v4. a). Make sure every code is only armed in
> assert-enabled builds. Previously there was some counter++ in non
> assert-enabled build. b). Record the location of spin lock so that
> whenever the Assert failure, we know which spin lock it is. In our
> internal testing, that helps a lot.

v5 attached to fix the linking issue on Windows.

--
Best Regards
Andy Fan


Attachments

Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Thu, Jan 18, 2024 at 7:56 AM Andy Fan <zhihuifan1213@163.com> wrote:
> Do you still have interest with making some progress on this topic?

Some, but it's definitely not my top priority. I wish I could give as
much attention as everyone would like to everyone's patches, but I
can't even come close.

I think that the stack that you set up in START_SPIN_LOCK() is crazy.
That would only be necessary if it were legal to acquire multiple
spinlocks at the same time, which it definitely isn't. Also, doing
memory allocation and deallocation here appears highly undesirable -
even if we did need to support multiple spinlocks, it would be better
to handle this using something like the approach we already use for
lwlocks, where there is a fixed size array and we blow up if it
overflows.

ASSERT_NO_SPIN_LOCK() looks odd, because I would expect it to be
spelled Assert(!AnySpinLockHeld()). But looking deeper, I see that it
doesn't internally Assert() but rather does something else. Maybe the
name needs some workshopping. SpinLockMustNotBeHeldHere()?
VerifyNoSpinLocksHeld()?

I think we should check that no spinlock is held in a few additional
places: the start of SpinLockAcquire(), and the start of errstart().

You added an #include to dynahash.c despite making no other changes to the file.

I don't know whether the choice to treat buffer header locks as
spinlocks is correct. It seems like it at least deserves a comment,
and possibly some discussion on this mailing list about whether that's
the right idea. I'm not sure that we have all the same restrictions
for buffer header locks as we do for spinlocks in general, but I'm
also not sure that we don't.

On a related note, the patch overall has 0 comments. I don't know that
it needs a lot, but 0 isn't many at all.

miscadmin.h doesn't seem like a good place for this. It's a
widely-included header file and these checks should be needed in
relatively few places; also, they're not really related to most of
what's in that file, IIRC. I also wonder why we're using macros
instead of static inline functions.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi Robert,

Thanks for your attention!

> On Thu, Jan 18, 2024 at 7:56 AM Andy Fan <zhihuifan1213@163.com> wrote:
>> Do you still have interest with making some progress on this topic?
>
> Some, but it's definitely not my top priority. I wish I could give as
> much attention as everyone would like to everyone's patches, but I
> can't even come close.

Your point is fair enough.

After reading your comments, I decided to discuss things a bit more before
sending the next version.

>
> I think that the stack that you set up in START_SPIN_LOCK() is crazy.
> That would only be necessary if it were legal to acquire multiple
> spinlocks at the same time, which it definitely isn't. Also, doing
> memory allocation and deallocation here appears highly undesirable -
> even if we did need to support multiple spinlocks, it would be better
> to handle this using something like the approach we already use for
> lwlocks, where there is a fixed size array and we blow up if it
> overflows.

I wanted to disallow acquiring multiple spinlocks at the same time in the
first version, but later I thought that was beyond the scope of this patch.
Now I prefer to disallow it. If there is no objection in the following days,
I will do this in the next version. After that, we don't need malloc at all.
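
As a sketch of what I mean (the identifier names here are placeholders, not
necessarily what the next version will use), a single static slot is then
enough, with nesting itself treated as a bug:

#ifdef USE_ASSERT_CHECKING
static const char *held_spin_lock_file = NULL;
static int	held_spin_lock_line = -1;

static inline void
record_spin_lock(const char *file, int line)
{
	Assert(held_spin_lock_file == NULL);	/* nesting would itself be a bug */
	held_spin_lock_file = file;
	held_spin_lock_line = line;
}

static inline void
forget_spin_lock(void)
{
	held_spin_lock_file = NULL;
	held_spin_lock_line = -1;
}
#endif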

>
> ASSERT_NO_SPIN_LOCK() looks odd, because I would expect it to be
> spelled Assert(!AnySpinLockHeld()). But looking deeper, I see that it
> doesn't internally Assert() but rather does something else. Maybe the
> name needs some workshopping. SpinLockMustNotBeHeldHere()?
> VerifyNoSpinLocksHeld()?

Yes, it is not an Assert since I want to provide more information about
where the SpinLock was held. Assert doesn't have that capability, but
elog(PANIC, ...) can report more information before the PANIC.

VerifyNoSpinLocksHeld looks much more professional than
ASSERT_NO_SPIN_LOCK; I will use it in the next version.


> I think we should check that no spinlock is held in a few additional
> places: the start of SpinLockAcquire(), and the start of errstart().

Agreed.
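
For errstart, what I have in mind is as simple as this (a sketch, not the
final diff):

bool
errstart(int elevel, const char *domain)
{
	/* elog/ereport must never be reached while a spinlock is held */
	VerifyNoSpinLocksHeld();

	/* ... existing body unchanged ... */
}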

> You added an #include to dynahash.c despite making no other changes to
> the file.

That's mainly because I put the code into miscadmin.h, and spin.h depends
on miscadmin.h through the macros.

>
> I don't know whether the choice to treat buffer header locks as
> spinlocks is correct. It seems like it at least deserves a comment,
> and possibly some discussion on this mailing list about whether that's
> the right idea. I'm not sure that we have all the same restrictions
> for buffer header locks as we do for spinlocks in general, but I'm
> also not sure that we don't.

LockBufHdr also uses the init_local_spin_delay / perform_spin_delay
infrastructure, so it has the same issue as ${subject}; its code is quite
similar to s_lock. Based on my current knowledge, I think we should add the
check there.

>
> On a related note, the patch overall has 0 comments. I don't know that
> it needs a lot, but 0 isn't many at all.

Hmm, I tried to write a good commit message, but the comments do need some
improvement; thanks for highlighting this!

>
> miscadmin.h doesn't seem like a good place for this. It's a
> widely-included header file and these checks should be needed in
> relatively few places; also, they're not really related to most of
> what's in that file, IIRC.

They were put into spin.h in v1 but later moved to miscadmin.h at [1].

> I also wonder why we're using macros instead of static inline
> functions.

START_SPIN_LOCK needs to be a macro since it uses __FILE__ and __LINE__ to
record where the SpinLock is held. The others are macros only for
consistency; I think they can be changed to inline functions, at least
VerifyNoSpinLocksHeld.
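
For example (a sketch; the helper and variable names here are illustrative),
only the location capture needs the preprocessor, and the rest can move into
a static inline:

static inline void
start_spin_lock_impl(const char *file, int line)
{
	VerifyNoSpinLocksHeld();
	last_spin_lock_file = file;
	last_spin_lock_lineno = line;
}

#define START_SPIN_LOCK() start_spin_lock_impl(__FILE__, __LINE__)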

[1]
https://www.postgresql.org/message-id/CAEze2WggP-2Dhocmdhp-LxBzic%3DMXRgGA_tmv1G_9n-PDt2MQg%40mail.gmail.com

--
Best Regards
Andy Fan




Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Thu, Jan 18, 2024 at 1:30 PM Andy Fan <zhihuifan1213@163.com> wrote:
> > You added an #include to dynahash.c despite making no other changes to
> > the file.
>
> That's mainly because I put the code into miscadmin.h and spin.h depends
> on miscadmin.h with MACROs.

That's not a good reason. Headers need to include headers on which
they depend; a .c file shouldn't be required to include one header
because it includes another.

> The LockBufHdr also used init_local_spin_delay / perform_spin_delay
> infrastruce and then it has the same issue like ${subject}, it is pretty
> like the code in s_lock; Based on my current knowledge, I think we
> should add the check there.

I'd like to hear from Andres, if possible. @Andres: Should these
sanity checks apply only to spin locks per se, or also to buffer
header locks?

> they were put into spin.h in v1 but later move to miscadmin.h at [1].
> [1]
> https://www.postgresql.org/message-id/CAEze2WggP-2Dhocmdhp-LxBzic%3DMXRgGA_tmv1G_9n-PDt2MQg%40mail.gmail.com

I'm not entirely sure what the right thing to do is here, and the
answer may depend on the previous question. But I disagree with
Matthias -- I don't think miscadmin.h can be the right answer
regardless.

> START_SPIN_LOCK need to be macro since it use __FILE__ and __LINE__ to
> note where the SpinLock is held. for others, just for consistent
> purpose. I think they can be changed to inline function, at least for
> VerifyNoSpinLocksHeld.

Good point about __FILE__ and __LINE__.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi,

Here is a summary of the open items; it would be great if Andres and
Matthias could have a look when they have time.

1. Shall we treat LockBufHdr as a spinlock usage?

>> The LockBufHdr also used init_local_spin_delay / perform_spin_delay
>> infrastruce and then it has the same issue like ${subject}, it is pretty
>> like the code in s_lock; Based on my current knowledge, I think we
>> should add the check there.
>
> I'd like to hear from Andres, if possible. @Andres: Should these
> sanity checks apply only to spin locks per se, or also to buffer
> header locks?
>

2. Where shall we put START/END_SPIN_LOCK()? Personally I'd prefer spin.h.
One of the reasons is that 'misc' usually makes me think of things that are
not well categorized, with many different pieces lumped together.

>> they were put into spin.h in v1 but later move to miscadmin.h at [1].
>> [1]
>> https://www.postgresql.org/message-id/CAEze2WggP-2Dhocmdhp-LxBzic%3DMXRgGA_tmv1G_9n-PDt2MQg%40mail.gmail.com
>
> I'm not entirely sure what the right thing to do is here, and the
> answer may depend on the previous question. But I disagree with
> Matthias -- I don't think miscadmin.h can be the right answer
> regardless.
>

-- 
Best Regards
Andy Fan




Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi,

v6 is attached, which addresses all the items Robert suggested except the
following two open items; they are handled differently.

>
> Here is the summary of the open-items, it would be great that Andres and
> Matthias have a look at this when they have time.
>
>>> The LockBufHdr also used init_local_spin_delay / perform_spin_delay
>>> infrastruce and then it has the same issue like ${subject}, it is pretty
>>> like the code in s_lock; Based on my current knowledge, I think we
>>> should add the check there.
>>
>> I'd like to hear from Andres, if possible. @Andres: Should these
>> sanity checks apply only to spin locks per se, or also to buffer
>> header locks?

v6 is split into two commits: one for the normal SpinLock and one for the
LockBufHdr lock.

commit 6276d2f66b0760053e3fdfe259971be3abba3c63
Author: yizhi.fzh <yizhi.fzh@alibaba-inc.com>
Date:   Fri Jan 19 13:52:07 2024 +0800

    Detect more misuse of spin lock automatically
    
    Spin lock are intended for *very* short-term locks, but it is possible
    to be misused in many cases. e.g. Acquiring another LWLocks or regular
    locks, memory allocation, errstart when holding a spin lock. this patch
    would detect such misuse automatically in a USE_ASSERT_CHECKING build.
    
    CHECK_FOR_INTERRUPTS should be avoided as well when holding a spin lock.
    Depends on what signals are left to handle, PG may raise error/fatal
    which would cause the code jump to some other places which is hardly to
    release the spin lock anyway.

commit 590a0c6f767f62f6c83289d55de99973bc7da417 (HEAD -> s_stuck_v3)
Author: yizhi.fzh <yizhi.fzh@alibaba-inc.com>
Date:   Fri Jan 19 13:57:46 2024 +0800

    Treat (un)LockBufHdr as a SpinLock.
    
    The LockBufHdr also used init_local_spin_delay / perform_spin_delay
    infrastructure and so it is also possible that PANIC the system
    when it can't be acquired in a short time, and its code is pretty
    similar with s_lock. so treat it same as SPIN lock when regarding to
    misuse of spinlock detection.

>>> they were put into spin.h in v1 but later move to miscadmin.h at [1].
>>> [1]
>>> https://www.postgresql.org/message-id/CAEze2WggP-2Dhocmdhp-LxBzic%3DMXRgGA_tmv1G_9n-PDt2MQg%40mail.gmail.com
>>
>> I'm not entirely sure what the right thing to do is here, and the
>> answer may depend on the previous question. But I disagree with
>> Matthias -- I don't think miscadmin.h can be the right answer
>> regardless.

I put it into spin.h this time in commit 1, and declared the extern function
VerifyNoSpinLocksHeld (defined in spin.c) in miscadmin.h, like what we did
for ProcessInterrupts. This eases the miscadmin.h dependency, and the
'#include xxx' changes look better than before.
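
In other words, the arrangement is roughly this (a sketch; the attached v6 is
authoritative):

/* src/include/miscadmin.h -- declaration only, like ProcessInterrupts */
extern void VerifyNoSpinLocksHeld(void);

/* src/backend/storage/lmgr/spin.c -- definition lives with the spinlock code */
void
VerifyNoSpinLocksHeld(void)
{
	/* assert-build-only checking, as in the attached patch */
}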

-- 
Best Regards
Andy Fan


Attachments

Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi,

Andy Fan <zhihuifan1213@163.com> writes:

> Hi,
>
> v6 attached which addressed all the items Robert suggested except the
> following 2 open items. They are handled differently.
>
>>
>> Here is the summary of the open-items, it would be great that Andres and
>> Matthias have a look at this when they have time.
>>
>>>> The LockBufHdr also used init_local_spin_delay / perform_spin_delay
>>>> infrastruce and then it has the same issue like ${subject}, it is pretty
>>>> like the code in s_lock; Based on my current knowledge, I think we
>>>> should add the check there.
>>>
>>> I'd like to hear from Andres, if possible. @Andres: Should these
>>> sanity checks apply only to spin locks per se, or also to buffer
>>> header locks?
>
> v6 is splitted into 2 commits, one for normal SpinLock and one for
> LockBufHdr lock.
>
> commit 6276d2f66b0760053e3fdfe259971be3abba3c63
> Author: yizhi.fzh <yizhi.fzh@alibaba-inc.com>
> Date:   Fri Jan 19 13:52:07 2024 +0800
>
>     Detect more misuse of spin lock automatically
>     
>     Spin lock are intended for *very* short-term locks, but it is possible
>     to be misused in many cases. e.g. Acquiring another LWLocks or regular
>     locks, memory allocation, errstart when holding a spin lock. this patch
>     would detect such misuse automatically in a USE_ASSERT_CHECKING build.
>     
>     CHECK_FOR_INTERRUPTS should be avoided as well when holding a spin lock.
>     Depends on what signals are left to handle, PG may raise error/fatal
>     which would cause the code jump to some other places which is hardly to
>     release the spin lock anyway.
>
> commit 590a0c6f767f62f6c83289d55de99973bc7da417 (HEAD -> s_stuck_v3)
> Author: yizhi.fzh <yizhi.fzh@alibaba-inc.com>
> Date:   Fri Jan 19 13:57:46 2024 +0800
>
>     Treat (un)LockBufHdr as a SpinLock.
>     
>     The LockBufHdr also used init_local_spin_delay / perform_spin_delay
>     infrastructure and so it is also possible that PANIC the system
>     when it can't be acquired in a short time, and its code is pretty
>     similar with s_lock. so treat it same as SPIN lock when regarding to
>     misuse of spinlock detection.
>
>>>> they were put into spin.h in v1 but later move to miscadmin.h at [1].
>>>> [1]
>>>> https://www.postgresql.org/message-id/CAEze2WggP-2Dhocmdhp-LxBzic%3DMXRgGA_tmv1G_9n-PDt2MQg%40mail.gmail.com
>>>
>>> I'm not entirely sure what the right thing to do is here, and the
>>> answer may depend on the previous question. But I disagree with
>>> Matthias -- I don't think miscadmin.h can be the right answer
>>> regardless.
>
> I put it into spin.h this time in commit 1, and include the extern
> function VerifyNoSpinLocksHeld in spin.c into miscadmin.h like what we
> did for ProcessInterrupts. This will easy the miscadmin dependency. the
> changes for '#include xxx' looks better than before.

I found a special case about the check in errstart, so commit 3 is added in
v7. I don't intend for commit 1 and commit 3 to stay as two separate
commits, but keeping them separate makes the discussion easier.

commit 757c67c1d4895ce6a523bcf5217af8eb2351e2a1 (HEAD -> s_stuck_v3)
Author: yizhi.fzh <yizhi.fzh@alibaba-inc.com>
Date:   Mon Jan 22 07:14:29 2024 +0800

    Bypass SpinLock checking in SIGQUIT signal hander
    
    When a process receives a SIGQUIT signal, it indicates the system has a
    crash time. It's possible that the process is just holding a Spin
    lock. By our current checking, this process will PANIC with a misuse of
    spinlock which is pretty prone to misunderstanding. so we need to bypass
    the spin lock holding checking in this case. It is safe since the
    overall system will be restarted.

-- 
Best Regards
Andy Fan


Attachments

Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Andy Fan <zhihuifan1213@163.com> writes:

> I found a speical case about checking it in errstart. So commit 3 in v7
> is added. 
>
> commit 757c67c1d4895ce6a523bcf5217af8eb2351e2a1 (HEAD -> s_stuck_v3)
> Author: yizhi.fzh <yizhi.fzh@alibaba-inc.com>
> Date:   Mon Jan 22 07:14:29 2024 +0800
>
>     Bypass SpinLock checking in SIGQUIT signal hander
>     

I used sigismember(&BlockSig, SIGQUIT) to detect whether a process is doing a
quickdie; however, this is bad not only because it doesn't work on Windows,
but also because its performance is too poor, even though it affects
USE_ASSERT_CHECKING builds only. In v8, I introduced a new global variable
quickDieInProgress to handle this.

-- 
Best Regards
Andy Fan


Attachments

Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Mon, Jan 22, 2024 at 2:22 AM Andy Fan <zhihuifan1213@163.com> wrote:
> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a
> quickdie, however this is bad not only because it doesn't work on
> Windows, but also it has too poor performance even it impacts on
> USE_ASSERT_CHECKING build only. In v8, I introduced a new global
> variable quickDieInProgress to handle this.

OK, I like the split between 0001 and 0002. I still think 0001 has
cosmetic problems, but if some committer wants to take it forward,
they can decide what to do about that; you and I going back and forth
doesn't seem like the right approach to sorting that out. Whether or
not 0002 is adopted might affect what we do about the cosmetics in
0001, too.

0003 seems ... unfortunate. It seems like an admission that 0001 is
wrong. Surely it *isn't* right to ignore the spinlock restrictions in
quickdie() in general. For example, we could self-deadlock if we try
to acquire a spinlock we already hold. If the problem here is merely
the call in errstart() then maybe we need to rethink that particular
call. If it goes any deeper than that, maybe we've got actual bugs we
need to fix.

+        * It's likely to check the BlockSig to know if it is doing a quickdie
+        * with sigismember, but it is too expensive in test, so introduce
+        * quickDieInProgress to avoid that.

This isn't very good English -- I realize that can sometimes be hard
-- but also -- I don't think it likely that a future hacker would
wonder why this isn't done that way. A static variable is normal for
PostgreSQL; checking the signal mask would be a completely novel
approach. So I think this comment is missing the mark topically. If
this patch is right at all, the comment here should focus on why
disabling these checks in quickdie() is necessary and appropriate, not
why it's coded to match everything else in the system.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Robert Haas <robertmhaas@gmail.com> writes:

> On Mon, Jan 22, 2024 at 2:22 AM Andy Fan <zhihuifan1213@163.com> wrote:
>> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a
>> quickdie, however this is bad not only because it doesn't work on
>> Windows, but also it has too poor performance even it impacts on
>> USE_ASSERT_CHECKING build only. In v8, I introduced a new global
>> variable quickDieInProgress to handle this.
>
> OK, I like the split between 0001 and 0002. I still think 0001 has
> cosmetic problems, but if some committer wants to take it forward,
> they can decide what to do about that; you and I going back and forth
> doesn't seem like the right approach to sorting that out. Whether or
> not 0002 is adopted might affect what we do about the cosmetics in
> 0001, too.

Replacing ASSERT_NO_SPIN_LOCK with VerifyNoSpinLocksHeld, or adding more
comments to each call of VerifyNoSpinLocksHeld, really makes me happy since
both improve the cosmetics and the practicality, so I'd be absolutely willing
to do more of this. Thanks for such suggestions!

That said, I can't figure out which cosmetic problems remain... Since you say
they may be related to 0002, I guess you are talking about the naming of
START_SPIN_LOCK and END_SPIN_LOCK?

>
> 0003 seems ... unfortunate. It seems like an admission that 0001 is
> wrong.

Yes, that's what I was thinking. I wondered whether I should merge 0003 into
0001 directly during this discussion, and in the end I kept it separate for
easier discussion.

> Surely it *isn't* right to ignore the spinlock restrictions in
> quickdie() in general. For example, we could self-deadlock if we try
> to acquire a spinlock we already hold. If the problem here is merely
> the call in errstart() then maybe we need to rethink that particular
> call. If it goes any deeper than that, maybe we've got actual bugs we
> need to fix.

I get your point! Acquiring an already-held spinlock in quickdie is unlikely
to happen, but since our existing infrastructure can handle it, there is no
reason to bypass it. Since the problem here is just errstart, we can do
"if (!quickDieInProgress) VerifyNoSpinLocksHeld();" in errstart only. Another
place besides errstart is the CHECK_FOR_INTERRUPTS in errfinish; I think we
can add the same VerifyNoSpinLocksHeld check to CHECK_FOR_INTERRUPTS.
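
Roughly like this (a sketch; the real hunk would go in the next version, and
whether the quickdie gate is needed in CHECK_FOR_INTERRUPTS too is open):

/* elog.c, at the top of errstart() */
if (!quickDieInProgress)
	VerifyNoSpinLocksHeld();

/* miscadmin.h */
#define CHECK_FOR_INTERRUPTS() \
	do { \
		VerifyNoSpinLocksHeld(); \
		if (INTERRUPTS_PENDING_CONDITION()) \
			ProcessInterrupts(); \
	} while (0)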

>
> +        * It's likely to check the BlockSig to know if it is doing a quickdie
> +        * with sigismember, but it is too expensive in test, so introduce
> +        * quickDieInProgress to avoid that.
>
> This isn't very good English -- I realize that can sometimes be hard
> -- but also -- I don't think it likely that a future hacker would
> wonder why this isn't done that way. A static variable is normal for
> PostgreSQL; checking the signal mask would be a completely novel
> approach. So I think this comment is missing the mark topically.

I was wrong to think sigismember is a syscall; now I see it is just a
function in glibc. Even though I can't find its source code, I think it
should just be a bit operation based on the definition of __sigset_t.

Another problem with sigismember is that it is not available on Windows. It
is still unclear to me why sigaddset in quickdie can compile on Windows (I
have a sigismember version and get a linker error on Windows). This is a
blocker for me to submit the next version of the patch.

> If
> this patch is right at all, the comment here should focus on why
> disabling these checks in quickdie() is necessary and appropriate, not
> why it's coded to match everything else in the system.

I agree, and I think patch 0003 is not right at all :(

--
Best Regards
Andy Fan




Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Mon, Jan 22, 2024 at 11:58 AM Andy Fan <zhihuifan1213@163.com> wrote:
> I get your point! Acquiring an already held spinlock in quickdie is
> unlikely to happen, but since our existing infrastructure can handle it,
> then there is no reason to bypass it.

No, the existing infrastructure cannot handle that at all.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Robert Haas <robertmhaas@gmail.com> writes:

> On Mon, Jan 22, 2024 at 11:58 AM Andy Fan <zhihuifan1213@163.com> wrote:
>> I get your point! Acquiring an already held spinlock in quickdie is
>> unlikely to happen, but since our existing infrastructure can handle it,
>> then there is no reason to bypass it.
>
> No, the existing infrastructure cannot handle that at all.

Actually I mean we can handle it without 0003; am I still wrong? Without
0003, if we acquire a spin lock that we already hold, VerifyNoSpinLocksHeld
in SpinLockAcquire should catch it.

--
Best Regards
Andy Fan




Re: the s_lock_stuck on perform_spin_delay

From
Robert Haas
Date:
On Mon, Jan 22, 2024 at 12:13 PM Andy Fan <zhihuifan1213@163.com> wrote:
> > On Mon, Jan 22, 2024 at 11:58 AM Andy Fan <zhihuifan1213@163.com> wrote:
> >> I get your point! Acquiring an already held spinlock in quickdie is
> >> unlikely to happen, but since our existing infrastructure can handle it,
> >> then there is no reason to bypass it.
> >
> > No, the existing infrastructure cannot handle that at all.
>
> Actually I mean we can handle it without 0003. am I still wrong?
> Without the 0003, if we acquiring the spin lock which is held by
> ourself already. VerifyNoSpinLocksHeld in SpinLockAcquire should catch
> it.

But that's only going to run in assert-only builds. The whole point of
the patch set is to tell developers that there are bugs in the code
that need fixing, not to catch problems that actually occur in
production.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Robert Haas <robertmhaas@gmail.com> writes:

> On Mon, Jan 22, 2024 at 12:13 PM Andy Fan <zhihuifan1213@163.com> wrote:
>> > On Mon, Jan 22, 2024 at 11:58 AM Andy Fan <zhihuifan1213@163.com> wrote:
>> >> I get your point! Acquiring an already held spinlock in quickdie is
>> >> unlikely to happen, but since our existing infrastructure can handle it,
>> >> then there is no reason to bypass it.
>> >
>> > No, the existing infrastructure cannot handle that at all.
>>
>> Actually I mean we can handle it without 0003. am I still wrong?
>> Without the 0003, if we acquiring the spin lock which is held by
>> ourself already. VerifyNoSpinLocksHeld in SpinLockAcquire should catch
>> it.
>
> But that's only going to run in assert-only builds. The whole point of
> the patch set is to tell developers that there are bugs in the code
> that need fixing, not to catch problems that actually occur in
> production.

I see. In that respect, yes.

--
Best Regards
Andy Fan




Re: the s_lock_stuck on perform_spin_delay

From
Andres Freund
Date:
Hi,

On 2024-01-18 14:00:58 -0500, Robert Haas wrote:
> > The LockBufHdr also used init_local_spin_delay / perform_spin_delay
> > infrastruce and then it has the same issue like ${subject}, it is pretty
> > like the code in s_lock; Based on my current knowledge, I think we
> > should add the check there.
> 
> I'd like to hear from Andres, if possible. @Andres: Should these
> sanity checks apply only to spin locks per se, or also to buffer
> header locks?

They also should apply to buffer header locks. The exact same dangers apply
there. The only reason this isn't using a plain spinlock is that this way we
can modify more state with a single atomic operation. But all the dangers of
using spinlocks apply just as well.

Greetings,

Andres Freund



Re: the s_lock_stuck on perform_spin_delay

From
Andres Freund
Date:
Hi,

On 2024-01-22 15:18:35 +0800, Andy Fan wrote:
> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a
> quickdie, however this is bad not only because it doesn't work on
> Windows, but also it has too poor performance even it impacts on
> USE_ASSERT_CHECKING build only. In v8, I introduced a new global
> variable quickDieInProgress to handle this.

For reasons you already noted, using sigismember() isn't viable. But I am not
convinced by quickDieInProgress either. I think we could just reset the state
for the spinlock check in the code handling PANIC, perhaps via a helper
function in spin.c.


> +void
> +VerifyNoSpinLocksHeld(void)
> +{
> +#ifdef USE_ASSERT_CHECKING
> +    if (last_spin_lock_file != NULL)
> +        elog(PANIC, "A spin lock has been held at %s:%d",
> +             last_spin_lock_file, last_spin_lock_lineno);
> +#endif
> +}

I think the #ifdef for this needs to be in the header, not here. Otherwise we
add a pointless external function call to a bunch of performance critical
code.


> From f09518df76572adca85cba5008ea0cae5074603a Mon Sep 17 00:00:00 2001
> From: "yizhi.fzh" <yizhi.fzh@alibaba-inc.com>
> Date: Fri, 19 Jan 2024 13:57:46 +0800
> Subject: [PATCH v8 2/3] Treat (un)LockBufHdr as a SpinLock.
> 
> The LockBufHdr also used init_local_spin_delay / perform_spin_delay
> infrastructure and so it is also possible that PANIC the system
> when it can't be acquired in a short time, and its code is pretty
> similar with s_lock. so treat it same as SPIN lock when regarding to
> misuse of spinlock detection.
> ---
>  src/backend/storage/buffer/bufmgr.c | 1 +
>  src/include/storage/buf_internals.h | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
> index 7d601bef6d..c600a113cf 100644
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -5409,6 +5409,7 @@ LockBufHdr(BufferDesc *desc)
>  
>      init_local_spin_delay(&delayStatus);
>  
> +    START_SPIN_LOCK();
>      while (true)
>      {
>          /* set BM_LOCKED flag */

Seems pretty odd that we now need init_local_spin_delay() and
START_SPIN_LOCK(). Note that init_local_spin_delay() also wraps handling of
__FILE__, __LINE__ etc, so it seems we're duplicating state here.


Greetings,

Andres Freund



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi, 

> Hi,
>
> On 2024-01-22 15:18:35 +0800, Andy Fan wrote:
>> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a
>> quickdie, however this is bad not only because it doesn't work on
>> Windows, but also it has too poor performance even it impacts on
>> USE_ASSERT_CHECKING build only. In v8, I introduced a new global
>> variable quickDieInProgress to handle this.
>
> For reasons you already noted, using sigismember() isn't viable. But I am not
> convinced by quickDieInProgress either. I think we could just reset the state
> for the spinlock check in the code handling PANIC, perhaps via a helper
> function in spin.c.

Handled via the change made for your suggestion #3.

>> +void
>> +VerifyNoSpinLocksHeld(void)
>> +{
>> +#ifdef USE_ASSERT_CHECKING
>> +    if (last_spin_lock_file != NULL)
>> +        elog(PANIC, "A spin lock has been held at %s:%d",
>> +             last_spin_lock_file, last_spin_lock_lineno);
>> +#endif
>> +}
>
> I think the #ifdef for this needs to be in the header, not here. Otherwise we
> add a pointless external function call to a bunch of performance critical
> code.
>
Done.

>> diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
>> index 7d601bef6d..c600a113cf 100644
>> --- a/src/backend/storage/buffer/bufmgr.c
>> +++ b/src/backend/storage/buffer/bufmgr.c
>> @@ -5409,6 +5409,7 @@ LockBufHdr(BufferDesc *desc)
>>  
>>      init_local_spin_delay(&delayStatus);
>>  
>> +    START_SPIN_LOCK();
>>      while (true)
>>      {
>>          /* set BM_LOCKED flag */
>
> Seems pretty odd that we now need init_local_spin_delay() and
> START_SPIN_LOCK(). Note that init_local_spin_delay() also wraps handling of
> __FILE__, __LINE__ etc, so it seems we're duplicating state here.

Thanks for catching this! Based on the feedback so far, it is not OK to
acquire another spin lock while holding one already, so I refactored the
code like this:

 /*
- * Support for spin delay which is useful in various places where
- * spinlock-like procedures take place.
+ * Support for spin delay and spin misuse detection purpose.
+ *
+ * spin delay which is useful in various places where spinlock-like
+ * procedures take place.
+ *
+ * spin misuse is based on global spinStatus to know if a spin lock
+ * is held when a heavy operation is taking.
  */
 typedef struct
 {
@@ -846,22 +854,40 @@ typedef struct
     const char *file;
     int            line;
     const char *func;
-} SpinDelayStatus;
+    bool        in_panic; /* works for spin lock misuse purpose. */
+} SpinLockStatus;

+extern PGDLLIMPORT SpinLockStatus spinStatus;

Now init_local_spin_delay, perform_spin_delay, and finish_spin_delay all
access the same global variable spinStatus, and just two extra functions are
added (previously there were three):

extern void VerifyNoSpinLocksHeld(bool check_in_panic);
extern void ResetSpinLockStatus(void);

The panic-check flag is still kept in SpinLockStatus.
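
For reference, ResetSpinLockStatus is essentially just the following (a
sketch; the attached v9 is authoritative), with spinStatus.func == NULL
serving as the "no spinlock held" state:

void
ResetSpinLockStatus(void)
{
	spinStatus.func = NULL;
	spinStatus.file = NULL;
	spinStatus.line = -1;
}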

v9 attached.

-- 
Best Regards
Andy Fan


Attachments

Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Andres Freund <andres@anarazel.de> writes:

> On 2024-01-18 14:00:58 -0500, Robert Haas wrote:
>> > The LockBufHdr also used init_local_spin_delay / perform_spin_delay
>> > infrastruce and then it has the same issue like ${subject}, it is pretty
>> > like the code in s_lock; Based on my current knowledge, I think we
>> > should add the check there.
>> 
>> I'd like to hear from Andres, if possible. @Andres: Should these
>> sanity checks apply only to spin locks per se, or also to buffer
>> header locks?
>
> They also should apply to buffer header locks. The exact same dangers apply
> there. The only reason this isn't using a plain spinlock is that this way we
> can modify more state with a single atomic operation. But all the dangers of
> using spinlocks apply just as well.

Thanks for weighing in on this!

-- 
Best Regards
Andy Fan




Re: the s_lock_stuck on perform_spin_delay

From
Andres Freund
Date:
Hi,

There are some similarities between this and
https://www.postgresql.org/message-id/20240207184050.rkvpuudq7huijmkb%40awork3.anarazel.de
as described in that email.


On 2024-01-25 15:24:17 +0800, Andy Fan wrote:
> From 4c8fd0ab71299e57fbeb18dec70051bd1d035c7c Mon Sep 17 00:00:00 2001
> From: "yizhi.fzh" <yizhi.fzh@alibaba-inc.com>
> Date: Thu, 25 Jan 2024 15:19:49 +0800
> Subject: [PATCH v9 1/1] Detect misuse of spin lock automatically
>
> Spin lock are intended for very short-term locks, but it is possible
> to be misused in many cases. e.g. Acquiring another LWLocks or regular
> locks, memory allocation, errstart when holding a spin lock. this patch
> would detect such misuse automatically in a USE_ASSERT_CHECKING build.

> CHECK_FOR_INTERRUPTS should be avoided as well when holding a spin lock.
> Depends on what signals are left to handle, PG may raise error/fatal
> which would cause the code jump to some other places which is hardly to
> release the spin lock anyway.
> ---
>  src/backend/storage/buffer/bufmgr.c | 24 +++++++----
>  src/backend/storage/lmgr/lock.c     |  6 +++
>  src/backend/storage/lmgr/lwlock.c   | 21 +++++++---
>  src/backend/storage/lmgr/s_lock.c   | 63 ++++++++++++++++++++---------
>  src/backend/tcop/postgres.c         |  6 +++
>  src/backend/utils/error/elog.c      | 10 +++++
>  src/backend/utils/mmgr/mcxt.c       | 16 ++++++++
>  src/include/miscadmin.h             | 21 +++++++++-
>  src/include/storage/buf_internals.h |  1 +
>  src/include/storage/s_lock.h        | 56 ++++++++++++++++++-------
>  src/tools/pgindent/typedefs.list    |  2 +-
>  11 files changed, 176 insertions(+), 50 deletions(-)
>
> diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
> index 7d601bef6d..739a94209b 100644
> --- a/src/backend/storage/buffer/bufmgr.c
> +++ b/src/backend/storage/buffer/bufmgr.c
> @@ -5402,12 +5402,11 @@ rlocator_comparator(const void *p1, const void *p2)
>  uint32
>  LockBufHdr(BufferDesc *desc)
>  {
> -    SpinDelayStatus delayStatus;
>      uint32        old_buf_state;
>
>      Assert(!BufferIsLocal(BufferDescriptorGetBuffer(desc)));
>
> -    init_local_spin_delay(&delayStatus);
> +    init_local_spin_delay();
>
>      while (true)
>      {

Hm, I think this might make this code a bit more expensive. It's cheaper, both
in the number of instructions and their cost, to set variables on the stack
than in global memory - and it's already performance critical code.  I think
we need to optimize the code so that we only do init_local_spin_delay() once
we are actually spinning, rather than also on uncontended locks.



> @@ -5432,20 +5431,29 @@ LockBufHdr(BufferDesc *desc)
>  static uint32
>  WaitBufHdrUnlocked(BufferDesc *buf)
>  {
> -    SpinDelayStatus delayStatus;
>      uint32        buf_state;
>
> -    init_local_spin_delay(&delayStatus);
> +    /*
> +     * Suppose the buf will not be locked for a long time, setup a spin on
> +     * this.
> +     */
> +    init_local_spin_delay();

I don't know what this comment really means.



> +#ifdef USE_ASSERT_CHECKING
> +void
> +VerifyNoSpinLocksHeld(bool check_in_panic)
> +{
> +    if (!check_in_panic && spinStatus.in_panic)
> +        return;

Why do we need this?



> diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
> index aa06e49da2..c3fe75a41d 100644
> --- a/src/include/storage/s_lock.h
> +++ b/src/include/storage/s_lock.h
> @@ -652,7 +652,10 @@ tas(volatile slock_t *lock)
>   */
>  #if !defined(S_UNLOCK)
>  #define S_UNLOCK(lock)    \
> -    do { __asm__ __volatile__("" : : : "memory");  *(lock) = 0; } while (0)
> +    do { __asm__ __volatile__("" : : : "memory"); \
> +        ResetSpinLockStatus(); \
> +        *(lock) = 0; \
> +} while (0)
>  #endif

That seems like the wrong place. There are other definitions of S_UNLOCK(), so
we clearly can't do this here.


>  /*
> - * Support for spin delay which is useful in various places where
> - * spinlock-like procedures take place.
> + * Support for spin delay and spin misuse detection purpose.
> + *
> + * spin delay which is useful in various places where spinlock-like
> + * procedures take place.
> + *
> + * spin misuse is based on global spinStatus to know if a spin lock
> + * is held when a heavy operation is taking.
>   */
>  typedef struct
>  {
> @@ -846,22 +854,40 @@ typedef struct
>      const char *file;
>      int            line;
>      const char *func;
> -} SpinDelayStatus;
> +    bool        in_panic; /* works for spin lock misuse purpose. */
> +} SpinLockStatus;
>
> +extern PGDLLIMPORT SpinLockStatus spinStatus;
> +
> +#ifdef USE_ASSERT_CHECKING
> +extern void VerifyNoSpinLocksHeld(bool check_in_panic);
> +extern void ResetSpinLockStatus(void);
> +#else
> +#define VerifyNoSpinLocksHeld(check_in_panic) ((void) true)
> +#define ResetSpinLockStatus() ((void) true)
> +#endif
> +
> +/*
> + * start the spin delay logic and record the places where the spin lock
> + * is held which is also helpful for spin lock misuse detection purpose.
> + * init_spin_delay should be called with ResetSpinLockStatus in pair.
> + */
>  static inline void
> -init_spin_delay(SpinDelayStatus *status,
> -                const char *file, int line, const char *func)
> +init_spin_delay(const char *file, int line, const char *func)
>  {
> -    status->spins = 0;
> -    status->delays = 0;
> -    status->cur_delay = 0;
> -    status->file = file;
> -    status->line = line;
> -    status->func = func;
> +    /* it is not allowed to spin another lock when holding one already. */
> +    VerifyNoSpinLocksHeld(true);
> +    spinStatus.spins = 0;
> +    spinStatus.delays = 0;
> +    spinStatus.cur_delay = 0;
> +    spinStatus.file = file;
> +    spinStatus.line = line;
> +    spinStatus.func = func;
> +    spinStatus.in_panic = false;
>  }
>
> -#define init_local_spin_delay(status) init_spin_delay(status, __FILE__, __LINE__, __func__)
> -extern void perform_spin_delay(SpinDelayStatus *status);
> -extern void finish_spin_delay(SpinDelayStatus *status);
> +#define init_local_spin_delay() init_spin_delay( __FILE__, __LINE__, __func__)
> +extern void perform_spin_delay(void);
> +extern void finish_spin_delay(void);

As an API this doesn't quite make sense to me. For one, right now an
uncontended SpinLockAcquire afaict will not trigger this mechanism, as we
never call init_spin_delay(). It also adds overhead to optimized builds, as we
now maintain state in global variables instead of local memory.


Maybe we could have

- spinlock_prepare_acquire() - about to acquire a spinlock
  empty in optimized builds, asserts that no other spinlock is held etc

  This would get called in SpinLockAcquire(), LockBufHdr() etc.


- spinlock_finish_acquire() - have acquired spinlock
  empty in optimized builds, in assert builds sets variable indicating we're
  in spinlock

  This would get called in SpinLockRelease() etc.


- spinlock_finish_release()  - not holding the lock anymore

  This would get called by SpinLockRelease(), UnlockBufHdr()


- spinlock_prepare_spin() - about to spin waiting for a spinlock
  like the current init_spin_delay()

  This would get called in s_lock(), LockBufHdr() etc.


- spinlock_finish_spin() - completed waiting for a spinlock
  like the current finish_spin_delay()

  This would get called in s_lock(), LockBufHdr() etc.


I don't love the spinlock_ prefix, that could end up confusing people. I toyed
with "spinlike_" but am not in love either.
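
To illustrate placement, here is a rough sketch, not a worked-out patch (the
delay call is shown argument-free as in the patch under discussion); this
also folds in the earlier point about only setting up delay state once we
are actually contended:

uint32
LockBufHdr(BufferDesc *desc)
{
	uint32		old_buf_state;

	spinlock_prepare_acquire();	/* assert-only: nothing else already held */

	/* uncontended attempt: no spin-delay state needed */
	old_buf_state = pg_atomic_fetch_or_u32(&desc->state, BM_LOCKED);
	if (unlikely(old_buf_state & BM_LOCKED))
	{
		spinlock_prepare_spin();	/* like today's init_local_spin_delay() */
		do
		{
			perform_spin_delay();
			old_buf_state = pg_atomic_fetch_or_u32(&desc->state, BM_LOCKED);
		} while (old_buf_state & BM_LOCKED);
		spinlock_finish_spin();		/* like today's finish_spin_delay() */
	}

	spinlock_finish_acquire();	/* assert-only: now marked as holding the lock */

	return old_buf_state | BM_LOCKED;
}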


Greetings,

Andres Freund



Re: the s_lock_stuck on perform_spin_delay

From
Andy Fan
Date:
Hi, 

> There are some similarities between this and
> https://www.postgresql.org/message-id/20240207184050.rkvpuudq7huijmkb%40awork3.anarazel.de
> as described in that email.

Thanks for this information.

>
>
> Hm, I think this might make this code a bit more expensive. It's cheaper, both
> in the number of instructions and their cost, to set variables on the stack
> than in global memory - and it's already performance critical code.  I think
> we need to optimize the code so that we only do init_local_spin_delay() once
> we are actually spinning, rather than also on uncontended locks.

A great lesson learnt, thanks for highlighting this!
>
>
>
>> @@ -5432,20 +5431,29 @@ LockBufHdr(BufferDesc *desc)
>>  static uint32
>>  WaitBufHdrUnlocked(BufferDesc *buf)
>>  {
>> -    SpinDelayStatus delayStatus;
>>      uint32        buf_state;
>>
>> -    init_local_spin_delay(&delayStatus);
>> +    /*
>> +     * Suppose the buf will not be locked for a long time, setup a spin on
>> +     * this.
>> +     */
>> +    init_local_spin_delay();
>
> I don't know what this comment really means.

Hmm, copy-paste error. Removed in v10.

>
>
>> +#ifdef USE_ASSERT_CHECKING
>> +void
>> +VerifyNoSpinLocksHeld(bool check_in_panic)
>> +{
>> +    if (!check_in_panic && spinStatus.in_panic)
>> +        return;
>
> Why do we need this?

We disallow errstart when a spin lock is held, so there are two special
cases that need to be handled.

a) The quickdie signal handler. The reason is explained in the comments
below:

/*
 * It is possible to get here while already holding a spin lock.
 * However, this function needs to do things like elog which are
 * disallowed by the spinlock misuse detection system while a spin
 * lock is held, so tell that system to treat this case specially.
 */
spinStatus.in_panic = true;

b) The VerifyNoSpinLocksHeld function itself:

if (spinStatus.func != NULL)
{
    /*
     * We are holding a spin lock here, and errstart is disallowed.  To
     * avoid endless recursion into VerifyNoSpinLocksHeld (because of the
     * VerifyNoSpinLocksHeld check in errstart), set spinStatus.in_panic
     * to true to break the cycle.
     */
    spinStatus.in_panic = true;
    elog(PANIC, "A spin lock has been held at %s:%d",
         spinStatus.func, spinStatus.line);
}

>
>
>> diff --git a/src/include/storage/s_lock.h b/src/include/storage/s_lock.h
>> index aa06e49da2..c3fe75a41d 100644
>> --- a/src/include/storage/s_lock.h
>> +++ b/src/include/storage/s_lock.h
>> @@ -652,7 +652,10 @@ tas(volatile slock_t *lock)
>>   */
>>  #if !defined(S_UNLOCK)
>>  #define S_UNLOCK(lock)    \
>> -    do { __asm__ __volatile__("" : : : "memory");  *(lock) = 0; } while (0)
>> +    do { __asm__ __volatile__("" : : : "memory"); \
>> +        ResetSpinLockStatus(); \
>> +        *(lock) = 0; \
>> +} while (0)
>>  #endif
>
> That seems like the wrong place. There are other definitions of S_UNLOCK(), so
> we clearly can't do this here.

True. In v10, spinlock_finish_release() is called in SpinLockRelease (see
the sketch after this list). The side effect is that if someone calls
S_UNLOCK directly, things go wrong because spinlock_finish_release is never
called; I found one such use in regress.c (and changed it to
SpinLockRelease). But I think this is OK because:

1) In s_lock.h, there is a clear comment saying:

 *    NOTE: none of the macros in this file are intended to be called directly.
 *    Call them through the hardware-independent macros in spin.h.

2) If someone breaks the above rule, the issue can be found easily in an
assert build.

3) It has no impact on release builds.
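
The release side is roughly as follows (an abridged sketch; the ordering
shown follows Matthias's earlier arm/disarm advice, and the attached v10
hunk is authoritative):

#define SpinLockRelease(lock) \
	do { \
		S_UNLOCK(lock); \
		spinlock_finish_release(); \
	} while (0)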

>>
>> -#define init_local_spin_delay(status) init_spin_delay(status, __FILE__, __LINE__, __func__)
>> -extern void perform_spin_delay(SpinDelayStatus *status);
>> -extern void finish_spin_delay(SpinDelayStatus *status);
>> +#define init_local_spin_delay() init_spin_delay( __FILE__, __LINE__, __func__)
>> +extern void perform_spin_delay(void);
>> +extern void finish_spin_delay(void);
>
> As an API this doesn't quite make sense to me. For one, right now an
> uncontended SpinLockAcquire afaict will not trigger this mechanism, as we
> never call init_spin_delay().

Another great lesson learnt, thanks for this as well!

>
>
> Maybe we could have

I moved forward with the suggestion below, with some small modifications.

>
> - spinlock_prepare_acquire() - about to acquire a spinlock
>   empty in optimized builds, asserts that no other spinlock is held
> etc
>
>   This would get called in SpinLockAcquire(), LockBufHdr() etc.

"asserts that no other spinlock is held" has many more use cases than just
SpinLockAcquire / LockBufHdr, so I think sharing the same function
(currently VerifyNoSpinLocksHeld) will be OK.


> - spinlock_finish_acquire() - have acquired spinlock
>   empty in optimized builds, in assert builds sets variable indicating we're
>   in spinlock
>
>   This would get called in SpinLockRelease() etc.

I think you mean "SpinLockAcquire" here.

Matthias suggested at [1] that "we need to 'arm' the checks just before we
lock the spin lock, and 'disarm' the checks just after we unlock the spin
lock", and I'm somewhat persuaded by that, so I used
spinlock_prepare_acquire to set the variable indicating we're in a spinlock.
Which one do you prefer now?
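
That is, roughly (an abridged sketch; the attachment has the real hunk):

#define SpinLockAcquire(lock) \
	do { \
		spinlock_prepare_acquire();	/* verify nothing is held, then mark as holding */ \
		S_LOCK(lock); \
	} while (0)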

>
> - spinlock_finish_release()  - not holding the lock anymore
>
>   This would get called by SpinLockRelease(), UnlockBufHdr()
>
>
> - spinlock_prepare_spin() - about to spin waiting for a spinlock
>   like the current init_spin_delay()
>
>   This would get called in s_lock(), LockBufHdr() etc.
>
>
> - spinlock_finish_spin() - completed waiting for a spinlock
>   like the current finish_spin_delay()
>
>   This would get called in s_lock(), LockBufHdr() etc.

All done in v10; for consistency, I also renamed perform_spin_delay to
spinlock_perform_delay.

I have gotten much more out of this review process than I expected; thank
you very much for that!

[1]
https://www.postgresql.org/message-id/CAEze2WggP-2Dhocmdhp-LxBzic%3DMXRgGA_tmv1G_9n-PDt2MQg%40mail.gmail.com

-- 
Best Regards
Andy Fan


Attachments