Re: the s_lock_stuck on perform_spin_delay

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: the s_lock_stuck on perform_spin_delay
Дата
Msg-id 87msstvper.fsf@163.com
обсуждение исходный текст
Ответ на Re: the s_lock_stuck on perform_spin_delay  (Andres Freund <andres@anarazel.de>)
Ответы Re: the s_lock_stuck on perform_spin_delay  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Hi, 

> Hi,
>
> On 2024-01-22 15:18:35 +0800, Andy Fan wrote:
>> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a
>> quickdie, however this is bad not only because it doesn't work on
>> Windows, but also it has too poor performance even it impacts on
>> USE_ASSERT_CHECKING build only. In v8, I introduced a new global
>> variable quickDieInProgress to handle this.
>
> For reasons you already noted, using sigismember() isn't viable. But I am not
> convinced by quickDieInProgress either. I think we could just reset the state
> for the spinlock check in the code handling PANIC, perhaps via a helper
> function in spin.c.

Handled with the action for your suggestion #3. 

>> +void
>> +VerifyNoSpinLocksHeld(void)
>> +{
>> +#ifdef USE_ASSERT_CHECKING
>> +    if (last_spin_lock_file != NULL)
>> +        elog(PANIC, "A spin lock has been held at %s:%d",
>> +             last_spin_lock_file, last_spin_lock_lineno);
>> +#endif
>> +}
>
> I think the #ifdef for this needs to be in the header, not here. Otherwise we
> add a pointless external function call to a bunch of performance critical
> code.
>
Done.

>> diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
>> index 7d601bef6d..c600a113cf 100644
>> --- a/src/backend/storage/buffer/bufmgr.c
>> +++ b/src/backend/storage/buffer/bufmgr.c
>> @@ -5409,6 +5409,7 @@ LockBufHdr(BufferDesc *desc)
>>  
>>      init_local_spin_delay(&delayStatus);
>>  
>> +    START_SPIN_LOCK();
>>      while (true)
>>      {
>>          /* set BM_LOCKED flag */
>
> Seems pretty odd that we now need init_local_spin_delay() and
> START_SPIN_LOCK(). Note that init_local_spin_delay() also wraps handling of
> __FILE__, __LINE__ etc, so it seems we're duplicating state here.

Thanks for catching this! Based on the feedbacks so far, it is not OK to
acquire another spin lock when holding one already. So I refactor the
code like this:

 /*
- * Support for spin delay which is useful in various places where
- * spinlock-like procedures take place.
+ * Support for spin delay and spin misuse detection purpose.
+ *
+ * spin delay which is useful in various places where spinlock-like
+ * procedures take place.
+ *
+ * spin misuse is based on global spinStatus to know if a spin lock
+ * is held when a heavy operation is taking.
  */
 typedef struct
 {
@@ -846,22 +854,40 @@ typedef struct
     const char *file;
     int            line;
     const char *func;
-} SpinDelayStatus;
+    bool        in_panic; /* works for spin lock misuse purpose. */
+} SpinLockStatus;

+extern PGDLLIMPORT SpinLockStatus spinStatus;

Now all the init_local_spin_delay, perform_spin_delay, finish_spin_delay
access the same global variable spinStatus. and just 2 extra functions
added (previously have 3). they are:

extern void VerifyNoSpinLocksHeld(bool check_in_panic);
extern void ResetSpinLockStatus(void);

The panic check stuff is still added into spinLockStatus. 

v9 attached.

-- 
Best Regards
Andy Fan


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Richard Guo
Дата:
Сообщение: Re: Trivial revise for the check of parameterized partial paths
Следующее
От: Andy Fan
Дата:
Сообщение: Re: the s_lock_stuck on perform_spin_delay