Обсуждение: BUG #18385: Assert("strategy_delta >= 0") in BgBufferSync() fails due to race condition

Поиск
Список
Период
Сортировка

BUG #18385: Assert("strategy_delta >= 0") in BgBufferSync() fails due to race condition

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      18385
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 16.2
Operating system:   Ubuntu 22.04
Description:

With a small shared_buffers value and a short bgwriter_delay:
shared_buffers = '1MB'
bgwriter_delay = 10

processing concurrent writing workload like:
pgbench -i
pgbench -t 10000 -c 40

on a slow machine leads to:
number of transactions actually processed: 103642/400000
...
tps = 187.796284 (without initial connection time)
pgbench: error: Run was aborted; the above results are incomplete.
...
TRAP: failed Assert("strategy_delta >= 0"), File: "bufmgr.c", Line: 2836,
PID: 20941
postgres: background writer (ExceptionalCondition+0x52)[0x5581a8dd1677]
postgres: background writer (BgBufferSync+0xb6)[0x5581a8c5b97a]
postgres: background writer (BackgroundWriterMain+0x20b)[0x5581a8bf117a]
postgres: background writer (AuxiliaryProcessMain+0x175)[0x5581a8befa29]
postgres: background writer (+0x423cff)[0x5581a8bf5cff]
postgres: background writer (PostmasterMain+0x1127)[0x5581a8bf916f]
postgres: background writer (main+0x227)[0x5581a8b1d4d5]

To ease reproduction, adding the following delay is recommended:
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -417,2 +417,3 @@ StrategySyncStart(...)
        }
+pg_usleep(300);
        SpinLockRelease(&StrategyControl->buffer_strategy_lock);

(Initially observed during the test 027_stream_regress (which runs with
shared_buffers = '1MB') with the minimal bgwriter_delay on a slow
dual-core machine, where one test run takes around 1000 sec.)

Reproduced on REL_12_STABLE .. master. In fact, the issue reproduced
starting from d72731a70.


Hi Alexander,
   I haven't been able to reproduce this issue on my machine(2 vCPU, 2GB memory).

 Can you reproduce this issue reliably on your machine?

PG Bug reporting form <noreply@postgresql.org> 于2024年3月11日周一 03:08写道:
The following bug has been logged on the website:

Bug reference:      18385
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 16.2
Operating system:   Ubuntu 22.04
Description:       

With a small shared_buffers value and a short bgwriter_delay:
shared_buffers = '1MB'
bgwriter_delay = 10

processing concurrent writing workload like:
pgbench -i
pgbench -t 10000 -c 40

on a slow machine leads to:
number of transactions actually processed: 103642/400000
...
tps = 187.796284 (without initial connection time)
pgbench: error: Run was aborted; the above results are incomplete.
...
TRAP: failed Assert("strategy_delta >= 0"), File: "bufmgr.c", Line: 2836,
PID: 20941
postgres: background writer (ExceptionalCondition+0x52)[0x5581a8dd1677]
postgres: background writer (BgBufferSync+0xb6)[0x5581a8c5b97a]
postgres: background writer (BackgroundWriterMain+0x20b)[0x5581a8bf117a]
postgres: background writer (AuxiliaryProcessMain+0x175)[0x5581a8befa29]
postgres: background writer (+0x423cff)[0x5581a8bf5cff]
postgres: background writer (PostmasterMain+0x1127)[0x5581a8bf916f]
postgres: background writer (main+0x227)[0x5581a8b1d4d5]

To ease reproduction, adding the following delay is recommended:
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -417,2 +417,3 @@ StrategySyncStart(...)
        }
+pg_usleep(300);
        SpinLockRelease(&StrategyControl->buffer_strategy_lock);

(Initially observed during the test 027_stream_regress (which runs with
shared_buffers = '1MB') with the minimal bgwriter_delay on a slow
dual-core machine, where one test run takes around 1000 sec.)

Reproduced on REL_12_STABLE .. master. In fact, the issue reproduced
starting from d72731a70.



--
Tender Wang
OpenPie:  https://en.openpie.com/
Hi Tender Wang,

13.03.2024 09:50, Tender Wang wrote:
> Hi Alexander,
>    I haven't been able to reproduce this issue on my machine(2 vCPU, 2GB memory).
>
>  Can you reproduce this issue reliably on your machine?

Thanks for your attention to this issue!

On my 12-core workstation, where pgbench shows approximately 2500 tps,
`pgbench -t 10000 -c 40` failed on iterations 1, 1, 3:
...
number of transactions actually processed: 398639/400000
number of failed transactions: 0 (0.000%)
latency average = 15.931 ms
initial connection time = 56.806 ms
tps = 2510.845701 (without initial connection time)
pgbench: error: Run was aborted; the above results are incomplete.
...

The server built with gcc,
CPPFLAGS="-O0" ./configure --enable-debug --enable-cassert ...

I have also intensified bgwriter as follows:
          rc = WaitLatch(MyLatch,
                         WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
-                       BgWriterDelay /* ms */ , WAIT_EVENT_BGWRITER_MAIN);
+                       1 /* ms */ , WAIT_EVENT_BGWRITER_MAIN);

That is, I have the attached modification applied and the following in my
extra.config:
fsync = off
autovacuum = off
shared_buffers = '1MB'

Best regards,
Alexander
Вложения


Alexander Lakhin <exclusion@gmail.com> 于2024年3月13日周三 16:00写道:
Hi Tender Wang,

13.03.2024 09:50, Tender Wang wrote:
> Hi Alexander,
>    I haven't been able to reproduce this issue on my machine(2 vCPU, 2GB memory).
>
>  Can you reproduce this issue reliably on your machine?

Thanks for your attention to this issue!

On my 12-core workstation, where pgbench shows approximately 2500 tps,
`pgbench -t 10000 -c 40` failed on iterations 1, 1, 3:
...
number of transactions actually processed: 398639/400000
number of failed transactions: 0 (0.000%)
latency average = 15.931 ms
initial connection time = 56.806 ms
tps = 2510.845701 (without initial connection time)
pgbench: error: Run was aborted; the above results are incomplete.
...

The server built with gcc,
CPPFLAGS="-O0" ./configure --enable-debug --enable-cassert ...

I have also intensified bgwriter as follows:
          rc = WaitLatch(MyLatch,
                         WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
-                       BgWriterDelay /* ms */ , WAIT_EVENT_BGWRITER_MAIN);
+                       1 /* ms */ , WAIT_EVENT_BGWRITER_MAIN);

That is, I have the attached modification applied and the following in my
extra.config:
fsync = off
autovacuum = off
shared_buffers = '1MB'

Using your provided patch and configuration, I can reproduce this issue easily on my mashine(2 vCPU, 2GB).
I went throug the  StrategySyncStart() and ClockSweepTick() code. After d72731a70 commit, operation on StrategyControl->nextVictimBuffer
doesn't need to get the StrategyControl->buffer_strategy_lock, but StrategyControl->completePasses still needs lock.

Before d72731a70, if bgwriter gets the StrategyControl->buffer_strategy_lock spinlock in StrategySyncStart(), backends can't add StrategyControl->nextVictimBuffer,
then bgwriter can get a consistent  value consisting of nextVictimBuffer and completePasses. 

But now, even though bgwriter gets the spinlock, backends also could add the StrategyControl->nextVictimBuffer. In corner cases, bgwriter will see a litter StrategyControl->nextVictimBuffer
value due to StrategyControl->nextVictimBuffer wraparound. But StrategyControl->completePasses didn't  update. So the Assert(strategy_delta >= 0) will trigger. 

I have two solutions in my head:
1. remove the Assert, but I'm not srue bgwriter write strategy should change if strategy_delta < 0.
2. add more check in BgBufferSync() just like below:
   /*
    * Since nextVictimBuffer in StrategyControl has been atomic.
    * So its operation would not need to get buffer_strategy_lock.
   * In extreme circumstances, StrategySyncStart would see not consistent
   * value consisting of nextVictimBuffer and completePasses.
   * So we add one to passes_delta to make strategy_delta >= 0.
   */
  if (passes_delta == 0 && strategy_delta < 0)
       passes_delta++;

I test two times using the second solution, not trigger Assert again. Need more times to test.

--
Tender Wang
OpenPie:  https://en.openpie.com/
Sorry, the second solution is still not fix the problem. I met Assert failed:
strategy_passes is 100964, but prev_strategy_passes is 100965.  The added check was not satisfied.
Would it happen that current strategy_passes is less than prev_strategy_passes.

Tender Wang <tndrwang@gmail.com> 于2024年3月15日周五 16:05写道:


Alexander Lakhin <exclusion@gmail.com> 于2024年3月13日周三 16:00写道:
Hi Tender Wang,

13.03.2024 09:50, Tender Wang wrote:
> Hi Alexander,
>    I haven't been able to reproduce this issue on my machine(2 vCPU, 2GB memory).
>
>  Can you reproduce this issue reliably on your machine?

Thanks for your attention to this issue!

On my 12-core workstation, where pgbench shows approximately 2500 tps,
`pgbench -t 10000 -c 40` failed on iterations 1, 1, 3:
...
number of transactions actually processed: 398639/400000
number of failed transactions: 0 (0.000%)
latency average = 15.931 ms
initial connection time = 56.806 ms
tps = 2510.845701 (without initial connection time)
pgbench: error: Run was aborted; the above results are incomplete.
...

The server built with gcc,
CPPFLAGS="-O0" ./configure --enable-debug --enable-cassert ...

I have also intensified bgwriter as follows:
          rc = WaitLatch(MyLatch,
                         WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
-                       BgWriterDelay /* ms */ , WAIT_EVENT_BGWRITER_MAIN);
+                       1 /* ms */ , WAIT_EVENT_BGWRITER_MAIN);

That is, I have the attached modification applied and the following in my
extra.config:
fsync = off
autovacuum = off
shared_buffers = '1MB'

Using your provided patch and configuration, I can reproduce this issue easily on my mashine(2 vCPU, 2GB).
I went throug the  StrategySyncStart() and ClockSweepTick() code. After d72731a70 commit, operation on StrategyControl->nextVictimBuffer
doesn't need to get the StrategyControl->buffer_strategy_lock, but StrategyControl->completePasses still needs lock.

Before d72731a70, if bgwriter gets the StrategyControl->buffer_strategy_lock spinlock in StrategySyncStart(), backends can't add StrategyControl->nextVictimBuffer,
then bgwriter can get a consistent  value consisting of nextVictimBuffer and completePasses. 

But now, even though bgwriter gets the spinlock, backends also could add the StrategyControl->nextVictimBuffer. In corner cases, bgwriter will see a litter StrategyControl->nextVictimBuffer
value due to StrategyControl->nextVictimBuffer wraparound. But StrategyControl->completePasses didn't  update. So the Assert(strategy_delta >= 0) will trigger. 

I have two solutions in my head:
1. remove the Assert, but I'm not srue bgwriter write strategy should change if strategy_delta < 0.
2. add more check in BgBufferSync() just like below:
   /*
    * Since nextVictimBuffer in StrategyControl has been atomic.
    * So its operation would not need to get buffer_strategy_lock.
   * In extreme circumstances, StrategySyncStart would see not consistent
   * value consisting of nextVictimBuffer and completePasses.
   * So we add one to passes_delta to make strategy_delta >= 0.
   */
  if (passes_delta == 0 && strategy_delta < 0)
       passes_delta++;

I test two times using the second solution, not trigger Assert again. Need more times to test.

--
Tender Wang
OpenPie:  https://en.openpie.com/


--
Tender Wang
OpenPie:  https://en.openpie.com/