Report distinct wait events when waiting for WAL "operation"

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Report distinct wait events when waiting for WAL "operation"
Дата
Msg-id 20230717165544.4hdlynfv3jylnlm4@awork3.anarazel.de
обсуждение исходный текст
Ответы Re: Report distinct wait events when waiting for WAL "operation"  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Report distinct wait events when waiting for WAL "operation"  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers
Hi,

In a number of workloads one can see two wait events prominently:
LWLock:WALWrite and LWLock:WALInsert. Unfortunately for both that is not very
informative:

LWLock:WALWrite can be reported because there is genuine contention on the
LWLock, or, more commonly, because several backends are waiting for another to
finish IO. In the latter case we are not actually waiting to acquire the lock,
we are waiting for the lock to be released *without* then acquiring it.

LWLock:WALInsert can be reported because there are not enough WALInsert locks
(c.f. NUM_XLOGINSERT_LOCKS) or because we are waiting for another backend to
finish copying a WAL record into wal_buffers.  In the latter case we are
therefore not waiting to acquire an LWLock.


I think both of these cases are relevant to distinguish from an operational
perspective. Secondarily, I've received many questions about making those
locks more scalable / granular, when in most of the cases the issue was not
actual lock contention.

Today it's surprisingly hard to figure out whether the issue is lock
contention or the speed of copying buffers for WAL insert locks / computing
the last prc of the CRC checksum.


Therefore I'm proposing that LWLockAcquireOrWait() and LWLockWaitForVar() not
use the "generic" LWLockReportWaitStart(), but use caller provided wait
events.  The attached patch adds two new wait events for the existing callers.

I waffled a bit about which wait event section to add these to. Ended up with
"IPC", but would be happy to change that.

WAIT_EVENT_WAL_WAIT_INSERT  WALWaitInsert   "Waiting for WAL record to be copied into buffers."
WAIT_EVENT_WAL_WAIT_WRITE   WALWaitWrite    "Waiting for WAL buffers to be written or flushed to disk."


Previously it was e.g. not really possible to distinguish that something like
this:

┌────────────────┬─────────────────┬────────────┬───────┐
│  backend_type  │ wait_event_type │ wait_event │ count │
├────────────────┼─────────────────┼────────────┼───────┤
│ client backend │ LWLock          │ WALInsert  │    32 │
│ client backend │ (null)          │ (null)     │     9 │
│ walwriter      │ IO              │ WALWrite   │     1 │
│ client backend │ Client          │ ClientRead │     1 │
│ client backend │ LWLock          │ WALWrite   │     1 │
└────────────────┴─────────────────┴────────────┴───────┘

is a workload with a very different bottleneck than this:

┌────────────────┬─────────────────┬───────────────┬───────┐
│  backend_type  │ wait_event_type │  wait_event   │ count │
├────────────────┼─────────────────┼───────────────┼───────┤
│ client backend │ IPC             │ WALWaitInsert │    22 │
│ client backend │ LWLock          │ WALInsert     │    13 │
│ client backend │ LWLock          │ WALBufMapping │     5 │
│ walwriter      │ (null)          │ (null)        │     1 │
│ client backend │ Client          │ ClientRead    │     1 │
│ client backend │ (null)          │ (null)        │     1 │
└────────────────┴─────────────────┴───────────────┴───────┘

even though they are very different

FWIW, the former is bottlenecked by the number of WAL insertion locks, the
second is bottlenecked by copying WAL into buffers due to needing to flush
them.

Greetings,

Andres Freund

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Thom Brown
Дата:
Сообщение: Re: generic plans and "initial" pruning
Следующее
От: Garrett Thornburg
Дата:
Сообщение: Re: Looking for context around which event triggers are permitted