Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested.
From | Fujii Masao |
---|---|
Subject | Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested. |
Date | |
Msg-id | 5289df2d-acce-ca30-9a5e-ab75f621cc29@oss.nttdata.com |
In response to | Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested. (Masahiro Ikeda <ikedamsh@oss.nttdata.com>) |
Responses | Re: make the stats collector shutdown without writing the statsfiles if the immediate shutdown is requested. |
List | pgsql-hackers |

On 2021/03/25 9:31, Masahiro Ikeda wrote:
>
>
> On 2021/03/24 18:36, Fujii Masao wrote:
>>
>>
>> On 2021/03/24 3:51, Andres Freund wrote:
>>> Hi,
>>>
>>> On 2021-03-23 15:50:46 +0900, Fujii Masao wrote:
>>>> This fact makes me wonder that if we collect the statistics about WAL writing
>>>> from walreceiver as we discussed in other thread, the stats collector should
>>>> be invoked at more earlier stage. IIUC walreceiver can be invoked before
>>>> PMSIGNAL_BEGIN_HOT_STANDBY is sent.
>>>
>>> FWIW, in the shared memory stats patch the stats subsystem is
>>> initialized early on by the startup process.
>>
>> This is good news!
>
> Fujii-san, Andres-san,
> Thanks for your comments!
>
> I didn't think about the start order. From the point of view, I noticed that
> the current source code has two other concerns.
>
>
> 1. This problem is not only for the wal receiver.
>
> The problem which the wal receiver starts before the stats collector
> is launched during archive recovery is not only for the the wal receiver but
> also the checkpointer and the bgwriter. Before starting redo, the startup
> process sends the postmaster "PMSIGNAL_RECOVERY_STARTED" signal to launch the
> checkpointer and the bgwriter to be able to perform creating restartpoint.
>
> Although the socket for communication between the stats collector and the
> other processes is made in earlier stage via pgstat_init(), I agree to make
> the stats collector starts earlier stage is defensive. BTW, in my
> environments(linux, net.core.rmem_default = 212992), the socket can buffer
> almost 300 WAL stats messages. This mean that, as you said, if the redo phase
> is too long, it can lost the messages easily.
>
>
> 2. To make the stats clear in redo phase.
>
> The statistics can be reset after the wal receiver, the checkpointer and
> the wal writer are started in redo phase. So, it's not enough the stats
> collector is invoked at more earlier stage. We need to fix it.
>
>
>
> (I hope I am not missing something.)
>
> Thanks to Andres-san's work([1]), the above problems will be handle in the
> shared memory stats patch. First problem will be resolved since the stats are
> collected in shared memory, so the stats collector process is unnecessary
> itself. Second problem will be resolved to remove the reset code because the
> temporary stats file won't generated, and if the permanent stats file
> corrupted, just recreate it.

Yes. So we should wait for the shared memory stats patch to be committed
before working on walreceiver stats patch more?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION