Re: BUG #12071: Stat collector went crasy (50MB/s constant writes)
От | Maxim Boguk |
---|---|
Тема | Re: BUG #12071: Stat collector went crasy (50MB/s constant writes) |
Дата | |
Msg-id | CAK-MWwTmJV4VprhAfCyTKRLndYbwbToF-Z0JYv6_o=qq3VYxyQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #12071: Stat collector went crasy (50MB/s constant writes) (Maxim Boguk <maxim.boguk@gmail.com>) |
Ответы |
Re: BUG #12071: Stat collector went crasy (50MB/s constant
writes)
|
Список | pgsql-bugs |
> On one of my servers stat collector went crasy on idle system. >>> It constantly write around 50MB/s on idle database (no activity at all). >>> >> > OK there are results of some entry-level gdb kung-fu: > > problem point is: > /* > * Write the stats file if a new request has arrived that is > not > * satisfied by existing file. > */ > if (last_statwrite < last_statrequest) > pgstat_write_statsfile(false); > > on some loop iteration difference between last_statrequest and > last_statwrite is around 23000-29000 (e.g. 20-30ms). > > Hm may be there some clock drift between CPU cores? > What's made this situation interesting that the fact the code never got inside /* * If there is clock skew between backends and the collector, we could * receive a stats request time that's in the future. If so, complain * and reset last_statrequest. Resetting ensures that no inquiry * message can cause more than one stats file write to occur. */ if (last_statrequest > last_statwrite) ... block. Watchpoint on the (last_statwrite < last_statrequest) condition repetable fire ONLY inside: pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len) { if (msg->inquiry_time > last_statrequest) last_statrequest = msg->inquiry_time; } function. Now just some musings/theory (I might be dead wrong of course). Lets suppose there are no clock skew actually and check execution flow. Main loop executed: if (last_statwrite < last_statrequest) pgstat_write_statsfile(false); and updated last_statwrite value. However, it took some time because of file size. During that period (while pgstat busy with writes) some backend issued PGSTAT_MTYPE_INQUIRY request which will be succesfuly accepted on the next code lines at len = recv(pgStatSock, (char *) &msg,... And as a result the last_statrequest will be updated to the msg->inquiry_time value. And this value will be definitely > than last_statwrite (because this request issued after pgstat worker stared writing new copy of the stat file). So these could be case when this situation could happen without time travel/clock drift. Kind Regards, Maksym
В списке pgsql-bugs по дате отправления: