Re: Better way of dealing with pgstat wait timeout during buildfarm runs?
From | Michael Paquier |
---|---|
Subject | Re: Better way of dealing with pgstat wait timeout during buildfarm runs? |
Date | |
Msg-id | CAB7nPqRyOVYiLkrQkjrB1ozjoNaPBK_-ApV_8L+vnAKQHju=-g@mail.gmail.com |
In reply to | Re: Better way of dealing with pgstat wait timeout during buildfarm runs? (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses | Re: Better way of dealing with pgstat wait timeout during buildfarm runs? |
List | pgsql-hackers |
On Wed, Jan 21, 2015 at 1:08 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 25.12.2014 22:28, Tomas Vondra wrote:
>> On 25.12.2014 21:14, Andres Freund wrote:
>>
>>> That's indeed odd. Seems to have been lost when the statsfile was
>>> split into multiple files. Alvaro, Tomas?
>>
>> The goal was to keep the logic as close to the original as possible.
>> IIRC there were "pgstat wait timeout" issues before, and in most cases
>> the conclusion was that it's probably because of overloaded I/O.
>>
>> But maybe there actually was another bug, and it's entirely possible
>> that the split introduced a new one, and that's what we're seeing now.
>> The strange thing is that the split happened ~2 years ago, which is
>> inconsistent with the sudden increase of this kind of issues. So maybe
>> something changed on that particular animal (a failing SD card causing
>> I/O stalls, perhaps)?
>>
>> Anyway, I happen to have a spare Raspberry PI, so I'll try to reproduce
>> and analyze the issue locally. But that won't happen until January.
>
> I've tried to reproduce this on my Raspberry PI 'machine' and it's not
> very difficult to trigger this. About 7 out of 10 'make check' runs fail
> because of 'pgstat wait timeout'.
>
> All the occurrences I've seen were right after some sort of VACUUM
> (sometimes plain, sometimes ANALYZE or FREEZE), and the I/O at the time
> looked something like this:
>
> Device:  rrqm/s  wrqm/s   r/s   w/s  rkB/s  wkB/s  avgrq-sz  avgqu-sz     await  r_await   w_await   svctm   %util
> mmcblk0    0.00   75.00  0.00  8.00   0.00  36.00      9.00      5.73  15633.75     0.00  15633.75  125.00  100.00
>
> So pretty terrible (this is a Class 4 SD card, supposedly able to handle
> 4 MB/s). If hamster had faulty SD card, it might have been much worse, I
> guess.

From experience, at least a Class 10 card is necessary, along with a
minimum amount of memory, to minimize the appearance of those warnings.
hamster now has an 8GB Class 10 card.
--
Michael