Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system
Дата
Msg-id 20160314061420.GA1064464@tornado.leadboat.com
обсуждение исходный текст
Ответ на Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
[Aside: your new mail editor is rewrapping lines in quoted material, and the
result is messy.  I have rerewrapped one paragraph below.]

On Mon, Mar 14, 2016 at 02:00:03AM +0100, Tomas Vondra wrote:
> On Sun, 2016-03-13 at 18:46 -0400, Noah Misch wrote:
> > I've not attempted to study the behavior on slow hardware.  Instead, my
> > report used stat-coalesce-v1.patch[1] to simulate slow writes.  (That
> > diagnostic patch no longer applies cleanly, so I'm attaching a rebased
> > version.  I've changed the patch name from "stat-coalesce" to
> > "slow-stat-simulate" to more-clearly distinguish it from the
> > "pgstat-coalesce" patch.)  Problems remain after applying your patch;
> > consider "VACUUM pg_am" behavior:
> > 
> > 9.2 w/ stat-coalesce-v1.patch:
> >   VACUUM returns in 3s, stats collector writes each file 1x over 3s
> > HEAD w/ slow-stat-simulate-v2.patch:
> >   VACUUM returns in 3s, stats collector writes each file 5x over 15s
> > HEAD w/ slow-stat-simulate-v2.patch and your patch:
> >   VACUUM returns in 10s, stats collector writes no files over 10s
> 
> Oh damn, the timestamp comparison in pgstat_recv_inquiry should be in
> the opposite direction. After fixing that "VACUUM pg_am" completes in 3
> seconds and writes each file just once.

That fixes things.  "make check" passes under an 8s stats write time.

> --- a/src/backend/postmaster/pgstat.c
> +++ b/src/backend/postmaster/pgstat.c
> @@ -4836,6 +4836,20 @@ pgstat_recv_inquiry(PgStat_MsgInquiry *msg, int len)
>      }
>  
>      /*
> +     * Ignore requests that are already resolved by the last write.
> +     *
> +     * We discard the queue of requests at the end of pgstat_write_statsfiles(),
> +     * so the requests already waiting on the UDP socket at that moment can't
> +     * be discarded in the previous loop.
> +     *
> +     * XXX Maybe this should also care about the clock skew, just like the
> +     *     block a few lines down.

Yes, it should.  (The problem is large (>~100s), backward clock resets, not
skew.)  A clock reset causing "msg->clock_time < dbentry->stats_timestamp"
will usually also cause "msg->cutoff_time < dbentry->stats_timestamp".  Such
cases need the correction a few lines down.

The other thing needed here is to look through and update comments about
last_statrequests.  For example, this loop is dead code due to us writing
files as soon as we receive one inquiry:
/* * Find the last write request for this DB.  If it's older than the * request's cutoff time, update it; otherwise
there'snothing to do. * * Note that if a request is found, we return early and skip the below * check for clock skew.
Thisis okay, since the only way for a DB * request to be present in the list is that we have been here since the * last
writeround. */slist_foreach(iter, &last_statrequests) ...
 

I'm okay keeping the dead code for future flexibility, but the comment should
reflect that.

Thanks,
nm



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618”: Permission denied”
Следующее
От: Craig Ringer
Дата:
Сообщение: Re: Logical decoding slots can go backwards when used from SQL, docs are wrong