Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system

From Noah Misch
Subject Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system
Date
Msg-id 20160315020441.GB1092972@tornado.leadboat.com
In response to Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Re: PATCH: Split stats file per database WAS: autovacuum stress-testing our system  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Mon, Mar 14, 2016 at 01:33:08PM +0100, Tomas Vondra wrote:
> On 03/14/2016 07:14 AM, Noah Misch wrote:
> >On Mon, Mar 14, 2016 at 02:00:03AM +0100, Tomas Vondra wrote:
> >>+     * XXX Maybe this should also care about the clock skew, just like the
> >>+     *     block a few lines down.
> >
> >Yes, it should.  (The problem is large (>~100s), backward clock resets, not
> >skew.)  A clock reset causing "msg->clock_time < dbentry->stats_timestamp"
> >will usually also cause "msg->cutoff_time < dbentry->stats_timestamp".  Such
> >cases need the correction a few lines down.
> 
> I'll look into that. I have to admit I have a hard time reasoning about the
> code handling clock skew, so it might take some time, though.

No hurry; it would be no problem to delay this several months.
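
For reference, the correction "a few lines down" follows roughly this shape
(a sketch reconstructed from the discussion, not a verbatim pgstat.c excerpt;
msg->clock_time, msg->cutoff_time and dbentry->stats_timestamp appear in the
quoted code, the surrounding structure is an assumption):

    if (msg->clock_time < dbentry->stats_timestamp)
    {
        TimestampTz cur_ts = GetCurrentTimestamp();

        if (cur_ts < dbentry->stats_timestamp)
        {
            /*
             * The clock really did go backwards.  Pull the stored timestamp
             * back so the pending inquiry still forces a fresh write instead
             * of being ignored until the clock catches up.
             */
            dbentry->stats_timestamp = cur_ts - 1;
        }
    }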

> >The other thing needed here is to look through and update comments about
> >last_statrequests.  For example, this loop is dead code due to us writing
> >files as soon as we receive one inquiry:
> >
> >    /*
> >     * Find the last write request for this DB.  If it's older than the
> >     * request's cutoff time, update it; otherwise there's nothing to do.
> >     *
> >     * Note that if a request is found, we return early and skip the below
> >     * check for clock skew.  This is okay, since the only way for a DB
> >     * request to be present in the list is that we have been here since the
> >     * last write round.
> >     */
> >    slist_foreach(iter, &last_statrequests) ...
> >
> >I'm okay keeping the dead code for future flexibility, but the comment should
> >reflect that.
> 
> Yes, that's another thing that I'd like to look into. Essentially the
> problem is that we always process the inquiries one by one, so we never
> actually see a list with more than a single element. Correct?

Correct.
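
To spell that out (an assumed reconstruction of the collector's flow, not a
verbatim pgstat.c excerpt): each inquiry is written out before the next
message is read, so last_statrequests never accumulates and the slist_foreach
lookup above never finds a pre-existing entry.

    /* assumed per-message flow; names follow the existing collector code */
    len = recv(pgStatSock, (char *) &msg, sizeof(msg), 0);

    if (msg.msg_hdr.m_type == PGSTAT_MTYPE_INQUIRY)
    {
        pgstat_recv_inquiry(&msg.msg_inquiry, len); /* appends to last_statrequests */
        pgstat_write_statsfiles(false, false);      /* assumed to write and reset
                                                     * the pending-request list */
    }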

> I think the best way to address that is to first check how much data is
> in the UDP queue, and then fetch all of it before actually doing the
> writes. Peeking at the number of requests first (or even some reasonable
> hard-coded limit) should prevent starving the inquirers in case of a
> steady stream of inquiries.

Now that you mention it, a hard-coded limit sounds good: write the files for
pending inquiries whenever the socket empties or every N messages processed,
whichever comes first.  I don't think the amount of pending UDP data is
portably available, and I doubt it's important.  Breaking every, say, one
thousand messages will make the collector predictably responsive to inquiries,
and that is the important property.
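
Something along these lines, as a rough sketch (the limit and the helper
functions here are placeholders for illustration, not existing pgstat.c
symbols):

    #define PGSTAT_MAX_MSGS_BETWEEN_WRITES 1000     /* hypothetical limit */

    int         msgs_since_write = 0;

    for (;;)
    {
        if (!socket_has_pending_data())             /* placeholder check */
        {
            /* socket emptied: answer whatever inquiries are queued */
            write_pending_statsfiles();             /* placeholder flush */
            msgs_since_write = 0;
            wait_for_next_message();                /* placeholder wait */
            continue;
        }

        process_one_message();                      /* placeholder dispatch */

        if (++msgs_since_write >= PGSTAT_MAX_MSGS_BETWEEN_WRITES)
        {
            /* stay responsive even under a steady stream of messages */
            write_pending_statsfiles();
            msgs_since_write = 0;
        }
    }

The exact value of N is not critical; the point is that inquiries get
answered at a bounded interval regardless of incoming traffic.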

I would lean toward making this part 9.7-only; it would be a distinct patch
from the one previously under discussion.

Thanks,
nm


