Thread: Tracking replication slot "blockings"

Tracking replication slot "blockings"

From: Magnus Hagander
Date:
I'm thinking it could be interesting to know how many times (or in some other useful unit than "times" - how often) a specific replication slot has "blocked" xlog rotation. Since this AFAIK only happens during checkpoints, it seems it should be "reasonably cheap" to track? It would serve as an indicator of which slave(s) are having enough trouble keeping up to potentially cause issues.

Not having looked at that code at all yet, would this be something that's simple to add?

Or is it a silly idea? :)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: Tracking replication slot "blockings"

From: Andres Freund
Date:
Hi,

On 2014-04-16 18:51:41 +0200, Magnus Hagander wrote:
> I'm thinking it could be interesting to know how many times (or in some
> other useful unit than "times" - how often) a specific replication slot has
> "blocked" xlog rotation. Since this AFAIK only happens during checkpoints,
> it seems it should be "reasonably cheap" to track? It would serve as an
> indicator of which slave(s) are having enough trouble keeping up to
> potentially cause issues.

The xlog removal code just checks the "global minimum" required LSN - it
doesn't check the individual slots. So you'd need to add a bit more code
to that location. But it'd be easy.

But I think I'd just monitor/graph the byte difference for all slots
using pg_replication_slots...
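
A minimal sketch of the byte-difference calculation, assuming the LSNs have already been fetched as text (the restart_lsn column of pg_replication_slots and the server's current WAL position). The function names and the sample values are hypothetical and for illustration only. An LSN is printed as two hexadecimal numbers separated by a slash, the high and the low 32 bits of a 64-bit byte position:

```python
from typing import Dict, Optional


def lsn_to_bytes(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000000' into a byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def slot_lag_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between a slot's restart_lsn and the current position."""
    return lsn_to_bytes(current_lsn) - lsn_to_bytes(restart_lsn)


def lag_by_slot(current_lsn: str, slots: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Map slot name to lag in bytes, skipping slots without a restart_lsn."""
    return {
        name: slot_lag_bytes(current_lsn, lsn)
        for name, lsn in slots.items()
        if lsn is not None
    }


if __name__ == "__main__":
    # Made-up sample values, not taken from a real server.
    sample = {"standby_a": "0/1000000", "standby_b": None}
    print(lag_by_slot("0/3000000", sample))
```

Graphing these numbers over time shows which slot is falling behind, but only from the moment monitoring starts.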

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Tracking replication slot "blockings"

From: Magnus Hagander
Date:
On Wed, Apr 16, 2014 at 6:56 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
>
> On 2014-04-16 18:51:41 +0200, Magnus Hagander wrote:
> > I'm thinking it could be interesting to know how many times (or in some
> > other useful unit than "times" - how often) a specific replication slot has
> > "blocked" xlog rotation. Since this AFAIK only happens during checkpoints,
> > it seems it should be "reasonably cheap" to track? It would serve as an
> > indicator of which slave(s) are having enough trouble keeping up to
> > potentially cause issues.
>
> The xlog removal code just checks the "global minimum" required LSN - it
> doesn't check the individual slots. So you'd need to add a bit more code
> to that location. But it'd be easy.

Do we have statistics there somewhere - how often that global minimum blocks something? That on its own might be a start :)
 

> But I think I'd just monitor/graph the byte difference for all slots
> using pg_replication_slots...

Yeah, that would work when monitored continuously. I was more looking for the view of "hey, could this be what happened" into a system that did not previously have any monitoring installed and therefore no such history.


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: Tracking replication slot "blockings"

From: Andres Freund
Date:
On 2014-04-16 19:09:09 +0200, Magnus Hagander wrote:
> On Wed, Apr 16, 2014 at 6:56 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > The xlog removal code just checks the "global minimum" required LSN - it
> > doesn't check the individual slots. So you'd need to add a bit more code
> > to that location. But it'd be easy.
> >
> 
> Do we have statistics there somewhere - how often that global minimum
> blocks something? That on its own might be a start :)

Nope. Check xlog.c:KeepLogSeg(), it's pretty simple stuff ;). It's the
same place where wal_keep_segments is enforced...
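
For readers who have not opened that function, here is a rough Python model of the decision it makes, with the counter discussed in this thread bolted on. It is a simplified sketch and not the PostgreSQL source: keep_log_seg, count_slot_blocks, and the 16 MB segment size are assumptions for illustration, and LSNs are plain byte positions.

```python
from typing import Iterable, Optional, Tuple

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default segment size


def keep_log_seg(current_lsn: int,
                 wal_keep_segments: int,
                 slot_min_lsn: Optional[int]) -> Tuple[int, bool]:
    """Return (oldest segment to keep, whether the slot minimum held it back)."""
    segno = current_lsn // WAL_SEGMENT_SIZE

    # wal_keep_segments: always retain this many segments behind the current one.
    if wal_keep_segments > 0:
        segno = max(segno - wal_keep_segments, 1)

    # Replication slots: only the global minimum LSN over all slots is consulted,
    # so this model cannot tell which individual slot is responsible.
    held_by_slot = False
    if slot_min_lsn is not None:
        slot_segno = max(slot_min_lsn // WAL_SEGMENT_SIZE, 1)
        if slot_segno < segno:
            segno = slot_segno
            held_by_slot = True

    return segno, held_by_slot


def count_slot_blocks(checkpoints: Iterable[Tuple[int, int, Optional[int]]]) -> int:
    """Hypothetical statistic: number of checkpoints where slots limited removal."""
    return sum(1 for args in checkpoints if keep_log_seg(*args)[1])
```

A per-slot version would have to compare each slot's own LSN against the result instead of the global minimum, which is the "bit more code" mentioned earlier in the thread.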

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services