Re: [HACKERS] Restricting maximum keep segments by repslots

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: [HACKERS] Restricting maximum keep segments by repslots
Дата
Msg-id 20170907.215956.110216588.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: [HACKERS] Restricting maximum keep segments by repslots  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Ответы Re: [HACKERS] Restricting maximum keep segments by repslots  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Список pgsql-hackers
Hello,

At Thu, 07 Sep 2017 14:12:12 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20170907.141212.227032666.horiguchi.kyotaro@lab.ntt.co.jp>
> > I would like a flag in pg_replication_slots, and possibly also a
> > numerical column that indicates how far away from the critical point
> > each slot is.  That would be great for a monitoring system.
> 
> Great! I'll do that right now.

Done.

In the attached patch on top of the previous patch, I added two
columns in pg_replication_slots, "live" and "distance". The first
indicates the slot will "live" after the next checkpoint. The
second shows the how many bytes checkpoint lsn can advance before
the slot will "die", or how many bytes the slot have lost after
"death".


Setting wal_keep_segments = 1 and max_slot_wal_keep_size = 16MB.

=# select slot_name, restart_lsn, pg_current_wal_lsn(), live, distance from pg_replication_slots;

slot_name | restart_lsn | pg_current_wal_lsn | live | distance  
-----------+-------------+--------------------+------+-----------s1        | 0/162D388   | 0/162D3C0          | t    |
0/29D2CE8

This shows that checkpoint can advance 0x29d2ce8 bytes before the
slot will die even if the connection stalls.
s1        | 0/4001180   | 0/6FFF2B8          | t    | 0/DB8

Just before the slot loses sync.
s1        | 0/4001180   | 0/70008A8          | f    | 0/FFEE80

The checkpoint after this removes some required segments.

2017-09-07 19:04:07.677 JST [13720] WARNING:  restart LSN of replication slots is ignored by checkpoint
2017-09-07 19:04:07.677 JST [13720] DETAIL:  Some replication slots have lost required WAL segnents to continue by up
to1 segments.
 

If max_slot_wal_keep_size if not set (0), live is always true and
distance is NULL.

slot_name | restart_lsn | pg_current_wal_lsn | live | distance  
-----------+-------------+--------------------+------+-----------s1        | 0/4001180   | 0/73117A8          | t    |




- The name (or its content) of the new columns should be arguable.

- pg_replication_slots view takes LWLock on ControlFile and spinlock on XLogCtl for every slot. But seems difficult to
reduceit..
 

- distance seems mitakenly becomes 0/0 for certain condition..

- The result seems almost right but more precise check needed. (Anyway it cannot be perfectly exact.);

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ashutosh Bapat
Дата:
Сообщение: Re: [HACKERS] Adding support for Default partition in partitioning
Следующее
От: Alexey Chernyshov
Дата:
Сообщение: Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans