Re: more descriptive message for process termination due to max_slot_wal_keep_size
От | Drouvot, Bertrand |
---|---|
Тема | Re: more descriptive message for process termination due to max_slot_wal_keep_size |
Дата | |
Msg-id | 03ee95c3-6496-90b7-5c8b-13fe69ee73dd@amazon.com обсуждение исходный текст |
Ответ на | Re: more descriptive message for process termination due to max_slot_wal_keep_size (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Ответы |
Re: more descriptive message for process termination due to max_slot_wal_keep_size
|
Список | pgsql-hackers |
Hi,
At Mon, 5 Sep 2022 11:56:33 +0200, "Drouvot, Bertrand" <bdrouvot@amazon.com> wrote inHi, On 3/2/22 7:37 AM, Kyotaro Horiguchi wrote:At Tue, 04 Jan 2022 10:29:31 +0900 (JST), Kyotaro Horiguchi<horikyota.ntt@gmail.com> wrote inSo what do you say if I propose the following? LOG: terminating process %d to release replication slot \"%s\" because its restart_lsn %X/%X exceeds the limit %X/%X HINT: You might need to increase max_slot_wal_keep_size.This version emits the following message. [35785:checkpointer] LOG: terminating process 36368 to release replication slot "s1" because its restart_lsn 0/1F000148 exceeds the limit 0/21000000 [35785:checkpointer] HINT: You might need to increase max_slot_wal_keep_size.As the hint is to increase max_slot_wal_keep_size, what about reporting the difference in size (rather than the limit lsn)? Something along those lines? [35785:checkpointer] LOG: terminating process 36368 to release replication slot "s1" because its restart_lsn 0/1F000148 exceeds the limit by <NNN MB>.Thanks! That might be more sensible exactly for the reason you mentioned. One issue doing that is size_pretty is dbsize.c local function. Since the size is less than kB in many cases, we cannot use fixed unit for that.
Thanks for the new patch version!. I did not realized (sorry about that) that we'd need to expose byte_size_pretty(). Now I wonder if we should not simply report the number of bytes (like I can see it is done in many places). So something like:
@@ -1298,9 +1298,9 @@ InvalidatePossiblyObsoleteSlot(ReplicationSlot *s, XLogRecPtr oldestLSN,
byte_size_pretty(buf, sizeof(buf),
oldestLSN - restart_lsn);
ereport(LOG,
- (errmsg("terminating process %d to release replication slot \"%s\" because its restart_lsn %X/%X exceeds the limit by %s",
+ (errmsg("terminating process %d to release replication slot \"%s\" because its restart_lsn %X/%X exceeds the limit by %lu bytes",
active_pid, NameStr(slotname),
- LSN_FORMAT_ARGS(restart_lsn), buf),
+ LSN_FORMAT_ARGS(restart_lsn), oldestLSN - restart_lsn),
errhint("You might need to increase max_slot_wal_keep_size.")));
and then forget about exposing/using byte_size_pretty() (that would be more consistent with the same kind of reporting in the existing code).
What do you think?
Regards,
-- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
В списке pgsql-hackers по дате отправления: