Review for GetWALAvailability()
От | Fujii Masao |
---|---|
Тема | Review for GetWALAvailability() |
Дата | |
Msg-id | 8d55969f-ba37-b89a-6494-e9322ccdb35d@oss.nttdata.com обсуждение исходный текст |
Ответы |
Re: Review for GetWALAvailability()
|
Список | pgsql-hackers |
Hi, The document explains that "lost" value that pg_replication_slots.wal_status reports means some WAL files are definitely lost and this slot cannot be used to resume replication anymore. However, I observed "lost" value while inserting lots of records, but replication could continue normally. So I wonder if pg_replication_slots.wal_status may have a bug. wal_status is calculated in GetWALAvailability(), and probably I found some issues in it. keepSegs = ConvertToXSegs(Max(max_wal_size_mb, wal_keep_segments), wal_segment_size) + 1; max_wal_size_mb is the number of megabytes. wal_keep_segments is the number of WAL segment files. So it's strange to calculate max of them. The above should be the following? Max(ConvertToXSegs(max_wal_size_mb, wal_segment_size), wal_keep_segments) + 1 if ((max_slot_wal_keep_size_mb <= 0 || max_slot_wal_keep_size_mb >= max_wal_size_mb) && oldestSegMaxWalSize <= targetSeg) return WALAVAIL_NORMAL; This code means that wal_status reports "normal" only when max_slot_wal_keep_size is negative or larger than max_wal_size. Why is this condition necessary? The document explains "normal means that the claimed files are within max_wal_size". So whatever max_slot_wal_keep_size value is, IMO that "normal" should be reported if the WAL files claimed by the slot are within max_wal_size. Thought? Or, if that condition is really necessary, the document should be updated so that the note about the condition is added. If the WAL files claimed by the slot exceeds max_slot_wal_keep_size but any those WAL files have not been removed yet, wal_status seems to report "lost". Is this expected behavior? Per the meaning of "lost" described in the document, "lost" should be reported only when any claimed files are removed, I think. Thought? Or this behavior is expected and the document is incorrect? BTW, if we want to implement GetWALAvailability() as the document advertises, we can simplify it like the attached POC patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Вложения
В списке pgsql-hackers по дате отправления: