Обсуждение: A Question about FPW and WAL

Поиск
Список
Период
Сортировка

A Question about FPW and WAL

От
Dawn Yu
Дата:
Dear community:
Recently,we come up with a same problem,when the database make the checkpoint,the size of wal directory increase very fast and  take one hundred percent disk usage,we found the problem of pg wal write amplification is very serious. By analyzing the wal log, in some scenarios, most of the space is wasted when maintaining the full page of the index. Has the community considered modifying this part of the relevant code to make the full page better,The setting of FPW function is divided into two parts, the full-page write of the data record and the full-page write of the index block, and the switch functions can be set separately. By default, index full-page write is disabled. If the index is broken, you can rebuild the index to solve it. Although the security of index data cannot be guaranteed, this setting can greatly reduce the problem of storage space explosion caused by wal write amplification.
The following is some example,we  can find that the FPW of Btree take over 83.84 percent size of the wal log.
Type                                           N      (%)          Record size      (%)             FPI size      (%)        Combined size      (%)
----                                           -      ---          -----------      ---             --------      ---        -------------      ---
XLOG                                           0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Transaction                                 3064 (  9.87)               104176 (  2.67)                    0 (  0.00)                104176 (  0.08)
Storage                                        0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
CLOG                                           0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Database                                       0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Tablespace                                     0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
MultiXact                                      0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
RelMap                                         0 (  0.00)                    0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Standby                                        1 (0.00)                      50 (0.00)                    0 (  0.00)                   50 (0.00)
Heap2                                          0 (  0.00)                     0 (  0.00)                    0 (  0.00)                    0 (  0.00)
Heap                                        3062 (  9.86)                  526664 (  13.49)          0 (  0.00)               526664 (  0.40)
Btree                                      24924 (  80.27)                3273484 (  83.84)      127027284 (100.00)   


Postgres Version: pg14.4
OS: SUSE15SP3
Thank you,
Best
DawnYu.
Вложения

Re: A Question about FPW and WAL

От
Laurenz Albe
Дата:
On Mon, 2023-04-03 at 14:51 +0800, Dawn Yu wrote:
> Recently,we come up with a same problem,when the database make the checkpoint,
> the size of wal directory increase very fast and  take one hundred percent disk
> usage,we found the problem of pg wal write amplification is very serious.
> By analyzing the wal log, in some scenarios, most of the space is wasted when
> maintaining the full page of the index. Has the community considered modifying
> this part of the relevant code to make the full page better,The setting of FPW
> function is divided into two parts, the full-page write of the data record and
> the full-page write of the index block, and the switch functions can be set
> separately.
> By default, index full-page write is disabled. If the index is broken, you can
> rebuild the index to solve it. Although the security of index data cannot be
> guaranteed, this setting can greatly reduce the problem of storage space explosion
> caused by wal write amplification.

I would feel funny about a parameter like that.

There are a couple of things you can do today:

- increase "max_wal_size" and "checkpoint_timeout" to get checkpoints less often

- activate "wal_compression", so that the FPW are compressed.  Best is to use v15,
  where you can use "lz4" or "zstd" compression, which are much faster.

Yours,
Laurenz Albe



Re: A Question about FPW and WAL

От
Rui DeSousa
Дата:


On Apr 3, 2023, at 5:38 AM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

By default, index full-page write is disabled. If the index is broken, you can
rebuild the index to solve it. Although the security of index data cannot be
guaranteed, this setting can greatly reduce the problem of storage space explosion
caused by wal write amplification.

I would feel funny about a parameter like that.

I agree. To allow for indexes to become corrupt by design would be a bad idea. Rebuilding indexes is a not a solution as some index rebuilds takes hours on very large tables. Dealing with identifying corrupted indexes and the issues caused by them would be a nightmare to deal with and not very ACID compliant.

Re: A Question about FPW and WAL

От
Dawn Yu
Дата:
Ok, thanks. 

On Mon, 3 Apr 2023 at 22:30, Rui DeSousa <rui@crazybean.net> wrote:


On Apr 3, 2023, at 5:38 AM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

By default, index full-page write is disabled. If the index is broken, you can
rebuild the index to solve it. Although the security of index data cannot be
guaranteed, this setting can greatly reduce the problem of storage space explosion
caused by wal write amplification.

I would feel funny about a parameter like that.

I agree. To allow for indexes to become corrupt by design would be a bad idea. Rebuilding indexes is a not a solution as some index rebuilds takes hours on very large tables. Dealing with identifying corrupted indexes and the issues caused by them would be a nightmare to deal with and not very ACID compliant.