Re: Pre-allocating WAL files
From | Bossart, Nathan |
---|---|
Subject | Re: Pre-allocating WAL files |
Date | |
Msg-id | 265B06BA-7B16-4C1A-BE1A-1451D22A1F83@amazon.com |
In response to | Pre-allocating WAL files (Andres Freund <andres@anarazel.de>) |
Responses | Re: Pre-allocating WAL files |
List | pgsql-hackers |
On 12/25/20, 12:09 PM, "Andres Freund" <andres@anarazel.de> wrote:
> When running write heavy transactional workloads I've many times
> observed that one needs to run the benchmarks for quite a while till
> they get to their steady state performance. The most significant reason
> for that is that initially WAL files will not get recycled, but need to
> be freshly initialized. That's 16MB of writes that need to synchronously
> finish before a small write transaction can even start to be written
> out...
>
> I think there's two useful things we could do:
>
> 1) Add pg_wal_preallocate(uint64 bytes) that ensures (bytes +
>    segment_size - 1) / segment_size WAL segments exist from the current
>    point in the WAL. Perhaps with the number of bytes defaulting to
>    min_wal_size if not explicitly specified?
>
> 2) Have checkpointer (we want walwriter to run with low latency to flush
>    out async commits etc) occasionally check if WAL files need to be
>    pre-allocated.
>
>    Checkpointer already tracks the amount of WAL that's expected to be
>    generated till the end of the checkpoint, so it seems like it's a
>    pretty good candidate to do so.
>
>    To keep checkpointer pre-allocating when idle we could signal it
>    whenever a record has crossed a segment boundary.
>
>
> With a plain pgbench run I see a 2.5x reduction in throughput in the
> periods where we initialize WAL files.

I've been exploring this independently a bit and noticed this message. Attached is a proof-of-concept patch for a separate "WAL allocator" process that maintains a pool of WAL-segment-sized files that can be claimed whenever a new segment file is needed.

An early version of this patch attempted to spread the I/O like non-immediate checkpoints do, but I couldn't point to any real benefit from doing so, and it complicated things quite a bit.

I like the idea of trying to bake this into an existing process such as the checkpointer. I'll admit that creating a new process just for WAL pre-allocation feels a bit heavy-handed, but it was a nice way to keep this stuff modularized. I can look into moving this functionality into the checkpointer process if this is something that folks are interested in.

Nathan
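
For reference, the segment count in proposal (1) of the quoted message, (bytes + segment_size - 1) / segment_size, is an integer ceiling division. Below is a minimal C sketch of that arithmetic only. The names `segments_to_preallocate` and `DEFAULT_WAL_SEGMENT_SIZE` are hypothetical and are not taken from the attached patch or from PostgreSQL itself. The 16MB value matches the segment size mentioned in the quoted message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed segment size for this sketch: 16MB, as in the quoted message. */
#define DEFAULT_WAL_SEGMENT_SIZE (UINT64_C(16) * 1024 * 1024)

/*
 * Hypothetical helper: number of whole WAL segments needed to cover
 * `bytes` of WAL, i.e. ceil(bytes / segment_size) in integer arithmetic.
 *
 * Note: bytes + segment_size - 1 can wrap around for values of `bytes`
 * close to UINT64_MAX; a real implementation would need to guard that.
 */
uint64_t
segments_to_preallocate(uint64_t bytes, uint64_t segment_size)
{
    assert(segment_size > 0);
    return (bytes + segment_size - 1) / segment_size;
}
```

With a request of 80MB and 16MB segments this gives 5 segments, and a single extra byte rounds up to 6, which is the point of adding segment_size - 1 before dividing.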