Re: Confine vacuum skip logic to lazy_scan_skip
From        | Melanie Plageman
Subject     | Re: Confine vacuum skip logic to lazy_scan_skip
Date        |
Msg-id      | CAAKRu_byDppRvNJ+p5kq-SmXZDQuL6L8N6D06pHbO6WnBnanWQ@mail.gmail.com
In reply to | Re: Confine vacuum skip logic to lazy_scan_skip (Thomas Munro <thomas.munro@gmail.com>)
Responses   | Re: Confine vacuum skip logic to lazy_scan_skip
List        | pgsql-hackers
On Sun, Mar 10, 2024 at 11:01 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Mon, Mar 11, 2024 at 5:31 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > I have investigated the interaction between
> > maintenance_io_concurrency, streaming reads, and the vacuum buffer
> > access strategy (BAS_VACUUM).
> >
> > The streaming read API limits max_pinned_buffers to a pinned buffer
> > multiplier (currently 4) * maintenance_io_concurrency buffers with the
> > goal of constructing reads of at least MAX_BUFFERS_PER_TRANSFER size.
> >
> > Since the BAS_VACUUM ring buffer is size 256 kB or 32 buffers with
> > default block size, that means that for a fully uncached vacuum in
> > which all blocks must be vacuumed and will be dirtied, you'd have to
> > set maintenance_io_concurrency at 8 or lower to see the same number of
> > reuses (and shared buffer consumption) as master.
> >
> > Given that we allow users to specify BUFFER_USAGE_LIMIT to vacuum, it
> > seems like we should force max_pinned_buffers to a value that
> > guarantees the expected shared buffer usage by vacuum. But that means
> > that maintenance_io_concurrency does not have a predictable impact on
> > streaming read vacuum.
> >
> > What is the right thing to do here?
> >
> > At the least, the default size of the BAS_VACUUM ring buffer should be
> > BLCKSZ * pinned_buffer_multiplier * default maintenance_io_concurrency
> > (probably rounded up to the next power of two) bytes.
>
> Hmm, does the v6 look-ahead distance control algorithm mitigate that
> problem? Using the ABC classifications from the streaming read
> thread, I think for A it should now pin only 1, for B 16 and for C, it
> depends on the size of the random 'chunks': if you have a lot of size
> 1 random reads then it shouldn't go above 10 because of (default)
> maintenance_io_concurrency. The only way to get up to very high
> numbers would be to have a lot of random chunks triggering behaviour
> C, but each made up of long runs of misses. For example one can
> contrive a BHS query that happens to read pages 0-15 then 20-35 then
> 40-55 etc etc so that we want to get lots of wide I/Os running
> concurrently. Unless vacuum manages to do something like that, it
> shouldn't be able to exceed 32 buffers very easily.
>
> I suspect that if we taught streaming_read.c to ask the
> BufferAccessStrategy (if one is passed in) what its recommended pin
> limit is (strategy->nbuffers?), we could just clamp
> max_pinned_buffers, and it would be hard to find a workload where that
> makes a difference, and we could think about more complicated logic
> later.
>
> In other words, I think/hope your complaints about excessive pinning
> from v5 WRT all-cached heap scans might have also already improved
> this case by happy coincidence? I haven't tried it out though, I just
> read your description of the problem...

I've rebased the attached v10 over top of the changes to
lazy_scan_heap() Heikki just committed and over the v6 streaming read
patch set. I started testing them and see that you are right, we no
longer pin too many buffers. However, the uncached example below is
now slower with streaming read than on master -- it looks to be
because it is doing twice as many WAL writes and syncs. I'm still
investigating why that is.
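As a reference point, the pin-limit arithmetic quoted above is easy to
sanity-check in psql. This is only a rough sketch of the numbers, assuming
the default 8 kB BLCKSZ, the pinned buffer multiplier of 4 from the
streaming read patch, and the default 256 kB (32-buffer) BAS_VACUUM ring:
4 * maintenance_io_concurrency meets the ring size exactly at
maintenance_io_concurrency = 8 and exceeds it above that, which is where
clamping max_pinned_buffers to the strategy's size would take effect.

psql -c "SELECT mic AS maintenance_io_concurrency,
                4 * mic AS unclamped_pin_limit,
                least(4 * mic, (256 * 1024) / 8192) AS clamped_to_ring
         FROM generate_series(1, 16) AS mic;"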
psql \
  -c "create table small (a int) with (autovacuum_enabled=false, fillfactor=25);" \
  -c "insert into small select generate_series(1,200000) % 3;" \
  -c "update small set a = 6 where a = 1;"

pg_ctl stop
# drop caches
pg_ctl start

psql -c "\timing on" -c "vacuum (verbose) small"

- Melanie
Attachments