Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
От | Melanie Plageman |
---|---|
Тема | Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode |
Дата | |
Msg-id | CAAKRu_ZLRuzkM3zKogiZAz2hUony37yLY4aaLb8fPf8fgqs5Og@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode (Justin Pryzby <pryzby@telsasoft.com>) |
Ответы |
Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
|
Список | pgsql-hackers |
On Sat, Apr 1, 2023 at 1:57 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Sat, Apr 01, 2023 at 01:29:13PM -0400, Melanie Plageman wrote: > > Hi, > > > > I was just doing some cleanup on the main patch in this set and realized > > that it was missing a few things. One of which is forbidding the > > BUFFER_USAGE_LIMIT with VACUUM FULL since VACUUM FULL does not use a > > BAS_VACUUM strategy. > > > > VACUUM FULL technically uses a bulkread buffer access strategy for > > reading the original relation if its number of blocks is > number of > > shared buffers / 4 (see initscan()). The new rel writing is done using > > smgrextend/write directly and doesn't go through shared buffers. I > > think it is a stretch to try and use the size passed in to VACUUM by > > BUFFER_USAGE_LIMIT for the bulkread strategy ring. > > When you say that it's a stretch, do you mean that it'd be a pain to add > arguments to handful of functions to pass down the setting ? Or that > it's unclear if doing so would be the desirable/needed/intended/expected > behavior ? More that I don't think it makes sense. VACUUM FULL only uses a buffer access strategy (BAS_BULKREAD) for reading the original relation in and not for writing the new one. It has different concerns because its behavior is totally different from regular vacuum. It is not modifying the original buffers (AFAIK) and the amount of WAL it is generating is different. Also, no matter what, the new relation won't be in shared buffers because of VACUUM FULL using the smgr functions directly. So, I think that allowing the two options together is confusing for the user because it seems to imply we can give them some benefit that we cannot. > I wonder if maybe strategy should be configurable in some more generic > way, like a GUC. At one point I had a patch to allow INSERT to use > strategy buffers (not just INSERT SELECT). And that's still pretty > desirable. Also COPY. I've seen load spikes caused by pg_dumping > tables which are just below 25% of shared_buffers. Which is exacerbated > because pg_dump deliberately orders tables by size, so those tables are > dumped one after another, each causing eviction of ~20% of shared > buffers. And exacerbated some more because TOAST don't seem to use a > ring buffer in that case. Yes, it is probably worth exploring how configurable or dynamic Buffer Access Strategies should be for other users (e.g. not just VACUUM). However, since the ring sizes wouldn't be the same for all the different operations, it is probably easier to start with a single kind of operation and go from there. > > I somehow feel like VACUUM (FULL, BUFFER_USAGE_LIMIT 'x') should error > > out instead of silently not using the buffer usage limit, though. > > > > I am looking for others' opinions. > > Sorry, no opinion here :) > > One thing is that it's fine to take something that previously throw an > error and change it to not throw an error anymore. But it's undesirable > to do the opposite. For that reason, there's may be a tendency to add > errors for cases like this. So, I have made it error out when you specify BUFFER_USAGE_LIMIT with VACUUM FULL or VACUUM ONLY_DATABASE_STATS. However, if you specify buffer_usage_limit -1 with either of these options, it will not error out. I don't love this, but I noticed that VACUUM (FULL, PARALLEL 0) does not error out, while VACUUM (FULL, PARALLEL X) where X > 0 does. If I want to error out when BUFFER_USAGE_LIMIT specified at all but still do so at the bottom of ExecVacuum() with the rest of the vacuum option sanity checking, I will probably need to add a flag bit for VacuumParams->options. I was wondering why some "sanity checking" of vacuum options is done in ExecVacuum() and some in vacuum() (it isn't just split by what is applicable to autovacuum and what isn't). I noticed that even in cases where we don't use the strategy object we still made it, which I thought seemed like a bit of a waste and easy to fix. I've added a commit which does not make the BufferAccessStrategy object when VACUUM FULL or VACUUM ONLY_DATABASE_STATS are specified. I noticed that we also don't use the strategy for VACUUM (PROCESS_MAIN false, PROCESS_TOAST false), but it didn't seem worth handling this very specific case, so I didn't. v8 attached has the prohibitions specified above (including for vacuumdb, as relevant) as well as some cleanup, added test cases, and updated documentation. 0001 is essentially unmodified (i.e. I didn't do anything with the other global variable David mentioned). I still have a few open questions: - what the initial value of ring_size for autovacuum should be (see the one remaining TODO in the code) - should ANALYZE allow specifying BUFFER_USAGE_LIMIT since it uses the guc value when that is set? - should INDEX_CLEANUP off cause VACUUM to use shared buffers and disable use of a strategy (like failsafe vacuum) - should we add anything to VACUUM VERBOSE output about the number of reuses of strategy buffers? - Should we make BufferAccessStrategyData non-opaque so that we don't have to add a getter for nbuffers. I could have implemented this in another way, but I don't really see why BufferAccessStrategyData should be opaque - Melanie
Вложения
- v8-0001-remove-global-variable-vac_strategy.patch
- v8-0003-use-shared-buffers-when-failsafe-active.patch
- v8-0005-Add-VACUUM-BUFFER_USAGE_LIMIT-option-and-GUC.patch
- v8-0002-Don-t-make-vacuum-strategy-ring-when-unused.patch
- v8-0004-Rename-Buffer-Access-Strategy-ring_size-nbuffers.patch
- v8-0006-Add-buffer-usage-limit-option-to-vacuumdb.patch
В списке pgsql-hackers по дате отправления: