Re: VACUUM memory management
От | Justin Pryzby |
---|---|
Тема | Re: VACUUM memory management |
Дата | |
Msg-id | 20200405202625.GC2228@telsasoft.com обсуждение исходный текст |
Ответ на | Re: VACUUM memory management (Ibrar Ahmed <ibrar.ahmad@gmail.com>) |
Список | pgsql-hackers |
On Wed, Dec 11, 2019 at 09:29:17PM +0500, Ibrar Ahmed wrote: > > Did you modify Claudio's patch or write a totally new one? > > I wrote completely new patch. I tried multiple techniques like using a list > instead of fixed size array which I thought was most suitable here, but > leave that because of conflict with Parallel Vacuum. Using a list will hardly work, or certainly not well, since it needs to be searched by the ambulkdelete callback. > >> If you wrote a totally new one, have you compared your work with > >> Claudio's, to see if he covered > >> anything you might need to cover? > > > > I checked the patch, and it does not do anything special which my patch is > not doing except one thing. The patch is claiming to increase the limit of > 1GB along with that, but I have not touched that. In my case, we are still > under the limit of maintaines_work_mem but allocate memory in chunks. In > that case, you have the leverage to set a big value of maintaness_work_mem > (even if you don't need that) because it will not allocate all the memory > at the start. After spending a bunch of time comparing them, I disagree. Claudio's patch does these: - avoid using multiple chunks if there's no indexes, therefore no need to avoid the high cost of index scans to avoid; - rather than doing an index scan for each chunk (bad), the callback function lazy_tid_reaped() does a custom binary search *over* chunks of different sizes and then *within* each chunk. That's maybe slighly over-engineered, I'm not convinced that's needed (but I thought it was pretty clever), but someone thought that was important. - properly keep track of *total* number of dead tuples, eg for progress reporting, and for prev_dead_count for pages with no dead tuples; - lazy_record_dead_tuple() doubles allocation when running out of space for dead tuples; some people disagree with that (myself included) but I'm including it here since that's what it does. This still seems nontrivial (to me) to adapt to work with parallel query. -- Justin
В списке pgsql-hackers по дате отправления: