Re: Vacuum: allow usage of more than 1GB of work mem
От | Heikki Linnakangas |
---|---|
Тема | Re: Vacuum: allow usage of more than 1GB of work mem |
Дата | |
Msg-id | f8d1dd03-c632-86fa-15cf-6f5633350b78@iki.fi обсуждение исходный текст |
Ответ на | Re: Vacuum: allow usage of more than 1GB of work mem (Claudio Freire <klaussfreire@gmail.com>) |
Ответы |
Re: Vacuum: allow usage of more than 1GB of work mem
|
Список | pgsql-hackers |
On 03/04/18 17:20, Claudio Freire wrote: > Ok, rebased patches attached Thanks! I took a look at this. First, now that the data structure is more complicated, I think it's time to abstract it, and move it out of vacuumlazy.c. The Tid Map needs to support the following operations: * Add TIDs, in order (in 1st phase of vacuum) * Random lookup, by TID (when scanning indexes) * Iterate through all TIDs, in order (2nd pass over heap) Let's add a new source file to hold the code for the tid map data structure, with functions corresponding those operations. I took a stab at doing that, and I think it makes vacuumlazy.c nicer. Secondly, I'm not a big fan of the chosen data structure. I think the only reason that the segmented "multi-array" was chosen is that each "segment" works is similar to the simple array that we used to have. After putting it behind the abstraction, it seems rather ad hoc. There are many standard textbook data structures that we could use instead, and would be easier to understand and reason about, I think. So, I came up with the attached patch. I used a B-tree as the data structure. Not sure it's the best one, I'm all ears for suggestions and bikeshedding on alternatives, but I'm pretty happy with that. I would expect it to be pretty close to the simple array with binary search in performance characteristics. It'd be pretty straightforward to optimize further, and e.g. use a bitmap of OffsetNumbers or some other more dense data structure in the B-tree leaf pages, but I resisted doing that as part of this patch. I haven't done any performance testing of this (and not much testing in general), but at least the abstraction seems better this way. Performance testing would be good, too. In particular, I'd like to know how this might affect the performance of lazy_tid_reaped(). That's a hot spot when vacuuming indexes, so we don't want to add any cycles there. Was there any ready-made test kits on that in this thread? I didn't see any at a quick glance, but it's a long thread.. - Heikki
Вложения
В списке pgsql-hackers по дате отправления: