Re: AW: AW: Plans for solving the VACUUM problem
From | Tom Lane |
---|---|
Subject | Re: AW: AW: Plans for solving the VACUUM problem |
Date | |
Msg-id | 15440.990196536@sss.pgh.pa.us |
In reply to | AW: AW: Plans for solving the VACUUM problem (Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>) |
List | pgsql-hackers |
Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:
> It was my understanding, that the heap xtid is part of the key now,

It is not. There was some discussion of doing that, but it fell down on the little problem that in normal index-search cases you *don't* know the heap tid you are looking for.

> And in above case, the keys (since identical except xtid) will stick close
> together, thus caching will be good.

Even without key-collision problems, deleting N tuples out of a total of M index entries will require search costs like this:

bulk delete in linear scan way:
    O(M) I/O costs (read all the pages)
    O(M log N) CPU costs (lookup each TID in sorted list)

successive index probe way:
    O(N log M) I/O costs for probing index
    O(N log M) CPU costs for probing index (key comparisons)

For N << M, the latter looks like a win, but you have to keep in mind that the constant factors hidden by the O() notation are a lot different in the two cases. In particular, if there are T index entries per page, the former I/O cost is really M/T * sequential read cost whereas the latter is N log M * random read cost, yielding a difference in constant factors of probably a thousand or two. You get some benefit in the latter case from caching the upper btree levels, but that's by definition not a large part of the index bulk.

So where's the breakeven point in reality? I don't know but I suspect that it's at pretty small N. Certainly far less than one percent of the table, whereas I would think that people would try to schedule VACUUMs at an interval where they'd be reclaiming several percent of the table.

So, as I said to Hiroshi, this alternative looks to me like a possible future refinement, not something we need to do in the first version.

regards, tom lane
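
To make the breakeven argument concrete, here is a rough back-of-the-envelope sketch in Python of the two I/O cost formulas from the message: M/T sequential page reads for the bulk scan versus N probes times a tree descent of random reads. Everything numeric in it is an assumption for illustration, not a measurement: the function names are hypothetical, the 10:1 random-to-sequential read cost ratio is a guess, the logarithm is taken as the btree height with fanout T, and `cached_levels` is a crude stand-in for the cached upper btree levels. CPU costs are ignored.

```python
import math


def btree_height(m, t):
    """Number of levels needed to index m entries with t entries per page.

    Computed with integers to avoid floating-point log rounding.
    """
    height, capacity = 1, t
    while capacity < m:
        capacity *= t
        height += 1
    return height


def bulk_delete_io_cost(m, t, seq_read_cost=1.0):
    """Linear scan: read every index page once, sequentially (M/T pages)."""
    return math.ceil(m / t) * seq_read_cost


def index_probe_io_cost(n, m, t, random_read_cost=10.0, cached_levels=0):
    """Successive probes: n descents, each costing one random read per
    uncached level. random_read_cost=10.0 is an assumed ratio, not a
    figure from the message."""
    pages_per_probe = max(1, btree_height(m, t) - cached_levels)
    return n * pages_per_probe * random_read_cost


def breakeven_n(m, t, seq_read_cost=1.0, random_read_cost=10.0,
                cached_levels=0):
    """Value of n at which both strategies have equal modeled I/O cost."""
    pages_per_probe = max(1, btree_height(m, t) - cached_levels)
    return bulk_delete_io_cost(m, t, seq_read_cost) / (
        pages_per_probe * random_read_cost)


if __name__ == "__main__":
    m, t = 1_000_000, 100  # hypothetical index size and entries per page
    n_star = breakeven_n(m, t, cached_levels=2)
    print(f"breakeven N = {n_star:.0f} ({100 * n_star / m:.2f}% of entries)")
```

Under these assumed numbers the modeled breakeven sits well below one percent of the index entries, which is in line with the suspicion stated in the message; different cost ratios or cache behavior would move it.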