Re: [RFC] Minmax indexes
From | Jim Nasby |
---|---|
Subject | Re: [RFC] Minmax indexes |
Date | |
Msg-id | 51CDC9B3.6000002@nasby.net |
In reply to | Re: [RFC] Minmax indexes (Claudio Freire <klaussfreire@gmail.com>) |
List | pgsql-hackers |
On 6/28/13 12:26 PM, Claudio Freire wrote:
> On Fri, Jun 28, 2013 at 2:18 PM, Jim Nasby <jim@nasby.net> wrote:
>> On 6/17/13 3:38 PM, Josh Berkus wrote:
>>>>>
>>>>> Why? Why can't we just update the affected pages in the index?
>>>>
>>>>>
>>>>> The page range has to be scanned in order to find out the min/max values
>>>>> for the indexed columns on the range; and then, with these data, update
>>>>> the index.
>>>
>>> Seems like you could incrementally update the range, at least for
>>> inserts. If you insert a row which doesn't decrease the min or increase
>>> the max, you can ignore it, and if it does increase/decrease, you can
>>> change the min/max. No?
>>>
>>> For updates, things are more complicated. If the row you're updating
>>> was the min/max, in theory you should update it to adjust that, but you
>>> can't verify that it was the ONLY min/max row without doing a full scan.
>>> My suggestion would be to add a "dirty" flag which would indicate that
>>> that block could use a rescan next VACUUM, and otherwise ignore changing
>>> the min/max. After all, the only defect to having min to low or max too
>>> high for a block would be scanning too many blocks. Which you'd do
>>> anyway with it marked "invalid".
>>
>>
>> If we add a dirty flag it would probably be wise to allow for more than one
>> value so we can do a clock-sweep. That would allow for detecting a range
>> that is getting dirtied repeatedly and not bother to try and re-summarize it
>> until later.
>>
>> Something else I don't think was mentioned... re-summarization should be
>> somehow tied to access activity: if a query will need to seqscan a segment
>> that needs to be summarized, we should take that opportunity to summarize at
>> the same time while pages are in cache. Maybe that can be done in the
>> backend itself; maybe we'd want a separate process.
>
>
> This smells a lot like hint bits and all the trouble they bring.

Possibly; though if that's the case just having a second process do the work would probably eliminate most of that problem.

--
Jim C. Nasby, Data Architect
jim@nasby.net
512.569.9461 (cell)
http://jim.nasby.net
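
For readers following the thread, here is a minimal sketch of the two ideas quoted above: widening a range's min/max in place on insert, and replacing the single "dirty" flag with a small counter that a clock-sweep pass decrements, so a range that keeps getting dirtied is re-summarized later rather than immediately. It is an illustration only, not code from any patch. The names RangeSummary, MAX_DIRTY and sweep, the counter limit of 3, and the choice to re-summarize only when the counter is about to reach zero are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_DIRTY = 3  # hypothetical saturation point for the dirty counter


@dataclass
class RangeSummary:
    """Min/max summary for one page range (illustrative, not PostgreSQL code)."""
    lo: Optional[int] = None  # summarized minimum; None means nothing summarized
    hi: Optional[int] = None  # summarized maximum
    dirty: int = 0            # 0 = clean; higher = dirtied more often recently

    def on_insert(self, value: int) -> None:
        # Inserts can only widen the range, so no rescan is needed.
        if self.lo is None or value < self.lo:
            self.lo = value
        if self.hi is None or value > self.hi:
            self.hi = value

    def on_update_or_delete(self) -> None:
        # The old row may have been the min or max. Leave lo/hi alone (they
        # stay valid but possibly too wide) and bump the saturating counter.
        self.dirty = min(self.dirty + 1, MAX_DIRTY)

    def may_contain(self, value: int) -> bool:
        # A too-wide summary only costs extra block scans, never wrong answers.
        return self.lo is not None and self.lo <= value <= self.hi


def sweep(summary: RangeSummary, live_values: List[int]) -> bool:
    """One clock-sweep visit; returns True if the range was re-summarized."""
    if summary.dirty == 0:
        return False              # clean, nothing to do
    if summary.dirty > 1:
        summary.dirty -= 1        # recently hot: postpone the rescan
        return False
    # Counter is at 1: the range has cooled down, so rescan it now.
    summary.lo = min(live_values) if live_values else None
    summary.hi = max(live_values) if live_values else None
    summary.dirty = 0
    return True
```

In this sketch a range dirtied twice survives one sweep visit untouched and is rescanned on the next, which is the "not bother to try and re-summarize it until later" behaviour. Whether the sweep runs in the backend or in a separate process is left open, as it is in the discussion.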