Re: [RFC] Minmax indexes

From	Jim Nasby
Subject	Re: [RFC] Minmax indexes
Date	June 28, 2013, 17:37:02
Msg-id	51CDC9B3.6000002@nasby.net
In reply to	Re: [RFC] Minmax indexes (Claudio Freire <klaussfreire@gmail.com>)
List	pgsql-hackers

On 6/28/13 12:26 PM, Claudio Freire wrote:
> On Fri, Jun 28, 2013 at 2:18 PM, Jim Nasby <jim@nasby.net> wrote:
>> On 6/17/13 3:38 PM, Josh Berkus wrote:
>>>>>
>>>>> Why?  Why can't we just update the affected pages in the index?
>>>>
>>>>>
>>>>> The page range has to be scanned in order to find out the min/max values
>>>>> for the indexed columns on the range; and then, with these data, update
>>>>> the index.
>>>
>>> Seems like you could incrementally update the range, at least for
>>> inserts.  If you insert a row which doesn't decrease the min or increase
>>> the max, you can ignore it, and if it does increase/decrease, you can
>>> change the min/max.  No?
>>>
>>> For updates, things are more complicated.  If the row you're updating
>>> was the min/max, in theory you should update it to adjust that, but you
>>> can't verify that it was the ONLY min/max row without doing a full scan.
>>> My suggestion would be to add a "dirty" flag which would indicate that
>>> that block could use a rescan next VACUUM, and otherwise ignore changing
>>> the min/max.  After all, the only defect to having min too low or max too
>>> high for a block would be scanning too many blocks.  Which you'd do
>>> anyway with it marked "invalid".
>>
>>
>> If we add a dirty flag it would probably be wise to allow for more than one
>> value so we can do a clock-sweep. That would allow for detecting a range
>> that is getting dirtied repeatedly and not bother to try and re-summarize it
>> until later.
>>
>> Something else I don't think was mentioned... re-summarization should be
>> somehow tied to access activity: if a query will need to seqscan a segment
>> that needs to be summarized, we should take that opportunity to summarize at
>> the same time while pages are in cache. Maybe that can be done in the
>> backend itself; maybe we'd want a separate process.
>
>
> This smells a lot like hint bits and all the trouble they bring.

Possibly; though if that's the case, just having a second process do the
work would probably eliminate most of that problem.
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net
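
The two ideas quoted above (widening min/max in place on insert, and a multi-valued dirty counter that a clock-sweep decrements before re-summarizing) can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from any patch: `RangeSummary`, `MAX_DIRTY`, `on_update_or_delete` and `sweep` are hypothetical names, values are plain integers, and a Python list stands in for the heap pages of one range.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

MAX_DIRTY = 3  # hypothetical cap on the dirty counter


@dataclass
class RangeSummary:
    """Min/max summary for one page range (hypothetical structure)."""
    lo: Optional[int] = None
    hi: Optional[int] = None
    dirty: int = 0  # 0 = summary is tight; >0 = summary may be too wide

    def on_insert(self, value: int) -> None:
        # An insert can only widen the range, so no rescan is needed.
        if self.lo is None or value < self.lo:
            self.lo = value
        if self.hi is None or value > self.hi:
            self.hi = value

    def on_update_or_delete(self, old_value: int) -> None:
        # Removing a boundary value may leave the summary too wide.
        # That costs extra block scans but never wrong answers, so we
        # only bump the counter and leave lo/hi untouched.
        if old_value == self.lo or old_value == self.hi:
            self.dirty = min(self.dirty + 1, MAX_DIRTY)

    def may_contain(self, value: int) -> bool:
        # False means the range can be skipped for an equality lookup.
        return self.lo is not None and self.lo <= value <= self.hi


def sweep(summary: RangeSummary, live_values: Sequence[int]) -> bool:
    """One clock-sweep visit; returns True if the range was re-summarized.

    Each visit decrements the counter. A range that keeps getting
    dirtied stays above zero and is deferred until it cools down.
    """
    if summary.dirty == 0:
        return False
    summary.dirty -= 1
    if summary.dirty > 0:
        return False  # still hot: postpone the rescan
    # Counter reached zero: rescan the range and tighten the summary.
    summary.lo = min(live_values, default=None)
    summary.hi = max(live_values, default=None)
    return True
```

In this sketch a range dirtied twice needs two sweep visits before it is rescanned, and until then `may_contain` simply answers conservatively from the stale, wider bounds.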
