Re: Bitmap index thoughts
| From | Heikki Linnakangas |
|---|---|
| Subject | Re: Bitmap index thoughts |
| Date | |
| Msg-id | 45924FA8.2010707@enterprisedb.com |
| In reply to | Re: Bitmap index thoughts (Gavin Sherry <swm@linuxworld.com.au>) |
| Responses | Re: Bitmap index thoughts |
| List | pgsql-hackers |
Gavin Sherry wrote:
> On Tue, 26 Dec 2006, Heikki Linnakangas wrote:
>> for typical bitmap index use cases and most of the needed pages should
>> stay in memory, but could we simplify this? Why do we need the auxiliary
>> heap, couldn't we just store the blk+offset of the LOV item directly in
>> the b-tree index item?
>
> The problem is, the b-tree code is very much tied to the heap. I don't
> want to modify the b-tree code to make bitmap indexes work (better).
> What's really tempting is to just manage our own balanced tree within the
> bitmap index file(s) itself. It would start from the metapage and simply
> spill to other 'special' index pages when necessary. The problem is, we do
> not have b-tree code generic enough that it would allow us to do this
> trivially -- consider concurrency and WAL in particular, which we
> currently get for free. I guess this is why I've been ignoring this issue
> :-).

Maybe we could reuse the code in ginbtree.c. Looks like Teodor & Oleg had
the same problem :).

Modifying the nbtree code doesn't seem that difficult either. AFAICS, the
only places where the heap is accessed from within the nbtree code are in
index building and uniqueness checks.

>> And instead of having separate LOV pages that store a number of LOV
>> items, how about storing each LOV item on a page of its own, and using
>> the rest of the page to store the last chunk of the bitmap. That would
>> eliminate one page access, but more importantly, maybe we could then get
>> rid of all the bm_last_* attributes in BMLOVItemData that complicate the
>> patch quite a bit, while preserving the performance.
>
> That's an interesting approach. We would still need a concept of
> last_word, at the very least, and probably last_comp_word for convenience.

Why?

> PS: Another version of the patch shall be forthcoming shortly. I've been
> working on compressing the data in memory during CREATE INDEX instead of
> just managing arrays of TIDs in memory as we did previously. The array of
> TIDs works great for well clustered data but it stinks for poorly
> clustered data as we approach maintenance_work_mem and have to swap a lot.

Ok, sounds good.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
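The first idea discussed above (carrying the LOV item's blk+offset directly in the b-tree index tuple instead of routing through the auxiliary heap) would amount to something like the following minimal C sketch. ItemPointerData is PostgreSQL's standard block+offset pair; the struct and field names are invented for illustration, not taken from the patch:

```c
/*
 * Hypothetical payload for the internal b-tree's index tuples: point
 * straight at the LOV item instead of at an auxiliary heap tuple.
 * BMBtreePayload and lov_item are invented names, not from the patch.
 */
#include "postgres.h"
#include "storage/itemptr.h"

typedef struct BMBtreePayload
{
    ItemPointerData lov_item;   /* blk + offset of the LOV item */
} BMBtreePayload;
```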
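The one-LOV-item-per-page suggestion could look roughly like the sketch below, with the tail of the bitmap stored in-line in the remainder of the page so that separate bm_last_* bookkeeping becomes unnecessary. Apart from BMLOVItemData and bm_last_* as named in the thread, every identifier here is an assumption:

```c
/*
 * Hypothetical layout for a page holding a single LOV item: fixed
 * header fields, then the in-line tail of the compressed bitmap
 * filling the rest of the page.  Not the structure from the patch.
 */
#include "postgres.h"
#include "storage/block.h"

typedef uint32 bmword;          /* one word of the compressed bitmap */

typedef struct BMLOVPageData
{
    BlockNumber bm_head_page;   /* first full bitmap page, or InvalidBlockNumber */
    BlockNumber bm_tail_page;   /* last full bitmap page */
    uint32      bm_ninline;     /* number of valid words in bm_inline[] */
    bmword      bm_inline[1];   /* in-line bitmap tail (variable length) */
} BMLOVPageData;
```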
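The compression-during-CREATE-INDEX work mentioned in the PS could, in spirit, look like the sketch below: an ascending stream of bit positions (TIDs mapped to a linear heap position) is packed on the fly into a word-aligned run-length encoding, where a run of all-zero words collapses into one "fill" word. The encoding and every name here are illustrative assumptions, not the patch's actual format:

```c
/*
 * Sketch: on-the-fly word-aligned run-length compression of an
 * ascending stream of bit positions.  A run of all-zero words is
 * stored as a single "fill" word (high bit set, run length in the
 * low 31 bits, assumed to fit); other words are stored literally.
 * Error handling for realloc is omitted.
 */
#include <stdint.h>
#include <stdlib.h>

#define PAYLOAD_BITS 31u                    /* payload bits per literal word */
#define FILL_FLAG    (UINT32_C(1) << 31)    /* marks a fill word */

typedef struct BuildBitmap
{
    uint32_t   *words;          /* compressed output so far */
    size_t      nwords;         /* words used */
    size_t      awords;         /* words allocated */
    uint32_t    pending;        /* literal word being assembled */
    uint64_t    next_bit;       /* next bit position expected */
} BuildBitmap;

static void
emit_word(BuildBitmap *bm, uint32_t w)
{
    if (bm->nwords == bm->awords)
    {
        bm->awords = bm->awords ? bm->awords * 2 : 64;
        bm->words = realloc(bm->words, bm->awords * sizeof(uint32_t));
    }
    bm->words[bm->nwords++] = w;
}

/*
 * Add one set bit; bit numbers must arrive in ascending order.  When
 * the build finishes, the caller still has to flush the last partial
 * word with emit_word(bm, bm->pending).
 */
static void
bitmap_add(BuildBitmap *bm, uint64_t bit)
{
    uint64_t    word_no = bit / PAYLOAD_BITS;
    uint64_t    cur_no = bm->next_bit / PAYLOAD_BITS;

    if (word_no > cur_no)
    {
        /* Flush the literal word we were assembling... */
        emit_word(bm, bm->pending);
        bm->pending = 0;
        /* ...and collapse the all-zero gap into one fill word. */
        if (word_no > cur_no + 1)
            emit_word(bm, FILL_FLAG | (uint32_t) (word_no - cur_no - 1));
    }
    bm->pending |= UINT32_C(1) << (bit % PAYLOAD_BITS);
    bm->next_bit = bit + 1;
}
```

Keeping each key's bitmap compressed while it is being built is what addresses the complaint in the PS: poorly clustered input no longer accumulates large plain TID arrays that push the build toward the maintenance_work_mem limit.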