Re: Bitmap index thoughts
От | Heikki Linnakangas |
---|---|
Тема | Re: Bitmap index thoughts |
Дата | |
Msg-id | 45C85C39.5070401@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: Bitmap index thoughts (Gavin Sherry <swm@alcove.com.au>) |
Список | pgsql-hackers |
Gavin Sherry wrote: > On Thu, 1 Feb 2007, Bruce Momjian wrote: > >> Where are we on this patch? Does it have performance tests to show >> where it is beneificial? Is it ready to be reviewed? > > Here's an updated patch: > > http://www.alcove.com.au/~swm/bitmap-2007-02-02.patch > > In this patch, I rewrote the index build system. It was fast before for > well clustered data but for poorly clustered data, it was very slow. Now, > it is pretty good for each distribution type. > > I have various test cases but the one which showed bitmap a poor light was > a table of 600M rows. The key to the table had a cardinality of 100,000. > When the table was loaded with keys clustered, the build time was 1000 > seconds with bitmap (2200 with btree). With poorly clustered data (e.g., > the index key was (1, 2, 3, ..., 6000, 1, 2, 3, ...)), the build time for > bitmap was 14000 seconds! > > So, I rewrote this to compress data using HRL encoding (the same scheme we > use in the bitmap AM itself). Now, clustered data is just as fast and > unclustered data is 2000 seconds. > > The select performance at a cardinality of 100,000 is similar to btree but > faster with lower cardinalities. > > Jie also contributed a rewrite of the WAL code to this patch. Not only is > the code faster now, but it handles the notion of incomplete actions -- > like btree and friends do. The executor code still needs some work from me > -- Jie and I have dirtied things up while experimenting -- but we would > really like some review of the code so that this can get squared away > well before the approach of 8.3 feature freeze. > > One of the major deficiencies remaining is the lack of VACUUM support. > Heikki put his hand up for this and I'm holding him to it! ;-) Thanks :). I'll take a look at it. I'm a bit worried that vacuuming can get complicated if an index is in fact an index + a heap + a btree. To remove empty lov items and the entries in the auxiliary heap and b-tree, you need to: 1. Memorize empty lov-items 2. Scan the heap, and mark the heap tuples corresponding the empty lov-items as dead 3. Scan the b-tree, removing pointers to dead heap tuples 4. Remove dead heap tuples 5. Remove empty lov-items Maybe it's possible to call the existing vacuuming code recursively, but it feels quite horrible. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: