Re: clustered indexes?
От | Alvaro Herrera |
---|---|
Тема | Re: clustered indexes? |
Дата | |
Msg-id | Pine.LNX.4.44.0206221241060.2135-100000@cm-lcon-46-187.cm.vtr.net обсуждение исходный текст |
Ответ на | Re: clustered indexes? (Curt Sampson <cjs@cynic.net>) |
Список | pgsql-general |
Curt Sampson dijo: > BTW, clustered indexes would be a *really* cool feature to have in > postgres. I recently saw a query go from 70 seconds down to .6 seconds > when I clustered the table on the particular column I was selecting a > value from. I don't know if the storage manager allows it, but maybe it is possible to use the free space map to allocate some free space on each page; that way, anytime a new tuple is added, it is written on the page that correspond to its clustered value (here you add the overhead of checking whether a given relation is clustered or not to _every_ relation). If there is a lot of tuples for a particular value, it should just allocate free space on the last page that contains tuples of that value. OTOH, if there are a lot of values with small number of tuples in one page, it allocates space for all of them at the end of the page. Another thing would be that if one value has tuples that fit in an integer number of pages, a page next to it would have to be allocated for free space also. Of course, when running out of space for a particular value, some more space would have to be allocated, moving all the tuples in the rest of the table. Looks quite inefficient. Or if the clustered values do not have also to be ordered, maybe just the tuples of the next value have to be moved to the end of the table, freeing all the space they were using (but this is a really bad idea if the naxt value has a lot of tuples). Another way to do this would be having some dummy tuples after each values' real tuples. When something wants to write a new tuple, it uses one of the dummy tuples. Variable size tuples would be a problem, of course. Limiting the feature to only fixed-size-tuple-tables simplifies the problem somewhat. All this sounds pretty difficult to me, and I wonder whether it's worth the trouble. Maybe fixing the problem with the current CLUSTER implementation (the dropping of indexes and grant permissions) helps somewhat, but seeing the comments in CVS log from Tom Lane does not help: revision 1.63 date: 2001/01/12 01:22:21; author: tgl; state: Exp; lines: +2 -2 Preserve constraints and column defaults during CLUSTER. Wish they were all this easy ... -- Alvaro Herrera (<alvherre[a]atentus.com>) "Hay dos momentos en la vida de un hombre en los que no debería especular: cuando puede permitírselo y cuando no puede" (Mark Twain)
В списке pgsql-general по дате отправления: