Re: [HACKERS] Index/Function organized table layout (from Re:
От | Hannu Krosing |
---|---|
Тема | Re: [HACKERS] Index/Function organized table layout (from Re: |
Дата | |
Msg-id | 1065258004.2746.30.camel@fuji.krosing.net обсуждение исходный текст |
Ответ на | Re: count(*) slow on large tables (Christopher Browne <cbbrowne@libertyrms.info>) |
Ответы |
COUNT(*) again (was Re: [HACKERS] Index/Function organized table layout)
Uses for Index/Function organizing |
Список | pgsql-performance |
Christopher Browne kirjutas R, 03.10.2003 kell 00:57: > jllachan@nsd.ca (Jean-Luc Lachance) writes: > > That's one of the draw back of MVCC. > > I once suggested that the transaction number and other house keeping > > info be included in the index, but was told to forget it... > > It would solve once and for all the issue of seq_scan vs index_scan. > > It would simplify the aggregate problem. > > It would only simplify _one_ case, namely the case where someone cares > about the cardinality of a relation, and it would do that at > _considerable_ cost. > > A while back I outlined how this would have to be done, and for it to > be done efficiently, it would be anything BUT simple. Could this be made a TODO item, perhaps with your attack plan. Of course as strictly optional feature useful only for special situations (see below) I cross-post this to [HACKERS] as it seem relevant to a problem recently discussed there. > It would be very hairy to implement it correctly, and all this would > cover is the single case of "SELECT COUNT(*) FROM SOME_TABLE;" Not really. Just yesterday there was a discussion on [HACKERS] about implementing btree-organized tables, which would be much less needed if the visibility info were kept in indexes. > If you had a single WHERE clause attached, you would have to revert to > walking through the tuples looking for the ones that are live and > committed, which is true for any DBMS. If the WHERE clause could use the same index (or any index with visibility info) then there would be no need for "walking through the tuples" in data relation. the typical usecase cited on [HACKERS] was time series data, where inserts are roughly in (timestamp,id)order but queries in (id,timestamp) order. Now if the index would include all relevant fields (id,timestamp,data1,data2,...,dataN) then the query could run on index only touching just a few pages and thus vastly improving performance. I agree that this is not something everybody needs, but when it is needed it is needed bad. > And it still begs the same question, of why the result of this query > would be particularly meaningful to anyone. I don't see the > usefulness; I don't see the value of going to the considerable effort > of "fixing" this purported problem. Being able to do fast count(*) is just a side benefit. ---------------- Hannu
В списке pgsql-performance по дате отправления: