Re: A thought on Index Organized Tables
От | Kevin Grittner |
---|---|
Тема | Re: A thought on Index Organized Tables |
Дата | |
Msg-id | 4B839D4B020000250002F4F5@gw.wicourts.gov обсуждение исходный текст |
Ответ на | Re: A thought on Index Organized Tables (Simon Riggs <simon@2ndQuadrant.com>) |
Список | pgsql-hackers |
Simon Riggs <simon@2ndQuadrant.com> wrote: > On Mon, 2010-02-22 at 08:51 +0200, Heikki Linnakangas wrote: >> Gokulakannan Somasundaram wrote: > >> > May i get a little clarification on this issue? Will we be >> > supporting the IOT feature in postgres in future? >> >> What seems like the best path to achieve the kind of performance >> benefits that IOTs offer is allowing index-only-scans using the >> visibility map. > > I don't agree with that. Could you explain why you think that > would be the case? It would be a shame to have everybody think you > can solve a problem if it turned out not to be the case. I'd like to be clear on what feature we're discussing. There has been mention of an organization where there is no heap per se, but all columns are stored in the leaf node of one of the table's indexes (which is the structure referred to as a CLUSTERED INDEX in some other popular products). There has been some mention of storing some of the data out-of-line, which could be considered to be already covered by TOAST. I know that one of the things which makes this technique particularly effective with such things as name columns for a clustered index is that these other products store index entries after the first in a page with a length that matches the previous entry and the differing data at the tail, which we don't yet have. Clearly it's not trivial, but there are certainly cases where it can be a big performance win. Besides the obvious issues around having a relation which functions like both an index and a heap (at the leaf level), there are the details of having other indexes point to these leaf nodes, creating and dropping clustered indexes, impact on vacuums, etc. Situations where clustered indexes tended to help: (1) Most access through a particular index -- often one less random read per access. (2) Frequent sequential access through a range of values in an index -- turn random access into mostly sequential. (3) Index values comprise a large portion of each tuple -- avoid redundant storage, reducing disk footprint, thereby improving cache hits. Points 1 and 2 could be covered to some degree by index-only scans, particularly if additional columns are added to indexes to make them "covering indexes". Index-only scans don't help with 3 at all; in fact, adding the additional columns to indexes to allow that optimization tends to work against it. -Kevin
В списке pgsql-hackers по дате отправления: