Re: TODO item: Allow data to be pulled directly from indexes
От | Bruce Momjian |
---|---|
Тема | Re: TODO item: Allow data to be pulled directly from indexes |
Дата | |
Msg-id | 200808202020.m7KKK3C20520@momjian.us обсуждение исходный текст |
Ответ на | TODO item: Allow data to be pulled directly from indexes (Karl Schnaitter <karlsch@soe.ucsc.edu>) |
Список | pgsql-hackers |
I have added this email's URL to TODO under tuple visibility. --------------------------------------------------------------------------- Karl Schnaitter wrote: > Sometime last year, a discussion started about including visibility > metadata to avoid heap fetches during an index scan: > > http://archives.postgresql.org/pgsql-patches/2007-10/msg00166.php > http://archives.postgresql.org/pgsql-patches/2008-01/msg00049.php > > I think the last discussion on this was in April: > > http://archives.postgresql.org/pgsql-hackers/2008-04/msg00618.php (last > item) > > I have worked with the current patch, and I have some thoughts about > that approach and the approaches listed in the TODO item. The TODO lists > three approaches, in short > > (1) Add a bit for an index tuple that indicates "visible" or "maybe > visible." > (2) Use a per-table bitmap that indicates which pages have at least one > tuple that is not visible to all transactions. > (3) Same as (2) but at the granularity of one bit per table. > > The approach in the patch is different: > > (4) Add transaction ids, etc to the index tuple (totaling 16 bytes) > > I would group (1) & (4) together and (2) & (3) together. I think the > time and space trade-offs are pretty obvious, so I won't waste time on > those. > > (1) & (4) require an UPDATE or DELETE to twiddle the old index tuple. > Tom has noted (in the linked message) that this is not reliable if the > index has any expression-valued columns, because it is not always > possible to find the old index entry. For this reason, the proposed > patch does not keep visibility metadata for indexes on expressions. This > seems like a reasonable limitation --- indexed expressions are just less > efficient. > > The main difference between (1) & (4) is that (1) will sometimes require > heap lookups and (4) never will. Moreover, the heap lookups in (1) will > be difficult for the optimizer to estimate, unless some special > statistics can be maintained for this purpose. > > I should mention there is a major flaw in the patch, because it puts > pointers to HOT tuples in the index, in order to capture the different > transaction ids in the chain. I think this can be fixed by only pointing > to the root of the HOT chain, and setting xmin/xmax to the entire range > of transaction ids spanned by the chain. I'm not sure about all the > details (the ctid and some other bits also need to be set). > > (2) & (3) can work for any index, and they are quite elegant in the way > that the overhead does not change with the number of indexes. The TODO > also notes the benefit of (2) for efficient vacuuming. Thus, I think > that (2) is a great idea in general, but it does not serve the intended > purpose of this TODO item. Once a page gets marked as requiring > visibility checks, it cannot be unmarked until the next VACUUM. The > whole point of this feature is that we are willing to be more proactive > during updates in order to make index access more efficient. > > So in summary, I think that (2) would be nice as a separate feature, > with (1) and (4) being more favorable for index-only scans. The obvious > trouble with (4) is the extra space overhead. There are also issues with > correctness that I mentioned (any thoughts here would be appreciated). > Other than that, I would favor (4) because it offers the most stable > performance. > > Please let me know if you agree/disagree with anything here. I need to > get this feature implemented for my research, but I would also love to > contribute it to the community so your opinions matter a lot. > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
В списке pgsql-hackers по дате отправления: