Re: WIP: Covering + unique indexes.

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: WIP: Covering + unique indexes.
Дата
Msg-id CAPpHfduZrxM81gQ5jRqL5+EMUAmfb8GXDgOz9tKGrFttFw4OPg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: WIP: Covering + unique indexes.  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: Re: WIP: Covering + unique indexes.  (David Steele <david@pgmasters.net>)
Re: WIP: Covering + unique indexes.  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
On Sun, Mar 25, 2018 at 1:47 AM, Peter Geoghegan <pg@bowt.ie> wrote:
I was going to say that you could just treat the low bit in the t_tid
offset as representing "see catalog entry". My first idea was that
nothing would have to change about the existing format, since internal
page items already have only the low bit set within their offset.
However, I now see that that won't really work, because we don't
change the offset in high keys when they're copied from a real item
during a page split. Whatever we do, it has to work equally well for
all "separator keys" -- that is, it must work for both downlinks in
internal pages, and all high keys (including high keys at the leaf
level).

OK.
 
A good solution is to use the unused 13th t_bit. If hash can have a
INDEX_MOVED_BY_SPLIT_MASK, then nbtree can have a INDEX_ALT_TID_MASK.
This avoids a BTREE_VERSION bump, and allows us to deal with the
highkey offset issue. Actually, it's even more flexible than that --
it can work with ordinary leaf tuples in the future, too. That is, we
can eventually implement prefix truncation and deduplication at the
leaf level using this representation, since there is nothing that
limits INDEX_ALT_TID_MASK IndexTuples to "separator keys".

The main difference between this approach to leaf prefix
truncation/compression/deduplication, and the GIN entry tree's posting
list representation would be that it wouldn't have to be
super-optimized for duplicates, at the expense of more common case for
regular nbtree indexes -- having few or no duplicates. A btree_gin
index on pgbench_accounts(aid) looks very similar to an equivalent
nbtree index if you just compare internal pages from each, but they
look quite different at the leaf level, where GIN has 24 byte
IndexTuples instead of 16 bytes IndexTuples. Of course, this is
because the leaf pages have posting lists that can never be simple
heap pointer TIDs.
 
Right, btree_gin is much smaller than regular btree when there are a lot
of duplicates.  When there is no duplicates then btree_gin becomes larger
than regular btree, because gin stores single item pointer less compact
than btree.

A secondary goal of this INDEX_ALT_TID_MASK representation should be
that it won't even be necessary to know that an IndexTuple is
contained within a leaf page rather than an index page (again, unlike
GIN). I'm pretty confident that we can have a truly universal
IndexTuple representation for nbtree, while supporting all of these
standard optimizations.

Sorry for going off in a tangent, but I think it's somewhat necessary
to have a strategy here. Of course, we don't have to get everything
right now, but we should be looking in this direction whenever we talk
about on-disk nbtree changes.

So, as I get you're proposing to introduce INDEX_ALT_TID_MASK flag
which would indicate that we're storing something special in the t_tid
offset.  And that should help us not only for covering indexes, but also for
further btree enhancements including suffix truncation.  What exactly do
you propose to store into t_tid offset when INDEX_ALT_TID_MASK flag
is set?  Is it number of attributes in this particular index tuple?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Simon Riggs
Дата:
Сообщение: Re: [HACKERS] MERGE SQL Statement for PG11
Следующее
От: Dean Rasheed
Дата:
Сообщение: Re: [HACKERS] PATCH: multivariate histograms and MCV lists