Re: GiST seems to drop left-branch leaf tuples
From | Peter Tanski |
---|---|
Subject | Re: GiST seems to drop left-branch leaf tuples |
Date | |
Msg-id | 218BEF96-3524-41EB-A15C-67CA3DAD4B58@raditaz.com |
In reply to | GiST seems to drop left-branch leaf tuples (Peter Tanski <ptanski@raditaz.com>) |
Responses | Re: GiST seems to drop left-branch leaf tuples |
List | pgsql-hackers |
I found another off-by-one error in my Picksplit() algorithm, and the GiST index now contains one leaf tuple for each row in the table. The error was to start from 1 instead of 0 when assigning the entries. Thanks to everyone for your help.

For the record, this is the only GiST index I know of where the keys are over 2000 bytes in size. So GiST definitely handles large keys. Perhaps the maximum size for intarray could be increased.

On Nov 23, 2010, at 4:01 PM, Yeb Havinga wrote:

> On 2010-11-23 20:54, Peter Tanski wrote:
>> On Nov 23, 2010, at 1:37 PM, Yeb Havinga wrote:
>>>>>> j = 0;
>>>>>> for (i = FirstOffsetNumber; i < maxoff; i = OffsetNumberNext(i)) {
>>>>>>     FPrint* v = deserialize_fprint(entv[i].key);
>>>>> Isn't this off by one? Offset numbers are 1-based, so the maxoff
>>>>> computation is wrong.
>>> The first for loop of all others compare with i <= maxoff instead of i < maxoff.
>> You are right: I am missing the last one, there. (During a memory-debugging phase entv[entryvec->n - 1] was always invalid, probably as a memory overwrite error, but I fixed that later and never changed it back.)
>>
>> On the other hand, there are two problems:
>>
>> 1. the maximum size on a GiST page is 4240 bytes, so I cannot add a full-size Datum using this kind of hash-key setup (the base Datum size is 4230 bytes on a 64-bit machine). The example test cases I used were smaller in order to get around that issue: they are 2326 bytes base size.
>>
>> 2. Even after fixing the Picksplit() loop, the dropped-leaf problem still manifests itself:
> I noticed an n_entries initialization in one of your earlier mails that might also be a source of trouble. I was under the impression that gistentryvectors have n-1 entries (not n-2 as you say), because the first element (0 / InvalidOffsetNumber) must be skipped. E.g. entryvec->n = 5. This means that there are 4 entries, which are in array positions 1,2,3,4.
>
> btw: interesting topic, audio fingerprinting!
>
> regards,
> Yeb Havinga
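
For readers following the thread, the indexing convention described above (slot 0 / InvalidOffsetNumber is skipped, so entryvec->n = 5 means four entries at positions 1 to 4) can be sketched in standalone C. This is only an illustration, not the code from the thread: MockEntryVector and visit_all_entries are hypothetical names, and the OffsetNumber macros are local stand-ins written to mirror the PostgreSQL definitions so that the snippet compiles without the server headers.

```c
#include <assert.h>

/* Local stand-ins mirroring PostgreSQL's offset-number definitions. */
typedef unsigned short OffsetNumber;
#define InvalidOffsetNumber ((OffsetNumber) 0)
#define FirstOffsetNumber   ((OffsetNumber) 1)
#define OffsetNumberNext(o) ((OffsetNumber) ((o) + 1))

/* Hypothetical, simplified entry vector: n counts the unused slot 0 too. */
typedef struct
{
    int n;        /* number of slots, including slot 0 */
    int key[16];  /* key[0] is never read */
} MockEntryVector;

/*
 * Visit every valid entry. Returns the number of entries visited and
 * stores the sum of their keys in *sum.
 */
int
visit_all_entries(const MockEntryVector *entryvec, long *sum)
{
    OffsetNumber maxoff;
    OffsetNumber i;
    int          visited = 0;

    assert(entryvec->n >= 1 && entryvec->n <= 16);

    /* The last valid offset is n - 1, because slot 0 is skipped. */
    maxoff = (OffsetNumber) (entryvec->n - 1);
    *sum = 0;

    /* Note "<=": with "<" the entry at maxoff would be silently dropped. */
    for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
    {
        *sum += entryvec->key[i];
        visited++;
    }
    return visited;
}
```

With n = 5 the loop visits positions 1, 2, 3 and 4. Changing the condition to i < maxoff would visit only three of them, which is the kind of dropped entry discussed in the quoted exchange.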