Re: GiST seems to drop left-branch leaf tuples
From | Oleg Bartunov |
---|---|
Subject | Re: GiST seems to drop left-branch leaf tuples |
Date | |
Msg-id | Pine.LNX.4.64.1011240907350.12632@sn.sai.msu.ru |
In response to | Re: GiST seems to drop left-branch leaf tuples (Peter Tanski <ptanski@raditaz.com>) |
List | pgsql-hackers |
Peter, glad to know you succeeded. FYI, a year ago we developed GiST
extension for rdkit.org.

Oleg

On Tue, 23 Nov 2010, Peter Tanski wrote:

> I found another off-by-one error in my Picksplit() algorithm and the GiST index contains one leaf tuple for each row in the table now. The error was to start from 1 instead of 0 when assigning the entries. Thanks to everyone for your help.
>
> For the record, this is the only GiST index I know of where the keys are over 2000 bytes in size. So GiST definitely handles large keys. Perhaps the maximum size for intarray could be increased.
>
> On Nov 23, 2010, at 4:01 PM, Yeb Havinga wrote:
>
>> On 2010-11-23 20:54, Peter Tanski wrote:
>>> On Nov 23, 2010, at 1:37 PM, Yeb Havinga wrote:
>>>>>>> j = 0;
>>>>>>> for (i = FirstOffsetNumber; i < maxoff; i = OffsetNumberNext(i)) {
>>>>>>>     FPrint* v = deserialize_fprint(entv[i].key);
>>>>>> Isn't this off by one? Offset numbers are 1-based, so the maxoff
>>>>>> computation is wrong.
>>>> The first for loop of all others compare with i <= maxoff instead of i < maxoff.
>>> You are right: I am missing the last one, there. (During a memory-debugging phase entv[entryvec-n - 1] was always invalid, probably as a memory overwrite error but I fixed that later and never changed it back.)
>>>
>>> On the other hand, there are two problems:
>>>
>>> 1. the maximum size on a GiST page is 4240 bytes, so I cannot add a full-size Datum using this kind of hash-key setup (the base Datum size is 4230 bytes on a 64-bit machine). The example test cases I used were smaller in order to get around that issue: they are 2326 bytes base size.
>>>
>>> 2. Even after fixing the Picksplit() loop, the dropped-leaf problem still manifests itself:
>> I noticed an n_entries initialization in one of your earlier mails that might also be a source of trouble. I was under the impression that gistentryvectors have n-1 entries (not n-2 as you say), because the first element (0 / InvalidOffsetNumber) must be skipped. E.g. entryvec->n = 5. This means that there are 4 entries, which are in array positions 1,2,3,4.
>>
>> btw: interesting topic, audio fingerprinting!
>>
>> regards,
>> Yeb Havinga
>>
>
>
>

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
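
The indexing rule discussed in the quoted thread (slot 0 is skipped, so a vector with n = 5 holds 4 entries at positions 1 through 4) can be illustrated with a small standalone sketch. It does not use the PostgreSQL headers: `MockEntryVector` and `sum_valid_entries` are hypothetical names invented for this illustration, and the offset macros are local stand-ins that only mirror the names used in the thread.

```c
#include <assert.h>

/* Local stand-ins that mirror the names used in the thread.
 * Assumption: they are defined here only so the sketch compiles
 * without the PostgreSQL server headers. */
typedef unsigned short OffsetNumber;
#define InvalidOffsetNumber ((OffsetNumber) 0)
#define FirstOffsetNumber   ((OffsetNumber) 1)
#define OffsetNumberNext(o) ((OffsetNumber) ((o) + 1))

/* Hypothetical mock of an entry vector: n counts every slot,
 * including the unused slot 0 (InvalidOffsetNumber). */
typedef struct
{
    int n;          /* number of slots, including slot 0 */
    int vector[8];  /* mock payload instead of real index entries */
} MockEntryVector;

/* Visit every valid entry, i.e. positions 1 .. n-1 inclusive,
 * and return the sum of the payloads. *visited receives the
 * number of entries that the loop touched. */
int
sum_valid_entries(const MockEntryVector *vec, int *visited)
{
    OffsetNumber i;
    OffsetNumber maxoff;
    int          sum = 0;

    assert(vec->n >= 1);
    maxoff = (OffsetNumber) (vec->n - 1);   /* last valid position */
    *visited = 0;

    /* Note "<=": with "<" the entry at maxoff would be skipped. */
    for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
    {
        sum += vec->vector[i];
        (*visited)++;
    }
    return sum;
}
```

With n = 5 and payloads stored in slots 1 to 4, the loop visits four entries. Changing the condition to `i < maxoff` would visit only three, which is the kind of dropped last entry described in the thread.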