Re: Yet another fast GiST build
From | Heikki Linnakangas |
---|---|
Subject | Re: Yet another fast GiST build |
Date | |
Msg-id | c0846e34-8b3a-e1bf-c88e-021eb241a481@iki.fi |
In reply to | Re: Yet another fast GiST build (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses |
Re: Yet another fast GiST build
|
List | pgsql-hackers |
On 07/04/2021 09:00, Heikki Linnakangas wrote:
> On 08/03/2021 19:06, Andrey Borodin wrote:
>> There were numerous GiST-build-related patches in this thread. Yet uncommitted is a patch with sortsupport routines for btree_gist contrib module.
>> Here's its version which needs review.
>
> Reviewing this now again. One thing caught my eye:
>
>> +static int
>> +gbt_bit_sort_build_cmp(Datum a, Datum b, SortSupport ssup)
>> +{
>> +    return DatumGetInt32(DirectFunctionCall2(byteacmp,
>> +                                             PointerGetDatum(a),
>> +                                             PointerGetDatum(b)));
>> +}
>
> That doesn't quite match the sort order used by the comparison
> functions, gbt_bitlt and such. The comparison functions compare the bits
> first, and use the length as a tie-breaker. Using byteacmp() will
> compare the "bit length" first. However, gbt_bitcmp() also uses
> byteacmp(), so I'm a bit confused. So, huh?

Ok, I think I understand that now. In btree_gist, the *_cmp() function operates on non-leaf values, and *_lt(), *_gt() et al operate on leaf values. For all other datatypes, the leaf and non-leaf representation is the same, but for bit/varbit, the non-leaf representation is different. The leaf representation is VarBit, and non-leaf is just the bits without the 'bit_len' field. That's why it is indeed correct for gbt_bitcmp() to just use byteacmp(), whereas gbt_bitlt() et al compares the 'bit_len' field separately. That's subtle, and 100% uncommented.

What that means for this patch is that gbt_bit_sort_build_cmp() should *not* call byteacmp(), but bitcmp(). Because it operates on the original datatype stored in the table.

- Heikki
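
To make the ordering mismatch concrete, here is a small standalone sketch. It is not PostgreSQL code: `ToyBits` and the `toy_*` functions are hypothetical stand-ins, and the "length first" comparator is a simplified model of what a bytewise comparison does when it meets the 'bit_len' field before the bits. It only illustrates that the two orders can disagree on the same pair of values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a bit string; NOT PostgreSQL's VarBit struct. */
typedef struct ToyBits
{
	int32_t		bit_len;			/* number of valid bits */
	const unsigned char *dat;		/* packed bits, zero-padded in last byte */
} ToyBits;

/* Number of bytes needed to hold bit_len bits. */
int
toy_nbytes(int32_t bit_len)
{
	return (bit_len + 7) / 8;
}

/* Order described for leaf values: bits first, length as tie-breaker. */
int
toy_cmp_bits_then_len(const ToyBits *a, const ToyBits *b)
{
	int			na = toy_nbytes(a->bit_len);
	int			nb = toy_nbytes(b->bit_len);
	int			n = (na < nb) ? na : nb;
	int			c = memcmp(a->dat, b->dat, (size_t) n);

	if (c != 0)
		return (c < 0) ? -1 : 1;
	if (a->bit_len != b->bit_len)
		return (a->bit_len < b->bit_len) ? -1 : 1;
	return 0;
}

/* Simplified model of a comparison that sees the length field first. */
int
toy_cmp_len_then_bits(const ToyBits *a, const ToyBits *b)
{
	int			c;

	if (a->bit_len != b->bit_len)
		return (a->bit_len < b->bit_len) ? -1 : 1;
	c = memcmp(a->dat, b->dat, (size_t) toy_nbytes(a->bit_len));
	return (c < 0) ? -1 : ((c > 0) ? 1 : 0);
}

/* The bit strings '1' and '01' sort in opposite directions under the two orders. */
void
toy_demo(void)
{
	static const unsigned char one[] = {0x80};		/* bit  '1'  */
	static const unsigned char zero_one[] = {0x40};	/* bits '01' */
	ToyBits		a = {1, one};
	ToyBits		b = {2, zero_one};

	assert(toy_cmp_bits_then_len(&a, &b) > 0);	/* '1' sorts after '01' */
	assert(toy_cmp_len_then_bits(&a, &b) < 0);	/* shorter value sorts first */
}
```

Under these assumptions, a sort built with the length-first order would place '1' before '01', while the leaf comparisons would place it after, which is the kind of disagreement the message warns about.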