Re: tsearch refactorings
From | Teodor Sigaev |
---|---|
Subject | Re: tsearch refactorings |
Date | |
Msg-id | 46DEE3C9.8060805@sigaev.ru |
In reply to | Re: tsearch refactorings ("Heikki Linnakangas" <heikki@enterprisedb.com>) |
Responses | Re: tsearch refactorings |
List | pgsql-patches |
Heikki Linnakangas wrote:
> Teodor Sigaev wrote:
>>> Ok. Probably easiest to do that by changing the palloc to palloc0 in
>>> parse_tsquery.
>> and change sizeof to sizeof(QueryItem)
>
> Do you mean the sizeofs in the memcpys in parse_tsquery? You can't

Oops, I meant the pallocs in the push* functions. palloc0 in parse_tsquery is
another way.

>
> BTW, can you explain what the CRC-32 of a value is used for? It looks
> like it's used to speed up some operations, by comparing the CRCs before
> comparing the values, but I didn't quite figure out how it works. How

It's mostly used in GiST indexes - recalculating crc32 every time for each
index tuple to be checked is rather expensive.

> much of a performance difference does it make? Would hash_any do a
> better/cheaper job?

crc32 was chosen after testing a lot of hash functions. Perl's hash was the
fastest, but crc32 produces far fewer collisions. Interestingly, for ASCII
many functions produce a rather small number of collisions, but for the upper
part of the table (0x7f-0xff) crc32 was the best. CRC32 distributes its
collisions evenly over characters; the others do not.

> In any case, I think we need to calculate the CRC/hash in tsqueryrecv,
> instead of trusting the client.

Agreed.

-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                   WWW: http://www.sigaev.ru/
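For readers following the CRC discussion, here is a minimal sketch of the idea of comparing precomputed CRCs before comparing the values themselves, and of recomputing the CRC on receipt instead of trusting one sent by a client. It is not the tsearch code: the `Operand` struct and the functions `crc32_buf`, `make_operand`, `operand_from_client` and `operand_equal` are hypothetical names made up for illustration, and the bitwise CRC-32 loop is a plain textbook version, not PostgreSQL's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a query operand; NOT the real QueryItem layout. */
typedef struct
{
    uint32_t    crc;    /* CRC-32 of val, computed once and stored */
    size_t      len;    /* length of val in bytes */
    const char *val;    /* operand text, not necessarily NUL-terminated */
} Operand;

/* Textbook bitwise CRC-32 (reflected polynomial 0xEDB88320); short, not fast. */
static uint32_t
crc32_buf(const char *buf, size_t len)
{
    uint32_t    crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= (unsigned char) buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Build an operand, paying the CRC cost once up front. */
static Operand
make_operand(const char *val, size_t len)
{
    assert(val != NULL);
    Operand     op = {crc32_buf(val, len), len, val};

    return op;
}

/*
 * Build an operand from client-supplied data.  The CRC sent by the client
 * is deliberately ignored and recomputed locally (assumption: this mirrors
 * the "calculate it in tsqueryrecv" suggestion from the thread).
 */
static Operand
operand_from_client(const char *val, size_t len, uint32_t client_crc)
{
    (void) client_crc;          /* never trusted */
    return make_operand(val, len);
}

/* Cheap integer checks first; the byte comparison runs only on a CRC match. */
static int
operand_equal(const Operand *a, const Operand *b)
{
    if (a->crc != b->crc || a->len != b->len)
        return 0;
    return memcmp(a->val, b->val, a->len) == 0;
}
```

Because equal values always have equal CRCs, a CRC mismatch proves the values differ without touching the bytes; a CRC match still needs the `memcmp`, since different values can collide.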