Re: speed up unicode normalization quick check
From | Mark Dilger |
---|---|
Subject | Re: speed up unicode normalization quick check |
Date | |
Msg-id | 74AA9D41-8B59-41E6-941C-DA6B92A603DA@enterprisedb.com |
In response to | speed up unicode normalization quick check (John Naylor <john.naylor@2ndquadrant.com>) |
Responses | Re: speed up unicode normalization quick check |
List | pgsql-hackers |
> On May 21, 2020, at 12:12 AM, John Naylor <john.naylor@2ndquadrant.com> wrote:
>
> Hi,
>
> Attached is a patch to use perfect hashing to speed up Unicode
> normalization quick check.
>
> 0001 changes the set of multipliers attempted when generating the hash
> function. The set in HEAD works for the current set of NFC codepoints,
> but not for the other types. Also, the updated multipliers now all
> compile to shift-and-add on most platform/compiler combinations
> available on godbolt.org (earlier experiments found in [1]). The
> existing keyword lists are fine with the new set, and don't seem to be
> very picky in general. As a test, it also successfully finds a
> function for the OS "words" file, the "D" sets of codepoints, and for
> sets of the first n built-in OIDs, where n > 5.

Prior to this patch, src/tools/gen_keywordlist.pl is the only script that uses PerfectHash. Your patch adds a second. I'm not convinced that modifying the PerfectHash code directly each time a new caller needs different multipliers is the right way to go. Could you instead make them arguments such that gen_keywordlist.pl, generate-unicode_combining_table.pl, and future callers can pass in the numbers they want? Or is there some advantage to having it this way?

--
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
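To make the suggestion above concrete, here is a rough sketch, in Python rather than the Perl that PerfectHash is written in, of a generator that takes its candidate multipliers as an argument instead of hard-coding them. Everything in it is hypothetical: the function names, the default multiplier values, and the simple multiplicative hash are illustrative stand-ins, not the PerfectHash algorithm or its interface.

```python
# Hypothetical defaults; these are NOT the values used in PostgreSQL.
DEFAULT_MULTIPLIERS = (17, 31, 127, 8191)


def simple_hash(key: bytes, mult: int) -> int:
    """Multiplicative hash over the key bytes, truncated to 32 bits."""
    h = 0
    for byte in key:
        h = (h * mult + byte) & 0xFFFFFFFF
    return h


def find_collision_free_multiplier(keys, multipliers=DEFAULT_MULTIPLIERS):
    """Try each caller-supplied multiplier in order.

    Returns (multiplier, table_size) for the first multiplier that maps
    every key to a distinct slot, or None if none of them works.
    """
    keys = list(keys)
    if len(set(keys)) != len(keys):
        raise ValueError("keys must be distinct")
    table_size = 2 * len(keys)  # illustrative choice, not minimal
    for mult in multipliers:
        slots = {simple_hash(k, mult) % table_size for k in keys}
        if len(slots) == len(keys):
            return mult, table_size
    return None


# One caller accepts the defaults; another passes its own candidates.
keywords = [b"select", b"insert", b"update", b"delete"]
keyword_result = find_collision_free_multiplier(keywords)
custom_result = find_collision_free_multiplier(keywords, multipliers=(33, 257))
```

With this shape, a new caller that needs a different set of multipliers changes only its own call site, and existing callers keep their behavior.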