Re: speed up verifying UTF-8
From | Heikki Linnakangas |
---|---|
Subject | Re: speed up verifying UTF-8 |
Date | |
Msg-id | 2f95e70d-4623-87d4-9f24-ca534155f179@iki.fi |
In reply to | Re: speed up verifying UTF-8 (John Naylor <john.naylor@enterprisedb.com>) |
Responses | Re: speed up verifying UTF-8 |
List | pgsql-hackers |
On 29/06/2021 14:20, John Naylor wrote:
> I still wasn't quite happy with the churn in the regression tests, so
> for v13 I gave up on using both the existing utf8 table and my new one
> for the "padded input" tests, and instead just copied the NUL byte test
> into the new table. Also added a primary key to make sure the padded
> test won't give weird results if a new entry has a duplicate description.
>
> I came up with "highbit_carry" as a more descriptive variable name than
> "x", but that doesn't matter a whole lot.
>
> It also occurred to me that if we're going to check one 8-byte chunk at
> a time (like v12 does), maybe it's only worth it to load 8 bytes at a
> time. An earlier version did this, but without the recent tweaks. The
> worst-case scenario now might be different from the one with 16-bytes,
> but for now just tested the previous worst case (mixed2).

I tested the new worst case scenario on my laptop:

gcc master:

 chinese | mixed | ascii | mixed16 | mixed8
---------+-------+-------+---------+--------
    1311 |   758 |   405 |     583 |    725

gcc v13:

 chinese | mixed | ascii | mixed16 | mixed8
---------+-------+-------+---------+--------
     956 |   472 |   160 |     572 |    939

mixed16 is the same as "mixed2" in the previous rounds, with
'123456789012345ä' as the repeating string, and mixed8 uses '1234567ä',
which I believe is the worst case for patch v13. So v13 is somewhat
slower than master in the worst case.

Hmm, there's one more simple trick we can do: We can have a separate
fast-path version of the loop when there are at least 8 bytes of input
left, skipping all the length checks. With that:

gcc v14:

 chinese | mixed | ascii | mixed16 | mixed8
---------+-------+-------+---------+--------
     737 |   412 |    94 |     476 |    725

All the above numbers were with gcc 10.2.1. For completeness, with clang
11.0.1-2 I got:

clang master:

 chinese | mixed | ascii | mixed16 | mixed8
---------+-------+-------+---------+--------
    1044 |   724 |   403 |     930 |    603
(1 row)

clang v13:

 chinese | mixed | ascii | mixed16 | mixed8
---------+-------+-------+---------+--------
     596 |   445 |    79 |     417 |    715
(1 row)

clang v14:

 chinese | mixed | ascii | mixed16 | mixed8
---------+-------+-------+---------+--------
     600 |   337 |    93 |     318 |    511

Attached is patch v14 with that optimization. It needs some cleanup, I
just hacked it up quickly for performance testing.

- Heikki
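
For readers without the attachment, the fast-path idea from the message can be sketched roughly as follows. This is not the code from patch v14: the function names (`chunk_is_ascii`, `utf8_seq_len_unchecked`, `utf8_is_valid`) are hypothetical, and the zero-padded tail handling is a simplification chosen for the sketch. Only the variable name `highbit_carry` comes from the thread. The sketch assumes, as PostgreSQL does, that an embedded NUL byte is invalid.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HIGH_BITS UINT64_C(0x8080808080808080)
#define LOW_7F    UINT64_C(0x7f7f7f7f7f7f7f7f)

/* True if the 8 bytes at s are all nonzero ASCII. Caller guarantees 8 readable bytes. */
static bool
chunk_is_ascii(const unsigned char *s)
{
    uint64_t chunk, highbit_carry;

    memcpy(&chunk, s, sizeof(chunk));
    if (chunk & HIGH_BITS)
        return false;           /* at least one non-ASCII byte */

    /* Every byte is <= 0x7f, so adding 0x7f cannot carry into the next
     * byte; the high bit ends up set exactly when the byte was nonzero. */
    highbit_carry = chunk + LOW_7F;
    return (highbit_carry & HIGH_BITS) == HIGH_BITS;
}

/* Length of the valid UTF-8 sequence starting at s, or 0 if invalid.
 * No length checks: caller guarantees at least 4 readable bytes. */
static int
utf8_seq_len_unchecked(const unsigned char *s)
{
    unsigned char c = s[0];
    int len, i;

    if (c == 0)
        return 0;               /* assumption: NUL is rejected */
    if (c < 0x80)
        return 1;
    else if (c >= 0xC2 && c <= 0xDF)
        len = 2;
    else if ((c & 0xF0) == 0xE0)
        len = 3;
    else if (c >= 0xF0 && c <= 0xF4)
        len = 4;
    else
        return 0;               /* stray continuation, overlong 2-byte, or > 0xF4 */

    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;

    if (c == 0xE0 && s[1] < 0xA0) return 0;   /* overlong 3-byte */
    if (c == 0xED && s[1] > 0x9F) return 0;   /* UTF-16 surrogate */
    if (c == 0xF0 && s[1] < 0x90) return 0;   /* overlong 4-byte */
    if (c == 0xF4 && s[1] > 0x8F) return 0;   /* above U+10FFFF */
    return len;
}

bool
utf8_is_valid(const unsigned char *s, size_t len)
{
    int l;

    /* Fast path: with >= 8 bytes left, neither the 8-byte load nor a
     * 4-byte sequence can run past the end, so no length checks. */
    while (len >= 8)
    {
        if (chunk_is_ascii(s))
        {
            s += 8;
            len -= 8;
            continue;
        }
        l = utf8_seq_len_unchecked(s);
        if (l == 0)
            return false;
        s += l;
        len -= (size_t) l;
    }

    /* Slow path for the last < 8 bytes: copy into a zero-padded buffer so
     * a truncated sequence hits a 0 byte, which is never a continuation. */
    while (len > 0)
    {
        unsigned char buf[4] = {0, 0, 0, 0};

        memcpy(buf, s, len < 4 ? len : 4);
        l = utf8_seq_len_unchecked(buf);
        if (l == 0)
            return false;
        s += l;
        len -= (size_t) l;
    }
    return true;
}
```

With an input such as the repeating '1234567ä' (the mixed8 case), nearly every 8-byte chunk contains a non-ASCII byte, so the sketch falls back to stepping one character at a time; that is why this pattern stresses a chunked verifier. Calling `utf8_is_valid(buf, n)` on a buffer of `n` bytes returns `true` only if the whole buffer is well-formed UTF-8 without NUL bytes.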