Hi,
Hm, in some cases your patch is better, but in others both the old code
(8692f6644e7) and HEAD beat yours on my machine. TBH, not entirely sure why.
prep:
COPY (SELECT generate_series(1, 2000000) a, (random() * 100000 - 50000)::int b, 3243423 c) TO '/tmp/lotsaints.copy';
DROP TABLE lotsaints; CREATE UNLOGGED TABLE lotsaints(a int, b int, c int);
benchmark:
psql -qX -c 'truncate lotsaints' && pgbench -n -P1 -f <( echo "COPY lotsaints FROM '/tmp/lotsaints.copy';") -t 15
I disabled turbo mode, pinned the server to a single core of my Xeon Gold 5215:
HEAD: 812.690
your patch: 821.354
strtoint from 8692f6644e7: 824.543
strtoint from 6b423ec677d^: 806.678
(when I say strtoint from, I did not replace the goto labels, so that part is
unchanged and unrelated)
IOW, for me the code from 15 is the fastest by a good bit... There's an imul,
sure, but the fact that it sets a flag makes it faster than having to add more
tests and branches.
Looking at a profile reminded me of how silly it is that we are wasting a good
chunk of the time in these isdigit() checks, even though we already rely on on
the ascii values via (via *ptr++ - '0'). I think that's done in the headers
for some platforms, but not others (glibc). And we've even done already for
octal and binary!
Open coding isdigit() gives us:
HEAD: 797.434
your patch: 803.570
strtoint from 8692f6644e7: 778.943
strtoint from 6b423ec677d^: 777.741
It's somewhat odd that HEAD and your patch switch position here...
> - else if (ptr[0] == '0' && (ptr[1] == 'o' || ptr[1] == 'O'))
> + /* process hex digits */
> + else if (ptr[1] == 'x' || ptr[1] == 'X')
> {
>
> firstdigit = ptr += 2;
I find this unnecessarily hard to read. I realize it's been added in
6fcda9aba83, but I don't see a reason to use such a construct here.
I find it somewhat grating how much duplication there now is in this
code due to being repeated for all the bases...
Greetings,
Andres Freund