Re: Performance degradation on concurrent COPY into a single relation in PG16.
От | David Rowley |
---|---|
Тема | Re: Performance degradation on concurrent COPY into a single relation in PG16. |
Дата | |
Msg-id | CAApHDvpwGD=zidmtVxjfKBkYxKKg9eZkSC01+JWb4WCL3WqnSA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Performance degradation on concurrent COPY into a single relation in PG16. (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: Performance degradation on concurrent COPY into a single relation in PG16.
|
Список | pgsql-hackers |
On Tue, 25 Jul 2023 at 17:34, Andres Freund <andres@anarazel.de> wrote: > prep: > COPY (SELECT generate_series(1, 2000000) a, (random() * 100000 - 50000)::int b, 3243423 c) TO '/tmp/lotsaints.copy'; > DROP TABLE lotsaints; CREATE UNLOGGED TABLE lotsaints(a int, b int, c int); > > benchmark: > psql -qX -c 'truncate lotsaints' && pgbench -n -P1 -f <( echo "COPY lotsaints FROM '/tmp/lotsaints.copy';") -t 15 > > I disabled turbo mode, pinned the server to a single core of my Xeon Gold 5215: > > HEAD: 812.690 > > your patch: 821.354 > > strtoint from 8692f6644e7: 824.543 > > strtoint from 6b423ec677d^: 806.678 I'm surprised to see the imul version is faster. It's certainly not what we found when working on 6b423ec67. I've been fooling around a bit to try to learn what might be going on. I wrote 2 patches: 1) pg_strtoint_imul.patch: Effectively reverts 6b423ec67 and puts the code how it likely would have looked had I not done that work at all. (I've assumed that the hex, octal, binary parsing would have been added using the overflow multiplication functions just as the base 10 parsing). 2) pg_strtoint_optimize.patch: I've made another pass over the functions with the current overflow checks and optimized the digit checking code so that it can be done in a single check like: if (digit < 10). This can be done by assigning the result of *ptr - '0' to an unsigned char. Anything less than '0' will wrap around and anything above '9' will remain so. I've also removed a few inefficiencies with the isspace checking. We didn't need to do "while (*ptr && isspace(*ptr)). It's fine just to do while (isspace(*ptr)) since '\0' isn't a space char. I also got rid of the isxdigit call. The hexlookup array contains -1 for non-hex chars. We may as well check the digit value is >= 0. Here are the results I get using your test as quoted above: master @ 62e9af4c63f + fix_COPY_DEFAULT.patch latency average = 568.102 ms master @ 62e9af4c63f + fix_COPY_DEFAULT.patch + pg_strtoint_optimize.patch latency average = 531.238 ms master @ 62e9af4c63f + fix_COPY_DEFAULT.patch + pg_strtoint_imul.patch latency average = 559.333 ms (surprisingly also faster on my machine) The optimized version of the pg_strtoint functions wins over the imul patch. Could you test to see if this is also the case in your Xeon machine? > (when I say strtoint from, I did not replace the goto labels, so that part is > unchanged and unrelated) I didn't quite follow this. I've not really studied the fix_COPY_DEFAULT.patch patch. Is there a reason to delay committing that? It would be good to eliminate that as a variable for the current performance regression. David
Вложения
В списке pgsql-hackers по дате отправления: