Re: Auto-vectorization speeds up multiplication of large-precision numerics
From | Tom Lane |
---|---|
Subject | Re: Auto-vectorization speeds up multiplication of large-precision numerics |
Date | |
Msg-id | 1694682.1599494835@sss.pgh.pa.us |
In response to | Re: Auto-vectorization speeds up multiplication of large-precision numerics (Amit Khandekar <amitdkhan.pg@gmail.com>) |
Responses | Re: Auto-vectorization speeds up multiplication of large-precision numerics |
List | pgsql-hackers |
Amit Khandekar <amitdkhan.pg@gmail.com> writes:
> On Mon, 7 Sep 2020 at 11:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, poking at this further, it seems that the patch only really
>> works for gcc.  clang accepts the -ftree-vectorize switch, but
>> looking at the generated asm shows that it does nothing useful.
>> Which is odd, because clang does do loop vectorization.

> Hmm, yeah that's unfortunate. My guess is that the compiler would do
> vectorization only if 'i' is a constant, which is not true for our
> case.

No, they claim to handle variable trip counts, per

https://llvm.org/docs/Vectorizers.html#loops-with-unknown-trip-count

I experimented with a few different ideas such as adding restrict
decoration to the pointers, and eventually found that what works
is to write the loop termination condition as "i2 < limit"
rather than "i2 <= limit".  It took me a long time to think of
trying that, because it seemed ridiculously stupid.  But it works.

			regards, tom lane
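For illustration, here is a minimal sketch of the two loop forms being compared. It is not the actual patch: the function names, parameter names, and element types are assumptions made up for this example, and only the "i2 <= limit" versus "i2 < limit" distinction comes from the message above. Both functions accumulate the same products; the second simply takes an exclusive bound (last + 1). The message does not say why clang treats the two differently, so the sketch makes no claim about that either.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: inclusive upper bound ("i2 <= last").
 * Adds factor * digits[i2] into dig[i2] for i2 = 0 .. last.
 * This is the loop shape that reportedly did not vectorize under clang.
 */
static void
accum_inclusive(int32_t *dig, const int16_t *digits, int32_t factor, int last)
{
	for (int i2 = 0; i2 <= last; i2++)
		dig[i2] += factor * digits[i2];
}

/*
 * Hypothetical helper: exclusive upper bound ("i2 < limit").
 * Same arithmetic, with limit = last + 1 computed by the caller.
 * This is the loop shape that reportedly did vectorize.
 */
static void
accum_exclusive(int32_t *dig, const int16_t *digits, int32_t factor, int limit)
{
	assert(limit >= 0);		/* an empty range is allowed */
	for (int i2 = 0; i2 < limit; i2++)
		dig[i2] += factor * digits[i2];
}
```

Whether a given compiler vectorizes either form can be checked by inspecting the generated asm (for example with -O2 -S), or, for clang, with the -Rpass=loop-vectorize and -Rpass-missed=loop-vectorize remarks described in the LLVM vectorizer documentation linked above.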