Re: Auto-vectorization speeds up multiplication of large-precision numerics

From	Tom Lane
Subject	Re: Auto-vectorization speeds up multiplication of large-precision numerics
Date	7 September 2020, 16:07:15
Msg-id	1694682.1599494835@sss.pgh.pa.us
In response to	Re: Auto-vectorization speeds up multiplication of large-precision numerics (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses	Re: Auto-vectorization speeds up multiplication of large-precision numerics
List	pgsql-hackers

Amit Khandekar <amitdkhan.pg@gmail.com> writes:
> On Mon, 7 Sep 2020 at 11:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> BTW, poking at this further, it seems that the patch only really
>> works for gcc.  clang accepts the -ftree-vectorize switch, but
>> looking at the generated asm shows that it does nothing useful.
>> Which is odd, because clang does do loop vectorization.

> Hmm, yeah that's unfortunate. My guess is that the compiler would do
> vectorization only if 'i' is a constant, which is not true for our
> case.

No, they claim to handle variable trip counts, per

https://llvm.org/docs/Vectorizers.html#loops-with-unknown-trip-count

I experimented with a few different ideas such as adding restrict
decoration to the pointers, and eventually found that what works
is to write the loop termination condition as "i2 < limit"
rather than "i2 <= limit".  It took me a long time to think of
trying that, because it seemed ridiculously stupid.  But it works.

            regards, tom lane
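
For readers who want to see the two loop forms side by side, here is a minimal sketch in C. It is not the PostgreSQL source: the function names, parameter names, and element types are assumptions made for illustration. Both functions do the same multiply-accumulate work; the only difference is whether the bound is inclusive ("i2 <= limit") or exclusive ("i2 < count", with count equal to limit + 1). Whether a given compiler vectorizes either form depends on its version and flags, so treat the behaviour described in the message as something to verify, not as a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers, named for illustration only.
 * Each adds factor * digits[i2] into dig[i2] over a range of indexes.
 */

/* Inclusive bound: the last index processed is `limit`. */
void
add_scaled_inclusive(int32_t *dig, const int16_t *digits,
                     int32_t factor, int limit)
{
    for (int i2 = 0; i2 <= limit; i2++)
        dig[i2] += factor * digits[i2];
}

/* Exclusive bound: processes indexes 0 .. count - 1. */
void
add_scaled_exclusive(int32_t *dig, const int16_t *digits,
                     int32_t factor, int count)
{
    assert(count >= 0);
    for (int i2 = 0; i2 < count; i2++)
        dig[i2] += factor * digits[i2];
}
```

One way to check what the compiler did is to build with optimization and inspect the generated assembly, as described in the message. With clang, the optimization remark flag -Rpass=loop-vectorize can also report which loops were vectorized.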
