Re: WIP: Faster Expression Processing v4

From	Tom Lane
Subject	Re: WIP: Faster Expression Processing v4
Date	25 March 2017, 22:22:15
Msg-id	5768.1490458935@sss.pgh.pa.us
In reply to	Re: WIP: Faster Expression Processing v4 (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

More random musing ... have you considered making the jump-target fields
in expressions be relative rather than absolute indexes?  That is,
EEO_JUMP would look like
    op += (stepno); \
    EEO_DISPATCH(); \

instead of
    op = &state->steps[stepno]; \
    EEO_DISPATCH(); \

I have not carried out a full patch to make this work, but just making
that one change and examining the generated assembly code looks promising.
Instead of this
    movslq    40(%r14), %r8
    salq      $6, %r8
    addq      24(%rbx), %r8
    movq      %r8, %r14
    jmp       *(%r8)

we get this
    movslq    40(%r14), %rax
    salq      $6, %rax
    addq      %rax, %r14
    jmp       *(%r14)

which certainly looks like it ought to be faster.  Also, the real reason
I got interested in this at all is that with relative jumps, groups of
steps would be position-independent within the steps array, which would
enable some compile-time tricks that seem impractical with the current
definition.
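
To illustrate the position-independence point, here is a minimal sketch in C. It is not the executor's code: the Step struct, the opcodes, and the functions run_relative and run_group_at are all hypothetical names invented for this example, and it uses a plain switch rather than the EEO_* macros. It shows only that a group of steps whose jump field is a relative offset can be copied to any position in a steps array and run without fixing up its targets.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical miniature step format; NOT PostgreSQL's ExprEvalStep. */
typedef enum { OP_ADD, OP_JUMP_IF_ZERO, OP_DONE } OpCode;

typedef struct Step {
    OpCode opcode;
    int    arg;   /* OP_ADD: addend; OP_JUMP_IF_ZERO: offset in steps,
                   * relative to the current step */
} Step;

/* Evaluate steps starting at op.  Jumps need no array base pointer. */
static int run_relative(const Step *op, int acc)
{
    for (;;) {
        switch (op->opcode) {
        case OP_ADD:
            acc += op->arg;
            op++;
            break;
        case OP_JUMP_IF_ZERO:
            if (acc == 0)
                op += op->arg;      /* relative jump */
            else
                op++;
            break;
        case OP_DONE:
            return acc;
        }
    }
}

/* A self-contained group: when acc is zero, skip the "add 10" step. */
static const Step group[] = {
    { OP_JUMP_IF_ZERO, 2 },
    { OP_ADD, 10 },
    { OP_ADD, 1 },
    { OP_DONE, 0 },
};

#define GROUP_LEN ((int) (sizeof(group) / sizeof(group[0])))
#define MAX_STEPS 16

/* Copy the group to an arbitrary offset, with no fixups, and run it. */
static int run_group_at(int offset, int acc)
{
    Step steps[MAX_STEPS];

    assert(offset >= 0 && offset + GROUP_LEN <= MAX_STEPS);
    memcpy(&steps[offset], group, sizeof(group));
    return run_relative(&steps[offset], acc);
}
```

Calling run_group_at(0, 3) and run_group_at(7, 3) gives the same result, because nothing in the group refers to its absolute index. With absolute targets, the jump field would have to be rewritten each time the group was placed at a different offset.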

BTW, now that I've spent a bit of time looking at the generated assembly
code, I'm kind of disinclined to believe any arguments about how we have
better control over branch prediction with the jump-threading
implementation.  At least with current gcc (6.3.1 on Fedora 25) at -O2,
what I see is multiple places jumping to the same indirect jump
instruction :-(.  It's not a total disaster: as best I can tell, all the
uses of EEO_JUMP remain distinct.  But gcc has chosen to implement about
40 of the 71 uses of EEO_NEXT by jumping to the same couple of
instructions that increment the "op" register and then do an indirect
jump :-(.
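
For readers who have not looked at this style of interpreter, the following is a rough sketch of jump-threaded dispatch using the GNU C labels-as-values extension (supported by gcc and clang). It is an illustration only, not the patch's code: TStep, the T_* opcodes, run_threaded, and the DISPATCH and NEXT macros are hypothetical stand-ins that play roughly the roles of EEO_DISPATCH and EEO_NEXT. In the source, every handler ends in its own indirect jump; whether the compiler keeps those jumps distinct in the object code is the question raised above.

```c
#include <assert.h>

/* Hypothetical opcodes and step struct, for illustration only. */
enum { T_INC, T_DOUBLE, T_DONE };

typedef struct TStep {
    int opcode;
} TStep;

static int run_threaded(const TStep *op, int acc)
{
    /* GNU C extension: table of label addresses, indexed by opcode. */
    static void *const dispatch_table[] = {
        &&do_inc, &&do_double, &&do_done
    };

/* Each use is a separate indirect jump at the source level. */
#define DISPATCH()  goto *dispatch_table[op->opcode]
#define NEXT()      do { op++; DISPATCH(); } while (0)

    DISPATCH();

do_inc:
    acc += 1;
    NEXT();

do_double:
    acc *= 2;
    NEXT();

do_done:
    return acc;

#undef NEXT
#undef DISPATCH
}
```

Compiling a file like this with "gcc -O2 -S" and counting the indirect jmp instructions in the output is one way to see how many of the source-level dispatch sites survive as separate instructions; the answer may well differ between compiler versions and optimization levels.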

So it seems that we're at the mercy of gcc's whims as to which instruction
dispatches will be distinguishable to the hardware; which casts a very
dark shadow over any benchmarking-based arguments that X is better than Y
for branch prediction purposes.  Compiler version differences are likely
to matter a lot more than anything we do.
        regards, tom lane
