Re: JIT compiling with LLVM v9.0

From Konstantin Knizhnik
Subject Re: JIT compiling with LLVM v9.0
Date
Msg-id b84ae071-0931-e2eb-7207-f8d217246f98@postgrespro.ru
In reply to JIT compiling with LLVM v9.0  (Andres Freund <andres@anarazel.de>)
Responses Re: JIT compiling with LLVM v9.0  (Andres Freund <andres@anarazel.de>)
Re: JIT compiling with LLVM v9.0  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-hackers

On 24.01.2018 10:20, Andres Freund wrote:
> Hi,
>
> I've spent the last weeks working on my LLVM compilation patchset. In
> the course of that I *heavily* revised it. While still a good bit away
> from committable, it's IMO definitely not a prototype anymore.
>
> There's too many small changes, so I'm only going to list the major
> things. A good bit of that is new. The actual LLVM IR emissions itself
> hasn't changed that drastically.  Since I've not described them in
> detail before I'll describe from scratch in a few cases, even if things
> haven't fully changed.
>
>
> == JIT Interface ==
>
> To avoid emitting code in very small increments (increases mmap/mremap
> rw vs exec remapping, compile/optimization time), code generation
> doesn't happen for every single expression individually, but in batches.
>
> The basic object to emit code via is a jit context created with:
>    extern LLVMJitContext *llvm_create_context(bool optimize);
> which in case of expression is stored on-demand in the EState. For other
> usecases that might not be the right location.
>
> To emit LLVM IR (ie. the portable code that LLVM then optimizes and
> generates native code for), one gets a module from that with:
>    extern LLVMModuleRef llvm_mutable_module(LLVMJitContext *context);
>
> to which "arbitrary" numbers of functions can be added. In case of
> expression evaluation, we get the module once for every expression, and
> emit one function for the expression itself, and one for every
> applicable/referenced deform function.
>
> As explained above, we do not want to emit code immediately from within
> ExecInitExpr()/ExecReadyExpr(). To facilitate that readying a JITed
> expression sets the function to callback, which gets the actual native
> function on the first actual call.  That allows to batch together the
> generation of all native functions that are defined before the first
> expression is evaluated - in a lot of queries that'll be all.
>
> Said callback then calls
>    extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
> which'll emit code for the "in progress" mutable module if necessary,
> and then searches all generated functions for the name. The names are
> created via
>    extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
> currently "evalexpr" and "deform" with a generation and counter suffix.
>
> Currently expressions which do not have access to an EState, basically
> all "parent" less expressions, aren't JIT compiled. That could be
> changed, but I so far do not see a huge need.

Hi,

As far as I understand, native code is now always generated for all
supported expressions, and individually by each backend.
I wonder whether it would be useful to put more effort into understanding
when compilation to native code pays off and when interpretation is better.
For example, many JIT-compiled languages such as Lua use traces: the
code is first interpreted and a trace is recorded. If the same trace is
followed more than N times, native code is generated for it.
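
To make the idea concrete, here is a minimal sketch of such a
counter-based switch from interpretation to native code. All names
(HotExpr, hot_expr_eval, JIT_HOT_THRESHOLD, the demo_* helpers) are
hypothetical and not part of the patchset; the "compile" callback only
stands in for the real LLVM code generation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this exists in the patchset. */
#define JIT_HOT_THRESHOLD 3     /* the "N" above; the value is arbitrary */

typedef int (*ExprFn) (int arg);

typedef struct HotExpr
{
    ExprFn      interpret;      /* slow path, always available */
    ExprFn      native;         /* NULL until native code was generated */
    ExprFn      (*compile) (void);  /* stand-in for LLVM code generation */
    unsigned    calls;          /* interpreted evaluations so far */
} HotExpr;

/* Interpret until the expression ran more than N times, then compile once. */
static int
hot_expr_eval(HotExpr *e, int arg)
{
    assert(e != NULL && e->interpret != NULL);

    if (e->native != NULL)
        return e->native(arg);

    if (++e->calls > JIT_HOT_THRESHOLD && e->compile != NULL)
    {
        e->native = e->compile();
        if (e->native != NULL)
            return e->native(arg);
    }
    return e->interpret(arg);
}

/* Demo helpers: both paths compute the same result, x + 1. */
static int  demo_compiles = 0;
static int  demo_interpret(int x) { return x + 1; }
static int  demo_native(int x) { return x + 1; }
static ExprFn demo_compile(void) { demo_compiles++; return demo_native; }
```

With a threshold of 3, the first three evaluations are interpreted and
the fourth one triggers the (single) compilation.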

In the context of a DBMS executor, it seems clear that only frequently
executed or expensive queries are worth compiling.
So we could use the estimated plan cost and the number of query executions
as simple criteria for JIT-ing a query.
Maybe simple queries (with a small cost) should be compiled only for
prepared statements...
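
Such criteria could look roughly like the sketch below. The JitPolicy
struct, its fields and should_jit() are assumptions made up for
illustration, and any concrete threshold values would have to come from
measurements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy knobs; not existing settings of the patchset. */
typedef struct JitPolicy
{
    double      min_plan_cost;  /* compile at once at or above this cost */
    unsigned    min_executions; /* cheap prepared statements: compile
                                 * after this many executions */
} JitPolicy;

/*
 * Expensive plans are always compiled. Cheap plans are compiled only
 * if they belong to a prepared statement that ran often enough.
 */
static bool
should_jit(const JitPolicy *policy, double plan_cost,
           bool is_prepared, unsigned executions)
{
    assert(policy != NULL);

    if (plan_cost >= policy->min_plan_cost)
        return true;
    return is_prepared && executions >= policy->min_executions;
}
```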

Another question is whether it is sensible to redundantly do expensive
work (LLVM compilation) in all backends.
This question relates to a shared prepared statement cache. But even
without such a cache, it seems possible to use some signature of the
compiled expression as the library name and so allow these libraries
to be shared between backends. Before starting code generation,
ExecReadyCompiledExpr could first build the signature and check whether
the corresponding library is already present.
It would also be easier to control the space used by compiled libraries
in this case.
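
A rough sketch of the lookup, assuming the expression can be serialized
to a string and that libraries live in one shared directory. The
function names, the "jit_<signature>.so" naming scheme and the choice of
FNV-1a as the signature are all my assumptions, used only to illustrate
the idea.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>             /* access(), POSIX */

/* 64-bit FNV-1a over the serialized expression (hypothetical signature). */
static uint64_t
expr_signature(const char *serialized_expr)
{
    uint64_t    h = UINT64_C(0xcbf29ce484222325);   /* FNV offset basis */

    assert(serialized_expr != NULL);
    for (const unsigned char *p = (const unsigned char *) serialized_expr;
         *p != '\0'; p++)
    {
        h ^= *p;
        h *= UINT64_C(0x100000001b3);               /* FNV prime */
    }
    return h;
}

/* Build "<dir>/jit_<signature>.so"; returns false if buf is too small. */
static bool
jit_library_path(char *buf, size_t buflen, const char *dir,
                 const char *serialized_expr)
{
    int         n = snprintf(buf, buflen, "%s/jit_%016" PRIx64 ".so",
                             dir, expr_signature(serialized_expr));

    return n > 0 && (size_t) n < buflen;
}

/* True if a readable library for this expression already exists. */
static bool
jit_library_is_cached(const char *dir, const char *serialized_expr)
{
    char        path[1024];

    if (!jit_library_path(path, sizeof(path), dir, serialized_expr))
        return false;
    return access(path, R_OK) == 0;
}
```

A real implementation would also have to handle hash collisions and
include everything that affects code generation (for example the LLVM
version and optimization settings) in the signature.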

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


