Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?

From	Tom Lane
Subject	Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
Date	July 9, 2015, 03:39:00
Msg-id	15702.1436413118@sss.pgh.pa.us
In reply to	Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this? (andres@anarazel.de (Andres Freund))
Replies	Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?; Re: Hmmm... why does CPU-intensive pl/pgsql code parallelise so badly when queries parallelise fine? Anyone else seen this?
List	pgsql-performance

andres@anarazel.de (Andres Freund) writes:
> On 2015-07-08 15:38:24 -0700, Craig James wrote:
>> From my admittedly naive point of view, it's hard to see why any of this
>> matters. I have functions that do purely CPU-intensive mathematical
>> calculations ... you could imagine something like is_prime(N) that
>> determines if N is a prime number. I have eight clients that connect to
>> eight backends. Each client issues an SQL command like, "select
>> is_prime(N)" where N is a simple number.

> I mostly replied to Merlin's general point (additionally in the context of
> plpgsql).

> But I have a hard time seeing that postgres would be the bottleneck for an
> is_prime() function (or something with similar characteristics) that's
> written in C where the average runtime is more than, say, a couple
> thousand cycles.  I'd like to see a profile of that.

But that was not the case that Graeme was complaining about.  He's talking
about simple-arithmetic-and-looping written in plpgsql, in a volatile
function that is going to take a new snapshot for every statement, even if
that's only "n := n+1".  So it's going to spend a substantial fraction of
its runtime banging on the ProcArray, and that doesn't scale.  If you
write your is_prime function purely in plpgsql, and don't bother to mark
it nonvolatile, *it will not scale*.  It'll be slow even in single-thread
terms, but it'll be particularly bad if you're saturating a multicore
machine with it.
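
To make that concrete, here is a minimal sketch of what "mark it
nonvolatile" could look like.  The function body is purely illustrative
(the thread only names is_prime(N), nobody posted an implementation), and
the trial-division loop is an assumption chosen because it is exactly the
kind of simple-arithmetic-and-looping code under discussion.  The only
line that matters for the snapshot behaviour is the IMMUTABLE marker;
without it the function gets the default, VOLATILE.

```sql
-- Hypothetical example: the body is an assumption for illustration,
-- not code taken from the thread.
CREATE OR REPLACE FUNCTION is_prime(n bigint)
RETURNS boolean
LANGUAGE plpgsql
IMMUTABLE   -- result depends only on n; leaving this out means VOLATILE
AS $$
DECLARE
    d bigint := 3;
BEGIN
    IF n < 2 THEN
        RETURN false;
    ELSIF n < 4 THEN
        RETURN true;            -- n is 2 or 3
    ELSIF n % 2 = 0 THEN
        RETURN false;           -- even and greater than 2
    END IF;

    -- Trial division by odd candidates up to sqrt(n).
    -- "d <= n / d" is the same test as "d * d <= n" for positive
    -- integers, but cannot overflow bigint.
    WHILE d <= n / d LOOP
        IF n % d = 0 THEN
            RETURN false;
        END IF;
        d := d + 2;             -- the kind of statement that would take
                                -- a fresh snapshot in a VOLATILE function
    END LOOP;

    RETURN true;
END;
$$;

-- Each client would then issue something like:
SELECT is_prime(97);

-- Check how an existing function is marked:
-- 'i' = immutable, 's' = stable, 'v' = volatile
SELECT proname, provolatile FROM pg_proc WHERE proname = 'is_prime';
```

The marker is only legitimate because this function reads no tables and
has no side effects.  A function that looks at database state but does
not modify it would be a candidate for STABLE instead, and one that
modifies data has to stay VOLATILE.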

One of my Salesforce colleagues has been looking into ways that we could
decide to skip the per-statement snapshot acquisition even in volatile
functions, if we could be sure that a particular statement isn't going to
do anything that would need a snapshot.  Now, IMO that doesn't really do
much for properly written plpgsql; but there's an awful lot of bad plpgsql
code out there, and it can make a huge difference for that.

            regards, tom lane
