Re: disfavoring unparameterized nested loops

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: disfavoring unparameterized nested loops
Дата
Msg-id CAH2-WznOY=SkO4L=DZAZbEC3Jo4fbG_DCY7QOSt7mF=N+eLSjA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: disfavoring unparameterized nested loops  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Sun, Oct 2, 2022 at 3:43 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Sep 30, 2022 at 2:24 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > It's not just that the risks are ludicrously high, of course. The
> > potential benefits must *also* be very low. It's both factors,
> > together.
>
> Hmm, maybe. But it also wouldn't surprise me very much if someone can
> come up with a test case where a nested loop with a single row (or
> maybe no rows) on one side or the other and it's significantly faster
> than any alternative plan.

That's certainly possible, but wouldn't the difference all come from
fixed startup costs? If we're talking about a single row, with a
minimal test case, then the potential downside of this more
conservative strategy might indeed amount to something like a 2x or 3x
slowdown, if we look at it in isolation. But why measure it that way?
I think that absolute differences like milliseconds of execution time
are much more relevant.

Real production databases have many queries with very diverse
characteristics -- there is a lot going on at any given moment. The
proportion of queries that will be affected either way by avoiding
unparamaterized nested loop joins is probably going to be very small.
Nobody is going to notice if only a small subset or all queries are
maybe 1 ms or 2 ms slower. As long as it's well within the margin of
noise in 100% of all cases, it really shouldn't matter.

AFAICT the choice is one of "bounded, low upside versus unbounded,
high downside".

> I believe, though, that even if such cases
> exist, they are probably relatively few in number compared to the
> cases where parameterized nested loops hurt, and the savings are
> probably small compared to the multiple-orders-of-magnitude slowdowns
> that you can get when a nested loop goes bad. But they might still be
> relatively large -- 2x, 3x? -- in absolute terms.

I suspect it won't even matter if disallowing unparamaterized nested
loop joins loses on average.

I am reminded of this:
https://en.wikipedia.org/wiki/St._Petersburg_paradox#Expected_utility_theory

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Justin Pryzby
Дата:
Сообщение: Re: [RFC] building postgres with meson - v13
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Question: test "aggregates" failed in 32-bit machine