Re: [HACKERS] <> join selectivity estimate question

From	Thomas Munro
Subject	Re: [HACKERS] <> join selectivity estimate question
Date	18 March 2017, 04:49:13
Msg-id	CAEepm=11BiYUkgXZNzMtYhXh4S3a9DwUP8O+F2_ZPeGzzJFPbw@mail.gmail.com
In response to	Re: [HACKERS] <> join selectivity estimate question (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [HACKERS] <> join selectivity estimate question (Thomas Munro <thomas.munro@enterprisedb.com>) Re: [HACKERS] <> join selectivity estimate question (Dilip Kumar <dilipbalaut@gmail.com>)
List	pgsql-hackers

On Sat, Mar 18, 2017 at 6:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> After a bit more thought, it seems like the bug here is that "the
> fraction of the LHS that has a non-matching row" is not one minus
> "the fraction of the LHS that has a matching row".  In fact, in
> this example, *all* LHS rows have both matching and non-matching
> RHS rows.  So the problem is that neqjoinsel is doing something
> that's entirely insane for semijoin cases.
>
> It would not be too hard to convince me that neqjoinsel should
> simply return 1.0 for any semijoin/antijoin case, perhaps with
> some kind of discount for nullfrac.  Whether or not there's an
> equal row, there's almost always going to be non-equal row(s).
> Maybe we can think of a better implementation but that seems
> like the zero-order approximation.

Right.  If I temporarily hack neqjoinsel() thus:

        result = 1.0 - result;
+
+       if (jointype == JOIN_SEMI)
+               result = 1.0;
+
        PG_RETURN_FLOAT8(result);
 }

... then I obtain sensible row estimates and the following speedups
for TPCH Q21:

  8 workers = 8.3s -> 7.8s
  7 workers = 8.2s -> 7.9s
  6 workers = 8.5s -> 8.2s
  5 workers = 8.9s -> 8.5s
  4 workers = 9.5s -> 9.1s
  3 workers = 39.7s -> 9.9s
  2 workers = 36.9s -> 11.7s
  1 worker  = 38.2s -> 15.0s
  0 workers = 47.9s -> 24.7s

The plan is similar to the good plan from before even at lower worker
counts, but slightly better because the aggregation has been pushed
under the Gather node.  See attached.
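
To make the quoted point concrete, here is a minimal toy sketch (not
PostgreSQL code) that counts, by brute force, the fraction of LHS values
a semijoin would keep under "=" and under "<>".  The names
semi_fraction() and neq_semi_estimate() are hypothetical, the data are
plain ints without NULLs, and the nullfrac discount is only one assumed
reading of the "discount for nullfrac" idea, not what any patch does.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fraction of LHS values that have at least one RHS value for which
 * "lhs = rhs" (want_equal != 0) or "lhs <> rhs" (want_equal == 0) holds.
 * This is the fraction of LHS rows a semijoin on that operator keeps.
 * Toy model: plain ints, no NULLs.
 */
double
semi_fraction(const int *lhs, size_t nlhs,
              const int *rhs, size_t nrhs, int want_equal)
{
    size_t      kept = 0;

    assert(nlhs > 0);
    for (size_t i = 0; i < nlhs; i++)
    {
        for (size_t j = 0; j < nrhs; j++)
        {
            /* Does this RHS value satisfy the chosen operator? */
            if ((lhs[i] == rhs[j]) == (want_equal != 0))
            {
                kept++;
                break;          /* semijoin: one qualifying row is enough */
            }
        }
    }
    return (double) kept / (double) nlhs;
}

/*
 * Hypothetical zero-order estimate for a <> semijoin: assume every
 * non-NULL LHS row finds some non-equal RHS row.  The nullfrac discount
 * is an assumption made for illustration.
 */
double
neq_semi_estimate(double lhs_nullfrac)
{
    assert(lhs_nullfrac >= 0.0 && lhs_nullfrac <= 1.0);
    return 1.0 - lhs_nullfrac;
}
```

With lhs = rhs = {1, 2, 3}, semi_fraction() returns 1.0 for both
operators: every LHS value has a matching and a non-matching RHS value,
so taking one minus the "=" fraction would wrongly give 0 for "<>".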

-- 
Thomas Munro
http://www.enterprisedb.com


Attachments

hacked_q21_4workers.txt
