Re: neqjoinsel versus "refresh materialized view concurrently"

From: Jeff Janes
Subject: Re: neqjoinsel versus "refresh materialized view concurrently"
Date:
Msg-id: CAMkU=1ywGwT9n0dAVC+8Ssys3CjtcVak2WU6U_Up4-ocAOx=PQ@mail.gmail.com
In reply to: Re: neqjoinsel versus "refresh materialized view concurrently"  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses: Re: neqjoinsel versus "refresh materialized view concurrently"  (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers
On Tue, Mar 13, 2018 at 4:57 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
On Wed, Mar 14, 2018 at 12:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> There is a fundamental and complicated estimation problem lurking here
>> of course and I'm not sure what to think about that yet.  Maybe there
>> is a very simple fix for this particular problem:
>
> Ah, I see you thought of the same hack I did.
>
> I think this may actually be a good fix, and here's the reason: this plan
> is in fact being driven entirely off planner default estimates, because
> we don't have any estimation code that knows what to do with
> "wholerowvar *= wholerowvar".  I'm suspicious that we could drop the
> preceding ANALYZE as being a waste of cycles, except maybe it's finding
> out the number of rows for us.  In any case, LIMIT 1 is only a good idea
> to the extent that the planner knows what it's doing, and this is an
> example where it demonstrably doesn't and won't any time soon.

Hmm.  I wonder if the ANALYZE might have been needed to avoid the
nested loop plan at some point in history.

Here's a patch to remove LIMIT 1, which fixes the plan for Jeff's test
scenario and some smaller and larger examples I tried.  The query is
already executed with SPI_execute(..., 1) so it'll give up after one
row anyway.  The regression test includes a case that causes a row to
be produced here and that's passing ('ERROR:  new data for
materialized view "mvtest_mv" contains duplicate rows without any null
columns').
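For readers without the patch at hand, the internal duplicate-check query in matview.c has roughly this shape (paraphrased from memory, with the temp-table name replaced by a placeholder); the patch drops the trailing LIMIT 1, since the row cap already comes from the count argument of SPI_execute(..., 1):

```sql
-- Approximate shape of the query built by refresh_by_match_merge();
-- <temptable> stands in for the generated temp table holding the new
-- matview contents.  The whole-row OPERATOR(pg_catalog.*=) comparison
-- is the "wholerowvar *= wholerowvar" the planner can't estimate.
SELECT newdata FROM <temptable> newdata
WHERE newdata IS NOT NULL AND EXISTS
  (SELECT 1 FROM <temptable> newdata2
   WHERE newdata2 IS NOT NULL
     AND newdata2 OPERATOR(pg_catalog.*=) newdata
     AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid);
```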

Is there any good way to make the regression tests fail if the plan reverts to the bad one?  The only thing I can think of would be to make the table bigger, so that the regression test becomes "noticeably slower", but that is pretty vague and not user friendly: the test would still formally pass, and we would just have to hope it is slow enough for someone to investigate.
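For what it's worth, other parts of the regression suite pin plan shapes with EXPLAIN (COSTS OFF), so a planner regression shows up as an output diff rather than as unexplained slowness.  The internal SPI query can't be EXPLAINed directly, but a hand-written equivalent against the test's base table might serve (table name below is illustrative, not from the thread):

```sql
-- Illustrative only: mimic the internal duplicate-check query against
-- a regression-test table and diff the plan shape in expected output.
EXPLAIN (COSTS OFF)
SELECT newdata FROM mvtest_foo newdata
WHERE newdata IS NOT NULL AND EXISTS
  (SELECT 1 FROM mvtest_foo newdata2
   WHERE newdata2 IS NOT NULL
     AND newdata2 OPERATOR(pg_catalog.*=) newdata
     AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid);
```

Whether the chosen plan is stable enough across platforms to lock into expected output is a separate question, of course.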
 
Cheers,

Jeff
