Обсуждение: Explain [Analyze] produces parallel scan for select Into table statements.

Поиск
Список
Период
Сортировка

Explain [Analyze] produces parallel scan for select Into table statements.

От
Mithun Cy
Дата:
Hi All,

Explain [Analyze] Select Into table..... produces the plan which uses parallel scans.

Test:

create table table1 (n int);
insert into table1 values (generate_series(1,5000000));
analyze table1;

set parallel_tuple_cost=0;

set max_parallel_degree=3;

postgres=# explain select into table2 from table1;

                                  QUERY PLAN                                   

-------------------------------------------------------------------------------

 Gather  (cost=1000.00..39253.03 rows=5000000 width=0)

   Number of Workers: 3

   ->  Parallel Seq Scan on table1  (cost=0.00..38253.03 rows=1612903 width=0)

(3 rows)

-----------------------------

So Explain Analyze Fails.

postgres=#  explain analyze select into table2 from table1;

ERROR:  cannot insert tuples during a parallel operation

STATEMENT:  explain analyze select into table2 from table1;

But actual execution is successful.

postgres=# select into table2 from table1;

SELECT 5000000

Reason is in ExplainOneQuery we unconditionally 

pass CURSOR_OPT_PARALLEL_OK to pg_plan_query even if query might be from

CreateTableAs/ SelectInto. Whereas in ExecCreateTableAs it is always 0.

Possible Fix:

I tried to make a patch to fix this. Now in ExplainOneQuery if into clause is

defined then parallel plans are disabled as similar to their execution.

--
Thanks and Regards
Mithun C Y

Вложения

Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Amit Kapila
Дата:
On Wed, Mar 9, 2016 at 8:18 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
>
> Hi All,
>
> Explain [Analyze] Select Into table..... produces the plan which uses parallel scans.
>
> Possible Fix:
>
> I tried to make a patch to fix this. Now in ExplainOneQuery if into clause is
>
> defined then parallel plans are disabled as similar to their execution.
>


- /* plan the query */

- plan = pg_plan_query(query, CURSOR_OPT_PARALLEL_OK, params);

+ /* 

+ * plan the query.

+ * Note: If Explain is for CreateTableAs / SelectInto Avoid parallel

+ *       plans.

+ */

+ plan = pg_plan_query(query, into ? 0:CURSOR_OPT_PARALLEL_OK, params);



There should be a white space between 0:CURSOR_OPT_PARALLEL_OK.  Also I don't see this comment is required as similar other usage doesn't have any such comment.  Fixed these two points in the attached patch.

In general, the patch looks good to me and solves the problem mentioned.  I have ran the regression tests with force_parallel_mode and doesn't see any problem.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Вложения

Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Robert Haas
Дата:
On Thu, Mar 10, 2016 at 4:43 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> There should be a white space between 0:CURSOR_OPT_PARALLEL_OK.  Also I
> don't see this comment is required as similar other usage doesn't have any
> such comment.  Fixed these two points in the attached patch.
>
> In general, the patch looks good to me and solves the problem mentioned.  I
> have ran the regression tests with force_parallel_mode and doesn't see any
> problem.

I guess there must not be an occurrence of this pattern in the
regression tests, or previous force_parallel_mode testing would have
found this problem.  Perhaps this patch should add one?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Mithun Cy
Дата:
On Thu, Mar 10, 2016 at 9:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>I guess there must not be an occurrence of this pattern in the
>regression tests, or previous force_parallel_mode testing would have
>found this problem.  Perhaps this patch should add one?

I have added the test to select_into.sql. Added Explain select into statement.
Explain Analyze produces planning time and execution time even with TIMING OFF
so not adding the same to regress tests.

--
Thanks and Regards
Mithun C Y

Вложения

Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Amit Kapila
Дата:
On Fri, Mar 11, 2016 at 3:34 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
>
> On Thu, Mar 10, 2016 at 9:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >I guess there must not be an occurrence of this pattern in the
> >regression tests, or previous force_parallel_mode testing would have
> >found this problem.  Perhaps this patch should add one?
>
> I have added the test to select_into.sql. Added Explain select into statement.
>

I don't see how this test will fail with force_parallel_mode=regress and max_parallel_degree > 0 even without the patch proposed to fix the issue in hand.  In short, I don't think this test would have caught the issue, so I don't see much advantage in adding such a test.  Even if we want to add such a test case, I think as proposed this will substantially increase the timing for "Select Into" test which might not be an acceptable test case addition especially for testing one corner case.
 
>
> Explain Analyze produces planning time and execution time even with TIMING OFF
> so not adding the same to regress tests.
>

Yeah, that makes the addition of test for this functionality difficult.  Robert, do you have any idea what kind of test would have caught this issue?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Mithun Cy
Дата:


On Sat, Mar 12, 2016 at 12:28 PM, Amit Kapila <amit.kapila16@gmail.com> wrote
>I don't see how this test will fail with force_parallel_mode=regress and max_parallel_degree > 0 even without the patch proposed to fix the issue in >hand.  In short, I don't think this test would have caught the issue, so I don't see much advantage in adding such a test.  Even if we want to add such a >test case, I think as proposed this will substantially increase the timing for "Select Into" test which might not be an acceptable test case addition >especially for testing one corner case.


Without above patch the make installcheck fails for select_into.sql with below diff

when
force_parallel_mode = on
max_parallel_degree = 3

diff results/select_into.out expected/select_into.out

104,110c104,107

< QUERY PLAN

< ------------------------

< Gather

< Number of Workers: 1

< Single Copy: true

< -> Seq Scan on mt1

< (4 rows)

---

> QUERY PLAN

> -----------------

> Seq Scan on mt1

> (1 row)


Again with postgresql.conf non default settings. 

force_parallel_mode = on
max_parallel_degree = 3
parallel_tuple_cost = 0

[mithun@localhost regress]$ diff results/select_into.out expected/select_into.out

104,109c104,107

< QUERY PLAN

< --------------------------------

< Gather

< Number of Workers: 3

< -> Parallel Seq Scan on mt1

< (3 rows)

---

> QUERY PLAN

> -----------------

> Seq Scan on mt1

> (1 row)

To reduce the time of execution I can set the generate_series parameter to 500, which is fast in my machine and also fails with above diff but this time only one worker is assigned as per plan.

--
Thanks and Regards
Mithun C Y

Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Amit Kapila
Дата:
On Sat, Mar 12, 2016 at 2:02 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
>
>
>
> On Sat, Mar 12, 2016 at 12:28 PM, Amit Kapila <amit.kapila16@gmail.com> wrote
> >I don't see how this test will fail with force_parallel_mode=regress and max_parallel_degree > 0 even without the patch proposed to fix the issue in >hand.  In short, I don't think this test would have caught the issue, so I don't see much advantage in adding such a test.  Even if we want to add such a >test case, I think as proposed this will substantially increase the timing for "Select Into" test which might not be an acceptable test case addition >especially for testing one corner case.
>
>
> Without above patch the make installcheck fails for select_into.sql with below diff
>
> when
> force_parallel_mode = on
> max_parallel_degree = 3
>

With force_parallel_mode=on, I could see many other failures as well.  I think it is better to have test, which tests this functionality with force_parallel_mode=regress


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Mithun Cy
Дата:


On Sat, Mar 12, 2016 at 2:32 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>With force_parallel_mode=on, I could see many other failures as well.  I think it is better to have test, which tests this functionality with >force_parallel_mode=regress

as per user manual.
Setting this value to regress has all of the same effects as setting it to on plus some additional effect that are intended to facilitate automated regression testing. Normally, messages from a parallel worker are prefixed with a context line, but a setting of regress suppresses this to guarantee reproducible results. Also, the Gather nodes added to plans by this setting are hidden from the EXPLAIN output so that the output matches what would be obtained if this setting were turned off

And my test is for EXPLAIN statements. I think under regress mode it will never fail even if parallel scan is used as per above statement.
--
Thanks and Regards
Mithun C Y

Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Mithun Cy
Дата:


On Sat, Mar 12, 2016 at 2:32 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>With force_parallel_mode=on, I could see many other failures as well.  I think it is better to have test, which tests this functionality with >force_parallel_mode=regress

as per user manual.
Setting this value to regress has all of the same effects as setting it to on plus some additional effect that are intended to facilitate automated regression testing. Normally, messages from a parallel worker are prefixed with a context line, but a setting of regress suppresses this to guarantee reproducible results. Also, the Gather nodes added to plans by this setting are hidden from the EXPLAIN output so that the output matches what would be obtained if this setting were turned off

And my test is for EXPLAIN statements. I think under regress mode it will never fail even if parallel scan is used as per above statement.
--
Thanks and Regards
Mithun C Y

Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Mithun Cy
Дата:


On Sat, Mar 12, 2016 at 2:32 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>With force_parallel_mode=on, I could see many other failures as well.  I think it is better to have test, which tests this functionality with >force_parallel_mode=regress

as per user manual.
Setting this value to regress has all of the same effects as setting it to on plus some additional effect that are intended to facilitate automated regression testing. Normally, messages from a parallel worker are prefixed with a context line, but a setting of regress suppresses this to guarantee reproducible results. Also, the Gather nodes added to plans by this setting are hidden from the EXPLAIN output so that the output matches what would be obtained if this setting were turned off

And my test is for EXPLAIN statements. I think under regress mode it will never fail even if parallel scan is used as per above statement.
--
Thanks and Regards
Mithun C Y

Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Mithun Cy
Дата:
Sorry there was some issue with my mail settings same mail got set more than once.
--
Thanks and Regards
Mithun C Y

Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Amit Kapila
Дата:
On Sat, Mar 12, 2016 at 7:11 PM, Mithun Cy <mithun.cy@enterprisedb.com> wrote:
>
>
>
> On Sat, Mar 12, 2016 at 2:32 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >With force_parallel_mode=on, I could see many other failures as well.  I think it is better to have test, which tests this functionality with >force_parallel_mode=regress
>
> as per user manual.
> Setting this value to regress has all of the same effects as setting it to on plus some additional effect that are intended to facilitate automated 
> regression testing.
>

Yes, that is the only reason I mentioned that it better to have a test which can be checked in automated way and I understand that the way you have written test using Explain won't work in automated way, so not sure if it is good idea to add such a test.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Robert Haas
Дата:
On Sat, Mar 12, 2016 at 1:58 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Yeah, that makes the addition of test for this functionality difficult.
> Robert, do you have any idea what kind of test would have caught this issue?

Yep.  Committed with that test:

DO $$
BEGIN      EXECUTE 'EXPLAIN ANALYZE SELECT * INTO TABLE easi FROM int8_tbl';
END$$;

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Explain [Analyze] produces parallel scan for select Into table statements.

От
Amit Kapila
Дата:
On Tue, Mar 15, 2016 at 5:22 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Sat, Mar 12, 2016 at 1:58 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Yeah, that makes the addition of test for this functionality difficult.
> > Robert, do you have any idea what kind of test would have caught this issue?
>
> Yep.  Committed with that test:
>

Nice.  Thanks!

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com