Обсуждение: Re: [GENERAL] I feel the need for speed. What am I doing wrong?

Поиск
Список
Период
Сортировка

Re: [GENERAL] I feel the need for speed. What am I doing wrong?

От
"Dann Corbit"
Дата:
> -----Original Message-----
> From: Jean-Luc Lachance [mailto:jllachan@nsd.ca]
> Sent: Tuesday, January 07, 2003 2:43 PM
> To: Tom Lane
> Cc: Dann Corbit; Nigel J. Andrews;
> pgsql-hackers@postgresql.org; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] [HACKERS] I feel the need for speed.
> What am I doing wrong?
>
>
> There is a construct that most people forget for that kind of query:
>
> select "RT_REC_KEY", "cnxarraycolumn", "CRC" from a
> except
> select "RT_REC_KEY", "cnxarraycolumn", "CRC" from b;
>
> simple.

I should have mentioned that I am not using the latest version of
PostgreSQL.  I am using 7.1.3.  Perhaps this stuff has been repaired in
newer versions.  Possibly, there is a reason that people forget to use
it (at least on PostgreSQL 7.1.3):

connxdatasync=> SET enable_seqscan = 0;
SET VARIABLE
connxdatasync=>
connxdatasync=> SELECT a."RT_REC_KEY", a."cnxarraycolumn", a."CRC" FROM
connxdatasync-> "CNX_DS_53_SIS_STU_OPT_FEE_TB" a
connxdatasync-> LEFT OUTER JOIN
connxdatasync-> "CNX_DS2_53_SIS_STU_OPT_FEE_TB" b
connxdatasync->         ON ( a."RT_REC_KEY" = b."RT_REC_KEY" AND
a."cnxarraycolumn" = b."cnxarraycolumn")
connxdatasync-> WHERE b.oid IS NULL ;
 RT_REC_KEY | cnxarraycolumn | CRC
------------+----------------+-----
(0 rows)

1:55.12 to complete

connxdatasync=>
connxdatasync=> select "RT_REC_KEY", "cnxarraycolumn", "CRC" from
"CNX_DS_53_SIS_STU_OPT_FEE_TB" a
connxdatasync-> except
connxdatasync-> select "RT_REC_KEY", "cnxarraycolumn", "CRC" from
"CNX_DS2_53_SIS_STU_OPT_FEE_TB" b;
 RT_REC_KEY | cnxarraycolumn | CRC
------------+----------------+-----
(0 rows)

12:55.25 to complete: More than 6 times slower to complete.

connxdatasync=>
connxdatasync=> EXPLAIN
connxdatasync-> select "RT_REC_KEY", "cnxarraycolumn", "CRC" from
"CNX_DS_53_SIS_STU_OPT_FEE_TB" a
connxdatasync-> except
connxdatasync-> select "RT_REC_KEY", "cnxarraycolumn", "CRC" from
"CNX_DS2_53_SIS_STU_OPT_FEE_TB" b;
NOTICE:  QUERY PLAN:

SetOp Except  (cost=202028537.97..202120623.90 rows=1227812 width=24)
  ->  Sort  (cost=202028537.97..202028537.97 rows=12278124 width=24)
        ->  Append  (cost=100000000.00..200225099.24 rows=12278124
width=24)
              ->  Subquery Scan *SELECT* 1
(cost=100000000.00..100112549.62 rows=6139062 width=24)
                    ->  Seq Scan on CNX_DS_53_SIS_STU_OPT_FEE_TB a
(cost=100000000.00..100112549.62 rows=6139062 width=24)
              ->  Subquery Scan *SELECT* 2
(cost=100000000.00..100112549.62 rows=6139062 width=24)
                    ->  Seq Scan on CNX_DS2_53_SIS_STU_OPT_FEE_TB b
(cost=100000000.00..100112549.62 rows=6139062 width=24)

EXPLAIN
connxdatasync=>


Re: [GENERAL] I feel the need for speed. What am I doing wrong?

От
johnnnnnn
Дата:
On Tue, Jan 07, 2003 at 03:10:06PM -0800, Dann Corbit wrote:
> NOTICE:  QUERY PLAN:
> 
> SetOp Except  (cost=202028537.97..202120623.90 rows=1227812 width=24)
>   ->  Sort  (cost=202028537.97..202028537.97 rows=12278124 width=24)
>         ->  Append  (cost=100000000.00..200225099.24 rows=12278124
> width=24)
>               ->  Subquery Scan *SELECT* 1
> (cost=100000000.00..100112549.62 rows=6139062 width=24)
>                     ->  Seq Scan on CNX_DS_53_SIS_STU_OPT_FEE_TB a
> (cost=100000000.00..100112549.62 rows=6139062 width=24)
>               ->  Subquery Scan *SELECT* 2
> (cost=100000000.00..100112549.62 rows=6139062 width=24)
>                     ->  Seq Scan on CNX_DS2_53_SIS_STU_OPT_FEE_TB b
> (cost=100000000.00..100112549.62 rows=6139062 width=24)
> 
> EXPLAIN

Those big round numbers suggest that you haven't run vacuum analyze on
all of your tables. Since PostgreSQL uses a cost-based optimizer, you
do actually have to give it some idea of what things will cost before
it can give you an appropriate plan.

Reference for your version:
http://www14.us.postgresql.org/users-lounge/docs/7.1/reference/sql-vacuum.html

-johnnnnnnnnn


Re: [GENERAL] I feel the need for speed. What am I doing wrong?

От
Tom Lane
Дата:
johnnnnnn <john@phaedrusdeinus.org> writes:
> On Tue, Jan 07, 2003 at 03:10:06PM -0800, Dann Corbit wrote:
>> ->  Seq Scan on CNX_DS_53_SIS_STU_OPT_FEE_TB a
>> (cost=100000000.00..100112549.62 rows=6139062 width=24)

> Those big round numbers suggest that you haven't run vacuum analyze on
> all of your tables.

No; the 100000000.00 is a tipoff that he's set enable_seqscan off, but
the system is using a seqscan anyway because it cannot find any other
plan.

"SET enable_seqscan = off" does not prevent the planner from generating
seqscan plans, it just adds 100000000.00 to the cost estimate.  That
will generally cause the planner to pick another plan --- if it can find
one.  In this case it evidently cannot find any indexscan alternative.
        regards, tom lane