Re: [Gmod-schema] Re: performace problem after VACUUM ANALYZE

Поиск
Список
Период
Сортировка
От SLetovsky@aol.com
Тема Re: [Gmod-schema] Re: performace problem after VACUUM ANALYZE
Дата
Msg-id 27.394f051e.2b7eded5@aol.com
обсуждение исходный текст
Ответы Re: [Gmod-schema] Re: performace problem after VACUUM ANALYZE  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
Tom,
       Thanks for your help in this. We have some flexibility in the schema
design; we are going through our first PostGres performance testing now.
Am I correct in interpreting your comments as saying you believe that
if we could lose the OR and the strand constraint PG would probably
use the index properly? There is an alternative representation on the table
that would do that.

Cheers, -Stan

In a message dated 2/14/2003 6:22:15 PM Eastern Standard Time, tgl@sss.pgh.pa.us writes:

Subj: [Gmod-schema] Re: [PERFORM] performace problem after VACUUM ANALYZE
Date: 2/14/2003 6:22:15 PM Eastern Standard Time
From: tgl@sss.pgh.pa.us
To: cain@cshl.org
CC: pgsql-performance@postgresql.org, gmod-schema@lists.sourceforge.net
Sent from the Internet



Scott Cain <cain@cshl.org> writes:
>Here is the query that is causing the problems:

>select distinct f.name,fl.nbeg,fl.nend,fl.strand,f.type_id,f.feature_id
>   from feature f, featureloc fl
>   where
>    fl.srcfeature_id = 1 and
>    ((fl.strand=1  and fl.nbeg <= 393164 and fl.nend >= 390956) OR
>    (fl.strand=-1 and fl.nend <= 393164 and fl.nbeg >= 390956)) and
>    f.feature_id  = fl.feature_id

>[ and the index he'd like it to use is ]

>Index "featureloc_src_strand_beg_end"
>   Column    |   Type  
>---------------+----------
> srcfeature_id | integer
> strand     | smallint
> nbeg      | integer
> nend      | integer
>btree

After fooling with this I see a couple of problems.  One is the
same old cross-datatype-comparison issue that so frequently bites
people: "1" and "-1" are integer constants, and comparing them to
a smallint column isn't an indexable operation.  You need casts.
(Or, forget the "optimization" of storing strand as a smallint.
Depending on your row layout, it's quite likely saving you no space
anyway.)

Problem two is that the optimizer isn't smart enough to recognize that a
query condition laid out in this form should be processed as two
indexscans --- it would possibly have gotten it right if the first index
column had been inside the OR, but not this way.  The upshot is that
when you force it to use index featureloc_src_strand_beg_end, it's
actually only using the srcfeature_id column of the index --- which is
slow of course, and also explains why the optimizer doesn't find that
option very attractive.

I had some success in getting current sources to generate a desirable
plan by doing this:

regression=# explain select distinct *
regression-#  from feature f join featureloc fl on (f.feature_id  = fl.feature_id) where
regression-#   ((fl.srcfeature_id = 1 and fl.strand=1::int2  and fl.nbeg <= 393164 and fl.nend >= 390956) OR
regression(#   (fl.srcfeature_id = 1 and fl.strand=-1::int2 and fl.nend <= 393164 and fl.nbeg >= 390956));

Unique  (cost=34.79..34.85 rows=5 width=50)
  ->  Sort  (cost=34.79..34.80 rows=5 width=50)
     Sort Key: f.name, fl.nbeg, fl.nend, fl.strand
     ->  Hash Join  (cost=9.68..34.73 rows=5 width=50)
        Hash Cond: ("outer".feature_id = "inner".feature_id)
        ->  Seq Scan on feature f  (cost=0.00..20.00 rows=1000 width=36)
        ->  Hash  (cost=9.68..9.68 rows=1 width=14)
           ->  Index Scan using featureloc_src_strand_beg_end, featureloc_src_strand_beg_end on featureloc fl  (cost=0.00..9.68 rows=1 width=14)
              Index Cond: (((srcfeature_id = 1) AND (strand = 1::smallint) AND (nbeg <= 393164) AND (nend >= 390956)) OR ((srcfeature_id = 1) AND (strand = -1::smallint) AND (nbeg >= 390956) AND (nend <= 393164)))
              Filter: (((srcfeature_id = 1) AND (strand = 1::smallint) AND (nbeg <= 393164) AND (nend >= 390956)) OR ((srcfeature_id = 1) AND (strand = -1::smallint) AND (nend <= 393164) AND (nbeg >= 390956)))
(10 rows)

Shoving the join condition into an explicit JOIN clause is a hack, but
it nicely does the job of keeping the WHERE clause as a pristine
OR-of-ANDs structure, so that the optimizer can hardly fail to notice
that that's the preferred canonical form.

I would strongly recommend an upgrade to PG 7.3, both on general
principles and because you can actually see what the indexscan condition
is in EXPLAIN's output.  Before 7.3 you had to grovel through EXPLAIN
VERBOSE to be sure what was really happening with a multicolumn index.

      regards, tom lane


-------------------------------------------------------
This SF.NET email is sponsored by: FREE  SSL Guide from Thawte
are you planning your Web Server Security? Click here to get a FREE
Thawte SSL guide and find the answers to all your  SSL security issues.
http://ads.sourceforge.net/cgi-bin/redirect.pl?thaw0026en
_______________________________________________
Gmod-schema mailing list
Gmod-schema@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/gmod-schema


В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: performace problem after VACUUM ANALYZE
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [Gmod-schema] Re: performace problem after VACUUM ANALYZE