Обсуждение: is this explain good or bad???

Поиск
Список
Период
Сортировка

is this explain good or bad???

От
Дата:
Hey guys,

I'm not sure if this is good or bad but when I want to have a record count
of a view it takes at least 75 seconds to complete (PII 1.4Ghz 256Mb RAM).
Also when I do a record count of one single table (same table but not
joined) it just takes a long time. Firt I want to make sure that the DB is
setup correctly before I jump to HW issues.


This is my sq_logfile table
explain select count(*) from sq_logfile;
Aggregate  (cost=35988.07..35988.07 rows=1 width=0) ->  Seq Scan on sq_logfile  (cost=0.00..33493.86 rows=997686
width=0)

This is ,y sq_flogfile view based on sq_flogfile
explain select count(*) from sq_flogile;
Aggregate  (cost=128282.68..128282.68 rows=1 width=40) ->  Hash Join  (cost=8.65..125788.46 rows=997686 width=40)
-> Hash Join  (cost=6.49..105832.58 rows=997686 width=36)             ->  Hash Join  (cost=4.75..88371.34 rows=997686
width=32)                  ->  Hash Join  (cost=3.69..68416.56 rows=997686
 
width=28)                         ->  Hash Join  (cost=2.58..50955.94 rows=997686
width=24)                               ->  Seq Scan on sq_logfile sl
(cost=0.00..33493.86 rows=997686 width=20)                               ->  Hash  (cost=2.26..2.26 rows=126 width=4)
                                 ->  Seq Scan on sq_contenttypes ct
 
(cost=0.00..2.26 rows=126 width=4)                         ->  Hash  (cost=1.09..1.09 rows=9 width=4)
           ->  Seq Scan on sq_requestmethods rm
 
(cost=0.00..1.09 rows=9 width=4)                   ->  Hash  (cost=1.05..1.05 rows=5 width=4)
-> Seq Scan on sq_hierarchycodes hc
 
(cost=0.00..1.05 rows=5 width=4)             ->  Hash  (cost=1.59..1.59 rows=59 width=4)                   ->  Seq Scan
onsq_resultcodes rc  (cost=0.00..1.59
 
rows=59 width=4)       ->  Hash  (cost=1.93..1.93 rows=93 width=4)             ->  Seq Scan on sq_clientaddrfqdn cafqdn
(cost=0.00..1.93
 
rows=93 width=4)



<------------------->
CREATE TABLE sq_logfile ( id SERIAL8, stime NUMERIC(14,3), tstime TIMESTAMP, duration INTEGER, client_addr_dotted INET,
client_addr_fqdn_idINTEGER DEFAULT 0 NOT NULL REFERENCES
 
sq_clientaddrfqdn (id), resultcode_id INTEGER DEFAULT 0 NOT NULL REFERENCES sq_resultcodes(id), requestsize INTEGER,
requestmethod_idINTEGER DEFAULT 0 NOT NULL REFERENCES sq_requestmethods
 
(id), url TEXT, rfc931 TEXT, hierarchycode_id INTEGER DEFAULT 0 NOT NULL REFERENCES sq_hierarchycodes
(id), hierarchycode TEXT, contenttype_id INTEGER DEFAULT 0 NOT NULL REFERENCES sq_contenttypes (id), PRIMARY KEY(id)
);

-- Everything is sorted bu date/time so we need a index???
CREATE INDEX idx_sq_logfile1 ON sq_logfile (tstime);
CREATE INDEX idx_sq_logfile2 ON sq_logfile (tstime, client_addr_dotted);
CREATE INDEX idx_sq_logfile3 ON sq_logfile (tstime, rfc931);



Re: is this explain good or bad???

От
Tomasz Myrta
Дата:
> explain select count(*) from sq_logfile;
Not too helpful.
Better choice is:
explain analyze select * from sq_logfile;

Your explains show that selecting from view is 4 times slower than 
selecting from a table (35988:128282). It is possible.
Anyway counting 1 million rows usualy takes a long time...

Regards,
Tomasz Myrta



Re: is this explain good or bad???

От
Дата:
Here are the explain analyze versions:


Best regards and thanx in advance
Ries


explain analyze select count(*) from sq_logfile;

NOTICE:  QUERY PLAN:
Aggregate  (cost=35988.07..35988.07 rows=1 width=0) (actual
time=71907.64..71907.64 rows=1 loops=1) ->  Seq Scan on sq_logfile  (cost=0.00..33493.86 rows=997686 width=0)
(actual time=12.90..46759.12 rows=997686 loops=1)
Total runtime: 71907.76 msec

explain analyze select count(*) from sq_flogile;
NOTICE:  QUERY PLAN:
Aggregate  (cost=128282.68..128282.68 rows=1 width=40) (actual
time=99338.92..99338.92 rows=1 loops=1) ->  Hash Join  (cost=8.65..125788.46 rows=997686 width=40) (actual
time=34.34..93123.02 rows=997686 loops=1)       ->  Hash Join  (cost=6.49..105832.58 rows=997686 width=36) (actual
time=23.94..78411.62 rows=997686 loops=1)             ->  Hash Join  (cost=4.75..88371.34 rows=997686 width=32)
(actual time=15.68..63115.86 rows=997686 loops=1)                   ->  Hash Join  (cost=3.69..68416.56 rows=997686
width=28) (actual time=12.56..45110.94 rows=997686 loops=1)                         ->  Hash Join  (cost=2.58..50955.94
rows=997686
width=24) (actual time=9.24..23160.42 rows=997686 loops=1)                               ->  Seq Scan on sq_logfile sl
(cost=0.00..33493.86 rows=997686 width=20) (actual time=5.72..11518.14
rows=997686 loops=1)                               ->  Hash  (cost=2.26..2.26 rows=126 width=4)
(actual time=3.46..3.46 rows=0 loops=1)                                     ->  Seq Scan on sq_contenttypes ct
(cost=0.00..2.26 rows=126 width=4) (actual time=2.88..3.17 rows=126 loops=1)                         ->  Hash
(cost=1.09..1.09rows=9 width=4) (actual
 
time=3.21..3.21 rows=0 loops=1)                               ->  Seq Scan on sq_requestmethods rm
(cost=0.00..1.09 rows=9 width=4) (actual time=3.16..3.19 rows=9 loops=1)                   ->  Hash  (cost=1.05..1.05
rows=5width=4) (actual
 
time=3.06..3.06 rows=0 loops=1)                         ->  Seq Scan on sq_hierarchycodes hc
(cost=0.00..1.05 rows=5 width=4) (actual time=3.04..3.05 rows=5 loops=1)             ->  Hash  (cost=1.59..1.59 rows=59
width=4)(actual
 
time=8.20..8.20 rows=0 loops=1)                   ->  Seq Scan on sq_resultcodes rc  (cost=0.00..1.59
rows=59 width=4) (actual time=7.93..8.07 rows=59 loops=1)       ->  Hash  (cost=1.93..1.93 rows=93 width=4) (actual
time=10.34..10.34 rows=0 loops=1)             ->  Seq Scan on sq_clientaddrfqdn cafqdn  (cost=0.00..1.93
rows=93 width=4) (actual time=9.92..10.13 rows=93 loops=1)
Total runtime: 99339.49 msec


-----Oorspronkelijk bericht-----
Van: Tomasz Myrta [mailto:jasiek@klaster.net]
Verzonden: vrijdag 19 september 2003 8:52
Aan: ries@jongert.nl
CC: pgsql-sql@postgresql.org
Onderwerp: Re: [SQL] is this explain good or bad???


> explain select count(*) from sq_logfile;
Not too helpful.
Better choice is:
explain analyze select * from sq_logfile;

Your explains show that selecting from view is 4 times slower than
selecting from a table (35988:128282). It is possible.
Anyway counting 1 million rows usualy takes a long time...

Regards,
Tomasz Myrta