Обсуждение: Why would this slow the query down so much?
I have 3 tables that I am trying to join together:
------------------------------------------ Table "caturljoin"Attribute | Type |
-----------+---------+category | integer |url | integer |
Index: caturljoin_url caturljoin_cat
Table "stories" Attribute | Type |
-------------+------------------------+urn | integer |headline | character varying
|author | character varying |source | integer |story | text |added
| date |description | character varying |displayall | smallint |fullurl
|character varying(255) |publish | smallint |error | integer |sourceurl |
charactervarying(255) |sourcename | character varying(100) |rank | smallint |
Indices: stories_added, stories_source, stories_unique_story, stories_urn_key
Table "urllist" Attribute | Type |
--------------+------------------------+urn | integer |url | character varying(255)
|friendlyname| character varying(30) |homepage | character varying(255) |method | smallint
|script | character varying(20) |params | character varying(500) |collect | smallint
|section | smallint |index_script | character varying |regexp | character varying(100)
|baseurl | character varying(75) |
Index: urllist_urn
------------------------------------------
With the following SQL:
------------------------------------------
SELECT a.category, b.headline, b.added, c.friendlyname
FROM caturljoin as a INNER JOIN stories as b ON (a.url = b.source) INNER JOIN urllist as c ON (a.url =
d.urn)
WHERE a.category = 93 ORDER BY b.added DESC LIMIT 1;
------------------------------------------
The results of explain for the above are:
------------------------------------------
psql:scratch.sql:5: NOTICE: QUERY PLAN:
Limit (cost=1587.30..1587.30 rows=1 width=44) -> Sort (cost=1587.30..1587.30 rows=1 width=44) -> Merge Join
(cost=249.89..1587.29rows=1 width=44) -> Sort (cost=249.89..249.89 rows=409 width=28)
-> Nested Loop (cost=0.00..232.15 rows=409 width=28) -> Index Scan using caturljoin_cat on
caturljoin
a (cost=0.00..5.09 rows=7 width=8) -> Index Scan using stories_source on stories b
(cost=0.00..34.41 rows=29 width=20) -> Index Scan using urllist_urn on urllist c
(cost=0.00..1323.69 rows=505 width=16)
EXPLAIN
------------------------------------------
and as you might be able to guess the query takes an age to complete.
If I remove the table urllist from the query, I get a much better response:
------------------------------------------
psql:scratch.sql:4: NOTICE: QUERY PLAN:
Limit (cost=0.00..207.74 rows=1 width=28) -> Nested Loop (cost=0.00..84945.18 rows=409 width=28) -> Index
ScanBackward using stories_added on stories b
(cost=0.00..2310.04 rows=16149 width=20) -> Index Scan using caturljoin_url on caturljoin a
(cost=0.00..5.10 rows=1 width=8)
EXPLAIN
------------------------------------------
Currently the tables contain the following rows of data:
------------------------------------------
caturljoin: 653 rows
urllist: 505 rows
stories: 21554 rows
------------------------------------------
Can anyone tell me why the inclusion of urllist would slow it down so much,
and what can I do to improve the speed of the query?
Stuart Grimshaw <nospam@smgsystems.co.uk> writes:
> SELECT a.category, b.headline, b.added, c.friendlyname
> FROM caturljoin as a
> INNER JOIN stories as b ON (a.url = b.source)
> INNER JOIN urllist as c ON (a.url = d.urn)
> WHERE a.category = 93 ORDER BY b.added DESC LIMIT 1;
(I assume "d.urn" is a typo for "c.urn"...)
The query plan you show looks pretty reasonable if the planner's row
count estimates are in the right ballpark. How many caturljoin rows
have category = 93? How many stories rows will match each caturljoin
row? How many urllist rows ditto?
regards, tom lane
Stuart, > ------------------------------------------ > SELECT a.category, b.headline, b.added, c.friendlyname > FROM caturljoin as a > INNER JOIN stories as b ON (a.url = b.source) > INNER JOIN urllist as c ON (a.url = d.urn) > WHERE a.category = 93 ORDER BY b.added DESC LIMIT 1; > ------------------------------------------ Hmmm? Where did table "d" come from? If this is really your query, that's your problem. -Josh ______AGLIO DATABASE SOLUTIONS___________________________ Josh Berkus Complete informationtechnology josh@agliodbs.com and data management solutions (415) 565-7293 for law firms, small businesses fax 621-2533 and non-profit organizations. San Francisco
On Monday 15 October 2001 16:12 pm, Tom Lane wrote: > Stuart Grimshaw <nospam@smgsystems.co.uk> writes: > > SELECT a.category, b.headline, b.added, c.friendlyname > > FROM caturljoin as a > > INNER JOIN stories as b ON (a.url = b.source) > > INNER JOIN urllist as c ON (a.url = d.urn) > > WHERE a.category = 93 ORDER BY b.added DESC LIMIT 1; > > (I assume "d.urn" is a typo for "c.urn"...) > > The query plan you show looks pretty reasonable if the planner's row > count estimates are in the right ballpark. How many caturljoin rows > have category = 93? How many stories rows will match each caturljoin > row? How many urllist rows ditto? There are 194 rows in caturljoin where url = 93, 29806 rows in stories will match those 194 rows and only 1 row in urllist will match. -- | Stuart Grimshaw <stuart@footballnet.com> | Chief Operations Officer | Football Networks Ltd |- | t:07976 625221 | f:0870 7060260
On Tue, 16 Oct 2001 17:58:32 +0100 Stuart Grimshaw wrote: > On Monday 15 October 2001 16:12 pm, Tom Lane wrote: > > Stuart Grimshaw <nospam@smgsystems.co.uk> writes: > > > SELECT a.category, b.headline, b.added, c.friendlyname > > > FROM caturljoin as a > > > INNER JOIN stories as b ON (a.url = b.source) > > > INNER JOIN urllist as c ON (a.url = d.urn) > > > WHERE a.category = 93 ORDER BY b.added DESC LIMIT 1; > > > > (I assume "d.urn" is a typo for "c.urn"...) > > > > The query plan you show looks pretty reasonable if the planner's row > > count estimates are in the right ballpark. How many caturljoin rows > > have category = 93? How many stories rows will match each caturljoin > > row? How many urllist rows ditto? > > There are 194 rows in caturljoin where url = 93, 29806 rows in stories will > match those 194 rows and only 1 row in urllist will match. > If it's convenient, would you try to delete some indices of the "stories" table? the total number of sorts on theQUERY PLANmight decrease. However, this trial may be a vain effort. I can't expect the result of the QUERY PLAN. :-)The indices:"stories_source", "stories_unique_story", and "stories_urn_key" Regards, Masaru Sugawara