Re: Odd Row Estimates in Query Plan (rows=75)
| From | Adrian Klaver |
|---|---|
| Subject | Re: Odd Row Estimates in Query Plan (rows=75) |
| Date | |
| Msg-id | f7751be2-57f6-e75e-fbd3-8f38e51c8fbc@aklaver.com |
| In reply to | Odd Row Estimates in Query Plan (rows=75) (Don Seiler <don@seiler.us>) |
| Responses | Re: Odd Row Estimates in Query Plan (rows=75) (Don Seiler <don@seiler.us>) |
| List | pgsql-general |
On 08/15/2018 12:31 PM, Don Seiler wrote:
> PostgreSQL 9.6.6 on CentOS.
>
> We have a report query that has gone from maybe a few seconds to run to
> a few minutes to run since mid-July. Looking at the output of EXPLAIN
> ANALYZE, the row count estimates are way off, even though this table was
> just analyzed a day or so ago. What's more bizarre to me is that the row
> count estimate is *always* 75 for every node of the plan, where the
> actual rows is in the hundreds or thousands. This table is one of the
> busiest tables in our production database (many inserts and updates). It
> is autovacuumed and autoanalyzed a few times per week, although I'm
> looking to change it to a nightly manual schedule to avoid daytime
> autovacuums.
>
> Hash Join  (cost=1869142.34..1869146.15 rows=75 width=88) (actual time=179877.869..179878.011 rows=759 loops=1)
>   Hash Cond: (stores.pkey = lt.store_pkey)
>   Buffers: shared hit=1654593 read=331897 dirtied=249
>   ->  Seq Scan on stores  (cost=0.00..2.77 rows=77 width=22) (actual time=0.007..0.023 rows=78 loops=1)
>         Buffers: shared hit=2
>   ->  Hash  (cost=1869141.40..1869141.40 rows=75 width=50) (actual time=179877.847..179877.847 rows=759 loops=1)
>         Buckets: 1024  Batches: 1  Memory Usage: 73kB
>         Buffers: shared hit=1654591 read=331897 dirtied=249
>         ->  Subquery Scan on lt  (cost=1869138.59..1869141.40 rows=75 width=50) (actual time=179875.976..179877.697 rows=759 loops=1)
>               Buffers: shared hit=1654591 read=331897 dirtied=249
>               ->  GroupAggregate  (cost=1869138.59..1869140.65 rows=75 width=50) (actual time=179875.976..179877.606 rows=759 loops=1)
>                     Group Key: lts.store_pkey, lts.owner, (date_trunc('minute'::text, lts.date_gifted))
>                     Filter: (count(*) IS NOT NULL)
>                     Buffers: shared hit=1654591 read=331897 dirtied=249
>                     ->  Sort  (cost=1869138.59..1869138.78 rows=75 width=42) (actual time=179875.961..179876.470 rows=6731 loops=1)
>                           Sort Key: lts.store_pkey, lts.entry_source_owner, (date_trunc('minute'::text, lts.date_gifted))
>                           Sort Method: quicksort  Memory: 757kB
>                           Buffers: shared hit=1654591 read=331897 dirtied=249
>                           ->  Index Scan using gifts_date_added on gifts lts  (cost=0.56..1869136.25 rows=75 width=42) (actual time=190.657..179870.165 rows=6731 loops=1)
>                                 Index Cond: ((date_added > '2018-07-14 11:13:05'::timestamp without time zone) AND (date_added < '2018-08-13 14:14:21'::timestamp without time zone))
>                                 Filter: ((date_gifted >= '2018-08-13 11:13:05'::timestamp without time zone) AND (date_gifted < '2018-08-13 14:14:21'::timestamp without time zone))
>                                 Rows Removed by Filter: 938197
>                                 Buffers: shared hit=1654591 read=331897 dirtied=249
> Planning time: 0.426 ms
> Execution time: 179893.894 ms
>
> I don't have a version of this query from prior to this summer, but
> getting an explain plan for older data from older sandboxes shows a
> similar plan.

I don't have an answer, just a question: Can you provide the actual query
and the table schema?

> Sidenote: I am suggesting that an index be added on the date_gifted
> field as that is far more selective and avoids throwing rows away.
> However I'm very interested in why every node dealing with the gifts
> table thinks rows=75 when the actual is much, much higher. And 75 seems
> like too round of a number to be random?
>
> --
> Don Seiler
> www.seiler.us <http://www.seiler.us>

-- 
Adrian Klaver
adrian.klaver@aklaver.com
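[Editor's note: not part of the original thread, but a sketch of how one might check whether the planner statistics behind these rows=75 estimates are stale. Table and column names are taken from the plan above; the statistics-target value is illustrative, and the queries assume access to the production database.]

```sql
-- When were the statistics for gifts last refreshed, and how many
-- rows have changed since?
SELECT relname, last_analyze, last_autoanalyze,
       n_live_tup, n_mod_since_analyze
FROM pg_stat_user_tables
WHERE relname = 'gifts';

-- What does the planner currently believe about the filtered columns?
SELECT attname, n_distinct, null_frac, histogram_bounds
FROM pg_stats
WHERE tablename = 'gifts'
  AND attname IN ('date_added', 'date_gifted');

-- Refresh the statistics; optionally raise the sample size first
-- (1000 is an illustrative value, not a recommendation).
ALTER TABLE gifts ALTER COLUMN date_gifted SET STATISTICS 1000;
ANALYZE gifts;
```

If `last_analyze`/`last_autoanalyze` are recent and the histogram still does not cover mid-July onward, the estimates for the newest date range would be extrapolated, which could explain a flat, implausible row count.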
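[Editor's note: the index suggested in the sidenote could look like the following; the index name is hypothetical. `CONCURRENTLY` is used because the table is described as one of the busiest in production.]

```sql
-- Hypothetical DDL for the suggested index on date_gifted;
-- CONCURRENTLY builds it without blocking concurrent writes.
CREATE INDEX CONCURRENTLY gifts_date_gifted_idx
    ON gifts (date_gifted);
```

With this index in place, the planner could satisfy the narrow `date_gifted` predicate directly instead of scanning a month of `date_added` and discarding 938,197 rows in the filter step.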