Re: Too few rows expected by Planner on partitioned tables
From | Justin Pryzby
---|---
Subject | Re: Too few rows expected by Planner on partitioned tables
Date |
Msg-id | 20200722144016.GV5748@telsasoft.com
In reply to | Too few rows expected by Planner on partitioned tables (Julian Wolf <julian.wolf@invenium.io>)
Responses | Re: Too few rows expected by Planner on partitioned tables; Re: Too few rows expected by Planner on partitioned tables
List | pgsql-performance
On Tue, Jul 21, 2020 at 01:09:22PM +0000, Julian Wolf wrote:
> Hello,
>
> A description of what you are trying to achieve and what results you expect:
> Our database is growing on a daily basis by about 2.5 million rows per table (2 at the moment). Because of that, we decided to partition the data, especially as we are pre-aggregating the data for weeks, months, quarters and years. Every aggregation is stored in a separate partition:
> ...
> Our problem is that the planner always predicts one row to be returned, although only a part of the primary key is queried. This problem rapidly makes performance infeasible - a query only involving a few days already takes dozens of seconds. All tables are analyzed and pg_stats looks reasonable IMHO.
...
>     PRIMARY KEY ( daterange, spatial_feature_id, visitor_profile_id, activity_type_combination_id,
>                   activity_chain_id),
...
> ) PARTITION BY LIST (daterange);

> schemaname         relname                        n_live_tup
> mobility_insights  location_statistics_y2019m03d  23569853
> mobility_insights  location_statistics_y2019m03w  19264373
> mobility_insights  location_statistics_y2019m03   18105295

> Aggregate (cost=2.79..2.80 rows=1 width=8) (actual time=143.073..143.073 rows=1 loops=1)
>   Buffers: shared hit=67334
>   ->  Index Scan using location_statistics_y2019m03w_pkey on location_statistics_y2019m03w st (cost=0.56..2.78 rows=1 width=8) (actual time=0.026..117.284 rows=516277 loops=1)
>         Index Cond: ((daterange = '[2019-03-04,2019-03-11)'::daterange) AND (spatial_feature_id = 12675))
>         Buffers: shared hit=67334

I guess this isn't actually the problem query, since it takes 143 ms and not dozens of seconds.

I don't know what the problem query is, but it might help to create a new index on spatial_feature_id, which could be scanned rather than scanning the unique index.

Also, if daterange *and* spatial_feature_id are always *both* included, then this might work:

postgres=# CREATE STATISTICS t_stats (mcv) ON daterange, spatial_feature_id FROM t; ANALYZE t;

--
Justin
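For reference, a minimal sketch of both suggestions, written against the partition that appears in the plan above. The object names are illustrative (the quoted text does not show the parent table's definition), and the mcv statistics kind assumes PostgreSQL 12 or later:

    -- 1) Secondary index on spatial_feature_id, so a filter on that column
    --    alone does not have to walk the five-column primary-key index.
    --    (Creating it on the partitioned parent instead would cascade to
    --    every partition.)
    CREATE INDEX ON location_statistics_y2019m03w (spatial_feature_id);

    -- 2) Extended statistics with an MCV list on the two correlated columns,
    --    so the planner no longer multiplies their per-column selectivities.
    CREATE STATISTICS location_statistics_y2019m03w_stats (mcv)
        ON daterange, spatial_feature_id
        FROM location_statistics_y2019m03w;

    -- Re-analyze so the new MCV list is populated and visible to the planner.
    ANALYZE location_statistics_y2019m03w;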