Re: [PERFORM] Querying with multicolumn index
От | Daniel Blanch Bataller |
---|---|
Тема | Re: [PERFORM] Querying with multicolumn index |
Дата | |
Msg-id | 36867AC4-DDD8-464E-A0F7-066B4C88300C@gmail.com обсуждение исходный текст |
Ответ на | Re: [PERFORM] Querying with multicolumn index (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Ответы |
Re: [PERFORM] Querying with multicolumn index
|
Список | pgsql-performance |
Hi all, Thomas is absolutely right, the distribution I synthetically made, had 6M records but very old, 9M old, as you can see ithad to skip 9M records before finding a suitable record using time index. EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM updates WHERE driver_id = 100 ORDER BY "time" DESC LIMIT 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------ Limit (cost=0.44..0.65 rows=1 width=36) (actual time=3827.807..3827.807 rows=1 loops=1) Buffers: shared hit=24592 read=99594 written=659 -> Index Scan Backward using updates_time_idx on updates (cost=0.44..1284780.53 rows=6064800 width=36) (actual time=3827.805..3827.805rows=1 loops=1) Filter: (driver_id = 100) Rows Removed by Filter: 9000000 Buffers: shared hit=24592 read=99594 written=659 Planning time: 0.159 ms Execution time: 3827.846 ms (8 rows) Here you have my tests where I was able to reproduce the problem using default settings on 9.6, 9.5 and 9.3. 9.6 and 9.5choose the wrong index, while 9.3 didn’t. (update: 9.5 didn’t fail last time) However when I tried to add more than one value with this strange distribution ~ 30% of distribution to one value the indexbad choice problem didn’t happen again in none of the different versions. I Hope this helps. Regards, Daniel Blanch. > El 10 dic 2016, a las 21:34, Tomas Vondra <tomas.vondra@2ndquadrant.com> escribió: > > Hi, > > On 12/10/2016 12:51 AM, Tom Lane wrote: >> Eric Jiang <eric@doublemap.com> writes: >>> I have a query that I *think* should use a multicolumn index, but >>> sometimes isn't, resulting in slow queries. >> >> I tried to duplicate this behavior, without success. Are you running >> with nondefault planner parameters? >> > > My guess is this is a case of LIMIT the matching rows are uniformly distributed in the input data. The planner likely concludesthat for a driver with a lot of data we'll find the first row using ix_updates_time very quickly, and that it willbe cheaper than inspecting the larger multi-column index. But imagine a driver with a lots of data long time ago. Thatbreaks the LIMIT fairly quickly. > > regards > > -- > Tomas Vondra http://www.2ndQuadrant.com > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services > > > -- > Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-performance
Вложения
В списке pgsql-performance по дате отправления: