Re: Simple join doesn't use index

Поиск
Список
Период
Сортировка
От Alex Vinnik
Тема Re: Simple join doesn't use index
Дата
Msg-id CALd8TVEqBETA41m2Ev+fbZMCjZAuhN=zQfHwTvMrihoWSTGeJQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Simple join doesn't use index  (Merlin Moncure <mmoncure@gmail.com>)
Ответы Re: Simple join doesn't use index  (Filip Rembiałkowski <plk.zuber@gmail.com>)
Re: Simple join doesn't use index  (Merlin Moncure <mmoncure@gmail.com>)
Re: Simple join doesn't use index  (Jeff Janes <jeff.janes@gmail.com>)
Re: Simple join doesn't use index  (Jeff Janes <jeff.janes@gmail.com>)
Список pgsql-performance
It sure turned out that default settings are not a good fit. Setting random_page_cost to 1.0 made query to run in 2.6 seconds and I clearly see that indexes are being used in explain plan and IO utilization is close to 0.

QUERY PLAN
Sort  (cost=969787.23..970288.67 rows=200575 width=8) (actual time=2176.045..2418.162 rows=241238 loops=1)
  Sort Key: visits.id, views.id
  Sort Method: external sort  Disk: 4248kB
  ->  Nested Loop  (cost=0.00..950554.81 rows=200575 width=8) (actual time=0.048..1735.357 rows=241238 loops=1)
        ->  Index Scan using visits_created_at_index on visits  (cost=0.00..5459.16 rows=82561 width=4) (actual time=0.032..178.591 rows=136021 loops=1)
              Index Cond: ((created_at >= '2012-12-15 00:00:00'::timestamp without time zone) AND (created_at < '2012-12-16 00:00:00'::timestamp without time zone))
        ->  Index Scan using views_visit_id_index on views  (cost=0.00..11.33 rows=12 width=8) (actual time=0.004..0.006 rows=2 loops=136021)
              Index Cond: (visit_id = visits.id)
Total runtime: 2635.169 ms

However I noticed that sorting is done using disk("external sort  Disk: 4248kB") which prompted me to take a look at work_mem. But it turned out that small increase to 4MB from default 1MB turns off index usage and query gets x10 slower. IO utilization jumped to 100% from literally nothing. so back to square one...

QUERY PLAN
Sort  (cost=936642.75..937144.19 rows=200575 width=8) (actual time=33200.762..33474.443 rows=241238 loops=1)
  Sort Key: visits.id, views.id
  Sort Method: external merge  Disk: 4248kB
  ->  Hash Join  (cost=6491.17..917410.33 rows=200575 width=8) (actual time=7156.498..32723.221 rows=241238 loops=1)
        Hash Cond: (views.visit_id = visits.id)
        ->  Seq Scan on views  (cost=0.00..832189.95 rows=8768395 width=8) (actual time=0.100..12126.342 rows=8200704 loops=1)
        ->  Hash  (cost=5459.16..5459.16 rows=82561 width=4) (actual time=353.683..353.683 rows=136021 loops=1)
              Buckets: 16384  Batches: 2 (originally 1)  Memory Usage: 4097kB
              ->  Index Scan using visits_created_at_index on visits  (cost=0.00..5459.16 rows=82561 width=4) (actual time=0.032..175.051 rows=136021 loops=1)
                    Index Cond: ((created_at >= '2012-12-15 00:00:00'::timestamp without time zone) AND (created_at < '2012-12-16 00:00:00'::timestamp without time zone))
Total runtime: 33698.000 ms

Basically PG is going through all views again and not using "Index Scan using views_visit_id_index on views". Looks like setting work_mem confuses planner somehow. Any idea what can be done to do sorting in memory. I suspect it should make query even more faster. Thanks -Alex



nothing wrong with that, but keep in mind you can tweak
'effective_cache_size' for a single session with 'set' command;

merlin

В списке pgsql-performance по дате отправления:

Предыдущее
От: Rosser Schwarz
Дата:
Сообщение: Re: Triggers and transactions
Следующее
От: Filip Rembiałkowski
Дата:
Сообщение: Re: Simple join doesn't use index