Обсуждение: Query planner wants to use seq scan
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)
-> Seq Scan on external_sync_messages (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)
Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Rows Removed by Filter: 600140
Planning time: 0.490 ms
Execution time: 232.246 ms
(6 rows)
# set enable_seqscan = off;
SET
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)
-> Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)
Index Cond: (organization_id = 1612)
Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Planning time: 0.103 ms
Execution time: 0.052 ms
(6 rows)
# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');
attname | inherited | n_distinct | most_common_vals
-----------------+-----------+------------+------------------
handled_by | f | 3 | 3 +
| | | 236140 +
| | | 54413
organization_id | f | 22 | 1612 +
| | | 287 +
| | | 967 +
| | | 1223 +
| | | 1123 +
| | | 1930 +
| | | 841 +
| | | 1814 +
| | | 711 +
| | | 1513 +
| | | 1794 +
| | | 1246 +
| | | 1673 +
| | | 1552 +
| | | 1747 +
| | | 2611 +
| | | 2217 +
| | | 2448 +
| | | 2133 +
| | | 1861 +
| | | 2616 +
| | | 2796
status | f | 6 | ok +
| | | ignored +
| | | channel_error +
| | | in_progress +
| | | error +
| | | sent_to_proxy
(3 rows)
# select count(*) from external_sync_messages;
count
--------
992912
(1 row)
On 27.10.2015 12:35, Bertrand Paquet wrote:
> Hi all,
>
> We have a slow query. After analyzing, the planner decision seems to
> be discutable : the query is faster when disabling seqscan. See below
> the two query plan, and an extract from pg_stats.
>
> Any idea about what to change to help the planner ?
>
> An information which can be useful : the number on distinct value on
> organization_id is very very low, may be the planner does not known
> that, and take the wrong decision.
>
> Regards,
>
> Bertrand
>
> # explain analyze SELECT 1 AS one FROM "external_sync_messages"
> WHERE "external_sync_messages"."organization_id" = 1612 AND
> ("external_sync_messages"."status" NOT IN ('sent_to_proxy',
> 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS
> NULL LIMIT 1;
>
> QUERY PLAN
>
>
--------------------------------------------------------------------------------------------------------------------------------------------
>
> Limit (cost=0.00..12.39 rows=1 width=0) (actual
> time=232.212..232.213 rows=1 loops=1)
>
> -> Seq Scan on external_sync_messages (cost=0.00..79104.69
> rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)
>
> Filter: ((handled_by IS NULL) AND (organization_id = 1612)
> AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
>
> Rows Removed by Filter: 600140
>
> Planning time: 0.490 ms
>
> Execution time: 232.246 ms
>
> (6 rows)
>
> # set enable_seqscan = off;
>
> SET
>
> # explain analyze SELECT 1 AS one FROM "external_sync_messages"
> WHERE "external_sync_messages"."organization_id" = 1612 AND
> ("external_sync_messages"."status" NOT IN ('sent_to_proxy',
> 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS
> NULL LIMIT 1;
>
> QUERY PLAN
>
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Limit (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030
> rows=1 loops=1)
>
> -> Index Scan using
> index_external_sync_messages_on_organization_id on
> external_sync_messages (cost=0.42..251934.05 rows=6385 width=0)
> (actual time=0.028..0.028 rows=1 loops=1)
>
> Index Cond: (organization_id = 1612)
>
> Filter: ((handled_by IS NULL) AND ((status)::text <> ALL
> ('{sent_to_proxy,in_progress,ok}'::text[])))
>
> Planning time: 0.103 ms
>
> Execution time: 0.052 ms
>
> (6 rows)
>
> # SELECT attname, inherited,
> n_distinct, array_to_string(most_common_vals, E'\n') as
> most_common_vals FROM pg_stats WHERE tablename =
> 'external_sync_messages' and attname IN ('status', 'organization_id',
> 'handled_by');
>
> attname | inherited | n_distinct | most_common_vals
>
> -----------------+-----------+------------+------------------
>
> handled_by | f | 3 | 3 +
>
> | | | 236140 +
>
> | | | 54413
>
> organization_id | f | 22 | 1612 +
>
> | | | 287 +
>
> | | | 967 +
>
> | | | 1223 +
>
> | | | 1123 +
>
> | | | 1930 +
>
> | | | 841 +
>
> | | | 1814 +
>
> | | | 711 +
>
> | | | 1513 +
>
> | | | 1794 +
>
> | | | 1246 +
>
> | | | 1673 +
>
> | | | 1552 +
>
> | | | 1747 +
>
> | | | 2611 +
>
> | | | 2217 +
>
> | | | 2448 +
>
> | | | 2133 +
>
> | | | 1861 +
>
> | | | 2616 +
>
> | | | 2796
>
> status | f | 6 | ok +
>
> | | | ignored +
>
> | | | channel_error +
>
> | | | in_progress +
>
> | | | error +
>
> | | | sent_to_proxy
>
> (3 rows)
>
> # select count(*) from external_sync_messages;
>
> count
>
> --------
>
> 992912
>
> (1 row)
>
>
Hello, Bertrand!
May be statistics on external_sync_messages is wrong? i.e planner give
us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hello, Bertrand!On 27.10.2015 12:35, Bertrand Paquet wrote:Hi all,
We have a slow query. After analyzing, the planner decision seems to be discutable : the query is faster when disabling seqscan. See below the two query plan, and an extract from pg_stats.
Any idea about what to change to help the planner ?
An information which can be useful : the number on distinct value on organization_id is very very low, may be the planner does not known that, and take the wrong decision.
Regards,
Bertrand
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)
-> Seq Scan on external_sync_messages (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)
Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Rows Removed by Filter: 600140
Planning time: 0.490 ms
Execution time: 232.246 ms
(6 rows)
# set enable_seqscan = off;
SET
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)
-> Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)
Index Cond: (organization_id = 1612)
Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Planning time: 0.103 ms
Execution time: 0.052 ms
(6 rows)
# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');
attname | inherited | n_distinct | most_common_vals
-----------------+-----------+------------+------------------
handled_by | f | 3 | 3 +
| | | 236140 +
| | | 54413
organization_id | f | 22 | 1612 +
| | | 287 +
| | | 967 +
| | | 1223 +
| | | 1123 +
| | | 1930 +
| | | 841 +
| | | 1814 +
| | | 711 +
| | | 1513 +
| | | 1794 +
| | | 1246 +
| | | 1673 +
| | | 1552 +
| | | 1747 +
| | | 2611 +
| | | 2217 +
| | | 2448 +
| | | 2133 +
| | | 1861 +
| | | 2616 +
| | | 2796
status | f | 6 | ok +
| | | ignored +
| | | channel_error +
| | | in_progress +
| | | error +
| | | sent_to_proxy
(3 rows)
# select count(*) from external_sync_messages;
count
--------
992912
(1 row)
May be statistics on external_sync_messages is wrong? i.e planner give us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
What is the result ofYes, I have run VACUUM ANALYZE, no effect.Bertrand2015-10-27 12:08 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:Hello, Bertrand!On 27.10.2015 12:35, Bertrand Paquet wrote:Hi all,
We have a slow query. After analyzing, the planner decision seems to be discutable : the query is faster when disabling seqscan. See below the two query plan, and an extract from pg_stats.
Any idea about what to change to help the planner ?
An information which can be useful : the number on distinct value on organization_id is very very low, may be the planner does not known that, and take the wrong decision.
Regards,
Bertrand
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)
-> Seq Scan on external_sync_messages (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)
Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Rows Removed by Filter: 600140
Planning time: 0.490 ms
Execution time: 232.246 ms
(6 rows)
# set enable_seqscan = off;
SET
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)
-> Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)
Index Cond: (organization_id = 1612)
Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Planning time: 0.103 ms
Execution time: 0.052 ms
(6 rows)
# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');
attname | inherited | n_distinct | most_common_vals
-----------------+-----------+------------+------------------
handled_by | f | 3 | 3 +
| | | 236140 +
| | | 54413
organization_id | f | 22 | 1612 +
| | | 287 +
| | | 967 +
| | | 1223 +
| | | 1123 +
| | | 1930 +
| | | 841 +
| | | 1814 +
| | | 711 +
| | | 1513 +
| | | 1794 +
| | | 1246 +
| | | 1673 +
| | | 1552 +
| | | 1747 +
| | | 2611 +
| | | 2217 +
| | | 2448 +
| | | 2133 +
| | | 1861 +
| | | 2616 +
| | | 2796
status | f | 6 | ok +
| | | ignored +
| | | channel_error +
| | | in_progress +
| | | error +
| | | sent_to_proxy
(3 rows)
# select count(*) from external_sync_messages;
count
--------
992912
(1 row)
May be statistics on external_sync_messages is wrong? i.e planner give us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
select relname,n_live_tup,n_dead_tup, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables where relname='external_sync_messages' ?
-- Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
relname | n_live_tup | n_dead_tup | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze
------------------------+------------+------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------
external_sync_messages | 998105 | 11750 | 2015-10-26 20:15:17.484771+00 | 2015-10-02 15:04:25.944479+00 | 2015-10-26 20:15:19.465308+00 | 2015-10-22 12:24:26.947616+00
(1 row)
What is the result ofOn 27.10.2015 14:10, Bertrand Paquet wrote:Yes, I have run VACUUM ANALYZE, no effect.Bertrand2015-10-27 12:08 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:Hello, Bertrand!On 27.10.2015 12:35, Bertrand Paquet wrote:Hi all,
We have a slow query. After analyzing, the planner decision seems to be discutable : the query is faster when disabling seqscan. See below the two query plan, and an extract from pg_stats.
Any idea about what to change to help the planner ?
An information which can be useful : the number on distinct value on organization_id is very very low, may be the planner does not known that, and take the wrong decision.
Regards,
Bertrand
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)
-> Seq Scan on external_sync_messages (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)
Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Rows Removed by Filter: 600140
Planning time: 0.490 ms
Execution time: 232.246 ms
(6 rows)
# set enable_seqscan = off;
SET
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)
-> Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)
Index Cond: (organization_id = 1612)
Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Planning time: 0.103 ms
Execution time: 0.052 ms
(6 rows)
# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');
attname | inherited | n_distinct | most_common_vals
-----------------+-----------+------------+------------------
handled_by | f | 3 | 3 +
| | | 236140 +
| | | 54413
organization_id | f | 22 | 1612 +
| | | 287 +
| | | 967 +
| | | 1223 +
| | | 1123 +
| | | 1930 +
| | | 841 +
| | | 1814 +
| | | 711 +
| | | 1513 +
| | | 1794 +
| | | 1246 +
| | | 1673 +
| | | 1552 +
| | | 1747 +
| | | 2611 +
| | | 2217 +
| | | 2448 +
| | | 2133 +
| | | 1861 +
| | | 2616 +
| | | 2796
status | f | 6 | ok +
| | | ignored +
| | | channel_error +
| | | in_progress +
| | | error +
| | | sent_to_proxy
(3 rows)
# select count(*) from external_sync_messages;
count
--------
992912
(1 row)
May be statistics on external_sync_messages is wrong? i.e planner give us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
select relname,n_live_tup,n_dead_tup, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables where relname='external_sync_messages' ?-- Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
What is yours random_page_cost parameter in postgres config?relname | n_live_tup | n_dead_tup | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze
------------------------+------------+------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------
external_sync_messages | 998105 | 11750 | 2015-10-26 20:15:17.484771+00 | 2015-10-02 15:04:25.944479+00 | 2015-10-26 20:15:19.465308+00 | 2015-10-22 12:24:26.947616+00
(1 row)
2015-10-27 12:17 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:What is the result ofOn 27.10.2015 14:10, Bertrand Paquet wrote:Yes, I have run VACUUM ANALYZE, no effect.Bertrand2015-10-27 12:08 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:Hello, Bertrand!On 27.10.2015 12:35, Bertrand Paquet wrote:Hi all,
We have a slow query. After analyzing, the planner decision seems to be discutable : the query is faster when disabling seqscan. See below the two query plan, and an extract from pg_stats.
Any idea about what to change to help the planner ?
An information which can be useful : the number on distinct value on organization_id is very very low, may be the planner does not known that, and take the wrong decision.
Regards,
Bertrand
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)
-> Seq Scan on external_sync_messages (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)
Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Rows Removed by Filter: 600140
Planning time: 0.490 ms
Execution time: 232.246 ms
(6 rows)
# set enable_seqscan = off;
SET
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)
-> Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)
Index Cond: (organization_id = 1612)
Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Planning time: 0.103 ms
Execution time: 0.052 ms
(6 rows)
# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');
attname | inherited | n_distinct | most_common_vals
-----------------+-----------+------------+------------------
handled_by | f | 3 | 3 +
| | | 236140 +
| | | 54413
organization_id | f | 22 | 1612 +
| | | 287 +
| | | 967 +
| | | 1223 +
| | | 1123 +
| | | 1930 +
| | | 841 +
| | | 1814 +
| | | 711 +
| | | 1513 +
| | | 1794 +
| | | 1246 +
| | | 1673 +
| | | 1552 +
| | | 1747 +
| | | 2611 +
| | | 2217 +
| | | 2448 +
| | | 2133 +
| | | 1861 +
| | | 2616 +
| | | 2796
status | f | 6 | ok +
| | | ignored +
| | | channel_error +
| | | in_progress +
| | | error +
| | | sent_to_proxy
(3 rows)
# select count(*) from external_sync_messages;
count
--------
992912
(1 row)
May be statistics on external_sync_messages is wrong? i.e planner give us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
select relname,n_live_tup,n_dead_tup, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables where relname='external_sync_messages' ?-- Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
-- Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.
> Any idea about what to change to help the planner ?
Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.
If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.
regards, tom lane
show random_page_cost ;
random_page_cost
------------------
4
(1 row)
What is yours random_page_cost parameter in postgres config?On 27.10.2015 14:19, Bertrand Paquet wrote:relname | n_live_tup | n_dead_tup | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze
------------------------+------------+------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------
external_sync_messages | 998105 | 11750 | 2015-10-26 20:15:17.484771+00 | 2015-10-02 15:04:25.944479+00 | 2015-10-26 20:15:19.465308+00 | 2015-10-22 12:24:26.947616+00
(1 row)
2015-10-27 12:17 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:What is the result ofOn 27.10.2015 14:10, Bertrand Paquet wrote:Yes, I have run VACUUM ANALYZE, no effect.Bertrand2015-10-27 12:08 GMT+01:00 Alex Ignatov <a.ignatov@postgrespro.ru>:Hello, Bertrand!On 27.10.2015 12:35, Bertrand Paquet wrote:Hi all,
We have a slow query. After analyzing, the planner decision seems to be discutable : the query is faster when disabling seqscan. See below the two query plan, and an extract from pg_stats.
Any idea about what to change to help the planner ?
An information which can be useful : the number on distinct value on organization_id is very very low, may be the planner does not known that, and take the wrong decision.
Regards,
Bertrand
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..12.39 rows=1 width=0) (actual time=232.212..232.213 rows=1 loops=1)
-> Seq Scan on external_sync_messages (cost=0.00..79104.69 rows=6385 width=0) (actual time=232.209..232.209 rows=1 loops=1)
Filter: ((handled_by IS NULL) AND (organization_id = 1612) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Rows Removed by Filter: 600140
Planning time: 0.490 ms
Execution time: 232.246 ms
(6 rows)
# set enable_seqscan = off;
SET
# explain analyze SELECT 1 AS one FROM "external_sync_messages" WHERE "external_sync_messages"."organization_id" = 1612 AND ("external_sync_messages"."status" NOT IN ('sent_to_proxy', 'in_progress', 'ok')) AND "external_sync_messages"."handled_by" IS NULL LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..39.88 rows=1 width=0) (actual time=0.030..0.030 rows=1 loops=1)
-> Index Scan using index_external_sync_messages_on_organization_id on external_sync_messages (cost=0.42..251934.05 rows=6385 width=0) (actual time=0.028..0.028 rows=1 loops=1)
Index Cond: (organization_id = 1612)
Filter: ((handled_by IS NULL) AND ((status)::text <> ALL ('{sent_to_proxy,in_progress,ok}'::text[])))
Planning time: 0.103 ms
Execution time: 0.052 ms
(6 rows)
# SELECT attname, inherited, n_distinct, array_to_string(most_common_vals, E'\n') as most_common_vals FROM pg_stats WHERE tablename = 'external_sync_messages' and attname IN ('status', 'organization_id', 'handled_by');
attname | inherited | n_distinct | most_common_vals
-----------------+-----------+------------+------------------
handled_by | f | 3 | 3 +
| | | 236140 +
| | | 54413
organization_id | f | 22 | 1612 +
| | | 287 +
| | | 967 +
| | | 1223 +
| | | 1123 +
| | | 1930 +
| | | 841 +
| | | 1814 +
| | | 711 +
| | | 1513 +
| | | 1794 +
| | | 1246 +
| | | 1673 +
| | | 1552 +
| | | 1747 +
| | | 2611 +
| | | 2217 +
| | | 2448 +
| | | 2133 +
| | | 1861 +
| | | 2616 +
| | | 2796
status | f | 6 | ok +
| | | ignored +
| | | channel_error +
| | | in_progress +
| | | error +
| | | sent_to_proxy
(3 rows)
# select count(*) from external_sync_messages;
count
--------
992912
(1 row)
May be statistics on external_sync_messages is wrong? i.e planner give us rows=6385 but seq scan give us Rows Removed by Filter: 600140
Maybe you should recalc it by VACUUM ANALYZE it?
--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
select relname,n_live_tup,n_dead_tup, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze from pg_stat_user_tables where relname='external_sync_messages' ?-- Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company-- Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Le mardi 27 octobre 2015, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.
> Any idea about what to change to help the planner ?
Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.
If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.
regards, tom lane
Le mardi 27 octobre 2015, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.
> Any idea about what to change to help the planner ?
Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.
If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.
regards, tom lane
Hi tom,I did the test yesterday with an index on the three fields, and with a partial index on organization and status and where is null condition on handled. I saw no modification on query plan.May be I forgot to analyze vacuum after. I will retry tonight.I use a btree index. Is it the good solution, even with the In clause ?Regards,Bertrand
Le mardi 27 octobre 2015, Tom Lane <tgl@sss.pgh.pa.us> a écrit :Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.
> Any idea about what to change to help the planner ?
Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.
If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.
regards, tom lane
On 10/27/15 3:56 PM, Bertrand Paquet wrote: > Tonight, the index on the three field is used, may be my yesterday > vacuum updated stats. BTW, you can run just ANALYZE, which is *far* faster than a VACUUM on a large table. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Hello Bertrand once again!So,Tonight, the index on the three field is used, may be my yesterday vacuum updated stats.Thx you for your help.Regards,Bertrand
2015-10-27 18:33 GMT+01:00 Bertrand Paquet <bertrand.paquet@doctolib.fr>:Hi tom,I did the test yesterday with an index on the three fields, and with a partial index on organization and status and where is null condition on handled. I saw no modification on query plan.May be I forgot to analyze vacuum after. I will retry tonight.I use a btree index. Is it the good solution, even with the In clause ?Regards,Bertrand
Le mardi 27 octobre 2015, Tom Lane <tgl@sss.pgh.pa.us> a écrit :Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.
> Any idea about what to change to help the planner ?
Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.
If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.
regards, tom lane
What's your status? Does the plan changed after deploying three field index ?
-- Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Hello Bertrand once again!On 27.10.2015 23:56, Bertrand Paquet wrote:So,Tonight, the index on the three field is used, may be my yesterday vacuum updated stats.Thx you for your help.Regards,Bertrand
2015-10-27 18:33 GMT+01:00 Bertrand Paquet <bertrand.paquet@doctolib.fr>:Hi tom,I did the test yesterday with an index on the three fields, and with a partial index on organization and status and where is null condition on handled. I saw no modification on query plan.May be I forgot to analyze vacuum after. I will retry tonight.I use a btree index. Is it the good solution, even with the In clause ?Regards,Bertrand
Le mardi 27 octobre 2015, Tom Lane <tgl@sss.pgh.pa.us> a écrit :Bertrand Paquet <bertrand.paquet@doctolib.fr> writes:
> We have a slow query. After analyzing, the planner decision seems to be
> discutable : the query is faster when disabling seqscan. See below the two
> query plan, and an extract from pg_stats.
> Any idea about what to change to help the planner ?
Neither one of those plans is very good: you're just hoping that the
Filter condition will let a tuple through sooner rather than later.
If you care about the performance of this type of query, I'd consider
creating an index on (organization_id, status, handled_by) so that all
the conditions can be checked in the index.
regards, tom lane
What's your status? Does the plan changed after deploying three field index ?-- Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company