Re: PG using index+filter instead only use index

Поиск
Список
Период
Сортировка
От Alexandre de Arruda Paes
Тема Re: PG using index+filter instead only use index
Дата
Msg-id fe86db181003191404i68a5521cs5976887abd51faa1@mail.gmail.com
обсуждение исходный текст
Ответ на Re: PG using index+filter instead only use index  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: PG using index+filter instead only use index  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
Hi Tom,

2010/3/19 Tom Lane <tgl@sss.pgh.pa.us>:
> Alexandre de Arruda Paes <adaldeia@gmail.com> writes:
>> My question: if the cost is exactly the same, why PG choose the index
>> ict13t2 on ct13t and apply a filter instead use the primary key ?
>
> Why shouldn't it, if the estimated costs are the same?  You didn't
> actually demonstrate they're the same though.
>
> The cost estimates look a bit unusual to me; are you using nondefault
> cost parameters, and if so what are they?
>
>                        regards, tom lane
>

The non default value in cost parameters is different only in
random_page_cost that are set to 2.5 and default_statistics_target set
to 300.
I set this parameters to defaults (4 and 100) and re-analyze the
tables but results are the same.

Some more info on another table with the same behavior (ANALYZE ok in
all tables):

client=# \d ct14t
          Table "public.ct14t"
   Column   |     Type      | Modifiers
------------+---------------+-----------
 ct14emp04  | integer       | not null
 ct03emp01  | integer       | not null
 ct03tradut | integer       | not null
 ct07emp01  | integer       | not null
 ct07c_cust | integer       | not null
 ct14ano    | integer       | not null
 ct14mes    | integer       | not null
 ct14debito | numeric(14,2) |
 ct14credit | numeric(14,2) |
 ct14orcado | numeric(14,2) |
Indexes:
    "ct14t_pkey" PRIMARY KEY, btree (ct14emp04, ct03emp01, ct03tradut,
ct07emp01, ct07c_cust, ct14ano, ct14mes) CLUSTER
    "ad_ict14t" btree (ct14emp04, ct03emp01, ct03tradut, ct07emp01,
ct07c_cust, ct14ano, ct14mes) WHERE ct14emp04 = 2 AND ct03emp01 = 2
AND ct07emp01 = 2
    "ict14t1" btree (ct07emp01, ct07c_cust)
    "ict14t2" btree (ct03emp01, ct03tradut)

client=# select ct07c_cust,count(*) from ct14t group by ct07c_cust
order by count(*) DESC;
 ct07c_cust | count
------------+-------
          0 | 55536
         99 | 14901
        107 |  3094
        800 |  1938
(...)


If I use any different value from '0' in the ct07c_cust field, the
planner choose the 'right' index:

client=# explain analyze SELECT ct14mes, ct14ano, ct07c_cust,
ct07emp01, ct03tradut, ct03emp01, ct14emp04, ct14debito, ct14credit
FROM ad_CT14T WHERE ct14emp04 = '2' AND ct03emp01 = '2' AND ct03tradut
= '14930' AND ct07emp01 = '2' AND ct07c_cust = '99' AND ct14ano =
'2003' AND ct14mes = '4';

   QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using ad_ict14t_1 on ad_ct14t  (cost=0.00..5.28 rows=1
width=42) (actual time=5.504..5.504 rows=0 loops=1)
   Index Cond: ((ct14emp04 = 2) AND (ct03emp01 = 2) AND (ct03tradut =
14930) AND (ct07emp01 = 2) AND (ct07c_cust = 99) AND (ct14ano = 2003)
AND (ct14mes = 4))
 Total runtime: 5.548 ms
(3 rows)



With '0' in the ct07c_cust field, they choose a more slow way:


client=# explain analyze SELECT ct14mes, ct14ano, ct07c_cust,
ct07emp01, ct03tradut, ct03emp01, ct14emp04, ct14debito, ct14credit
FROM CT14T WHERE ct14emp04 = '2' AND ct03emp01 = '2' AND ct03tradut =
'57393' AND ct07emp01 = '2' AND ct07c_cust = '0' AND ct14ano = '2002'
AND ct14mes = '5';
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Index Scan using ict14t1 on ct14t  (cost=0.00..5.32 rows=1 width=42)
(actual time=211.007..211.007 rows=0 loops=1)
   Index Cond: ((ct07emp01 = 2) AND (ct07c_cust = 0))
   Filter: ((ct14emp04 = 2) AND (ct03emp01 = 2) AND (ct03tradut =
57393) AND (ct14ano = 2002) AND (ct14mes = 5))
 Total runtime: 211.062 ms
(4 rows)


Again, if I create a table for test from this table (AD_CT14T) and
only create the index used in the first query plan, the results are ok
(ct07c_cust=0 / same query above):

client=# create table ad_ct14t as select * from ct14t;
SELECT
client=# create index ad_ict14t_abc on ad_ct14t(ct14emp04, ct03emp01,
ct03tradut, ct07emp01, ct07c_cust, ct14ano, ct14mes) where ct14emp04 =
'2' AND ct03emp01 = '2' AND ct07emp01 = '2';
CREATE
client=# explain analyze SELECT ct14mes, ct14ano, ct07c_cust,
ct07emp01, ct03tradut, ct03emp01, ct14emp04, ct14debito, ct14credit
FROM AD_CT14T WHERE ct14emp04 = '2' AND ct03emp01 = '2' AND ct03tradut
= '57393' AND ct07emp01 = '2' AND ct07c_cust = '0' AND ct14ano =
'2002' AND ct14mes = '5';

   QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using ad_ict14t_abc on ad_ct14t  (cost=0.00..5.28 rows=1
width=42) (actual time=0.043..0.043 rows=0 loops=1)
   Index Cond: ((ct14emp04 = 2) AND (ct03emp01 = 2) AND (ct03tradut =
57393) AND (ct07emp01 = 2) AND (ct07c_cust = 0) AND (ct14ano = 2002)
AND (ct14mes = 5))
 Total runtime: 0.091 ms
(3 rows)



I don't know why the planner prefer to use a less specific index
(ict14t1) and do a filter than use an index that matches with the
WHERE parameter...

Best regards,

Alexandre

В списке pgsql-performance по дате отправления:

Предыдущее
От: Yeb Havinga
Дата:
Сообщение: Re: GiST index performance
Следующее
От: Kenneth Marshall
Дата:
Сообщение: Re: GiST index performance