a wrong index choose when statistics is out of date

Поиск
Список
Период
Сортировка
От Andy Fan
Тема a wrong index choose when statistics is out of date
Дата
Msg-id 878r2zeqsp.fsf@163.com
обсуждение исходный текст
Ответы Re: a wrong index choose when statistics is out of date  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Re: a wrong index choose when statistics is out of date  (David Rowley <dgrowleyml@gmail.com>)
Список pgsql-hackers
The issue can be reproduced with the following steps:

create table x_events (.., created_at timestamp, a int, b int); 

create index idx_1 on t(created_at, a);
create index idx_2 on t(created_at, b);

query:
select * from t where create_at = current_timestamp and b = 1;

index (created_at, a) rather than (created_at, b) may be chosen for the
above query if the statistics think "create_at = current_timestamp" has
no rows, then both index are OK, actually it is true just because
statistics is out of date.

I just run into this again recently and have two new idea this time,
I'd like gather some feedback on this.

1. We can let the user define the column as the value is increased day by
   day. the syntax may be:

   ALTER TABLE x_events ALTER COLUMN created_at ALWAYS_INCREASED.

   then when a query like 'create_at op const', the statistics module can
   treat it as 'created_at = $1'. so the missing statistics doesn't make
   difference. Then I think the above issue can be avoided. 

   This is different from letting user using a PreparedStmt directly
   because it is possible that we always choose a custom plan, the
   easiest way to make this happen is we do a planning time partition 
   prune.

2. Use some AI approach to forecast the data it doesn't gather yet. The
   training stage may happen at analyze stage, take the above case for
   example, it may get a model like 'there are 100 rows per second for
   the time during 9:00 to 18:00 and there are 2 rows per seconds for
   other time range.
   
For now, I think option 1 may be easier to happen.

-- 
Best Regards
Andy Fan




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jelte Fennema-Nio
Дата:
Сообщение: Re: Make query cancellation keys longer
Следующее
От: "Zhijie Hou (Fujitsu)"
Дата:
Сообщение: RE: Synchronizing slots from primary to standby