Re: An "obvious" index not being used
От | Daniele Varrazzo |
---|---|
Тема | Re: An "obvious" index not being used |
Дата | |
Msg-id | 488222A7.1000202@develer.com обсуждение исходный текст |
Ответ на | Re: An "obvious" index not being used (Francisco Reyes <lists@stringsutils.com>) |
Ответы |
Re: An "obvious" index not being used
|
Список | pgsql-performance |
Francisco Reyes writes: > Daniele Varrazzo writes: > >> I suspect the foo.account_id statistical data are not used at all in >> query: the query planner can only estimate the number of accounts to >> look for, not > > You mentioned you bumped your default_statistics_target. > What did you increase it to? > My data sets are so "strange" that anything less than 350 gives many bad > plans. Not default_statistics_target: I used "ALTER TABLE SET STATISTICS" to change the stats only for the tables I was interested in, arriving up to 1000. I think the result is the same, but it was a red herring anyway: these stats couldn't be used at all in my query. In my problem I had 2 tables: a small one (accounts), a large one (foo). The way the query is written doesn't allow the stats from the large table to be used at all, unless the records from the small table are fetched. This is independent from the stats accuracy. What the planner does is to assume an even distribution in the data in the joined fields. The assumption is probably better than not having anything, but in my data set (where there were a bunch of accounts with many foo each,but many accounts with too little foo) this proved false. The stats can be used only if at planning time the planner knows what values to look for in the field: this is the reason for which, if the query is split in two parts, performances become acceptable. In this case we may fall in your situation: a data set may be "strange" and thus require an increase in the stats resolution. I can't remember if the default 10 was too low, but 100 was definitely enough for me. It would be nice if the planner could perform the "split query" optimization automatically, i.e. fetch records from small tables to plan the action on larger tables. But I suspect this doesn't fit at all in the current PostgreSQL query pipeline... or does it? -- Daniele Varrazzo - Develer S.r.l. http://www.develer.com
В списке pgsql-performance по дате отправления: