Re: best statistic target for boolean columns
От | Gregory Stark |
---|---|
Тема | Re: best statistic target for boolean columns |
Дата | |
Msg-id | 87k6uf7ceu.fsf@stark.xeocode.com обсуждение исходный текст |
Ответ на | best statistic target for boolean columns (Gaetano Mendola <mendola@bigfoot.com>) |
Ответы |
Re: best statistic target for boolean columns
|
Список | pgsql-performance |
> Gaetano, > > > don't you think the best statistic target for a boolean > > column is something like 2? Or in general the is useless > > have a statistics target > data type cardinality ? > > It depends, really, on the proportionality of the boolean values; if they're > about equal, I certainly wouldn't raise Stats from the default of 10. If, > however, it's very dispraportionate -- like 2% true and 98% false -- then it > may pay to have better statistics so that the planner doesn't assume 50% > hits, which it otherwise might. No, actually the stats table keeps the n most common values and their frequency (usually in percentage). So really a target of 2 ought to be enough for boolean values. In fact that's all I see in pg_statistic; I'm assuming there's a full histogram somewhere but I don't see it. Where would it be? However the target also dictates how large a sample of the table to take. A target of two represents a very small sample. So the estimations could be quite far off. I ran the experiment and for a table with 2036 false rows out of 204,624 the estimate was 1720. Not bad. But then I did vacuum full analyze and got an estimate of 688. Which isn't so good. -- greg
В списке pgsql-performance по дате отправления: