Re: Query using SeqScan instead of IndexScan
От | Brendan Duddridge |
---|---|
Тема | Re: Query using SeqScan instead of IndexScan |
Дата | |
Msg-id | 95C1A48B-A76F-4F9B-B5EE-0BFAF8873067@clickspace.com обсуждение исходный текст |
Ответ на | Re: Query using SeqScan instead of IndexScan ("Jim C. Nasby" <jnasby@pervasive.com>) |
Ответы |
Re: Query using SeqScan instead of IndexScan
|
Список | pgsql-performance |
Hi Jim, from SELECT * FROM pg_stats WHERE tablename='table' AND attname='category_id' I find correlation on category_product for category_id is 0.643703 Would setting the index on category_id to be clustered help with this? Thanks, ____________________________________________________________________ Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com ClickSpace Interactive Inc. Suite L100, 239 - 10th Ave. SE Calgary, AB T2G 0V9 http://www.clickspace.com On Apr 1, 2006, at 8:32 AM, Jim C. Nasby wrote: > On Sat, Apr 01, 2006 at 11:23:37AM +1000, chris smith wrote: >> On 4/1/06, Brendan Duddridge <brendan@clickspace.com> wrote: >>> Hi Jim, >>> >>> I'm not quite sure what you mean by the correlation of category_id? >> >> It means how many distinct values does it have (at least that's my >> understanding of it ;) ). > > Your understanding is wrong. :) What you're discussing is n_distinct. > > http://www.postgresql.org/docs/8.1/interactive/view-pg-stats.html > > correlation: "Statistical correlation between physical row ordering > and > logical ordering of the column values. This ranges from -1 to +1. When > the value is near -1 or +1, an index scan on the column will be > estimated to be cheaper than when it is near zero, due to reduction of > random access to the disk. (This column is NULL if the column data > type > does not have a < operator.)" > > In other words, the following will have a correlation of 1: > > 1 > 2 > 3 > ... > 998 > 999 > 1000 > > And this is -1... > > 1000 > 999 > ... > 2 > 1 > > While this would have a very low correlation: > > 1 > 1000 > 2 > 999 > ... > > The lower the correlation, the more expensive an index scan is, > because > it's more random. As I mentioned, I believe that the current index > scan > cost estimator is flawed though, because it will bias heavily against > correlations that aren't close to 1 or -1. > > So, what does > > SELECT * FROM pg_stats WHERE tablename='table' AND > attname='category_id'; > > show? > -- > Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com > Pervasive Software http://pervasive.com work: 512-231-6117 > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > > ---------------------------(end of > broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org >
Вложения
В списке pgsql-performance по дате отправления: