Re: Index Scan Costs versus Sort
От | Charlie Savage |
---|---|
Тема | Re: Index Scan Costs versus Sort |
Дата | |
Msg-id | 437398E3.2020900@interserv.com обсуждение исходный текст |
Ответ на | Re: Index Scan Costs versus Sort (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Index Scan Costs versus Sort
|
Список | pgsql-performance |
Hi Tom, From pg_stats: schema = "tiger"; tablename = "completechain"; attname = "tlid"; null_frac = 0; avg_width = 4; n_distinct = -1; most_common_vals = ; most_common_freqs = ; correlation = 0.155914; Note that I have default_statistics_target set to 100. Here is the first few values from histogram_bounds: "{102450,2202250,4571797,6365754,8444936,10541593,12485818,14545727,16745594,18421868,20300549,22498643,24114709,26301001,28280632,30370123,32253657,33943046,35898115,37499478,39469054,41868498,43992143,45907830,47826340,49843926,52051798,54409298,56447416, The tlid column is a US Census bureau ID assigned to each chain in the US - where a chain is a road segment, river segment, railroad segment, etc. The data is loaded on state-by-state basis, and then a county-by-county basis. There is no overall ordering to TLIDs, although perhaps there is some local ordering at the county level (but from a quick look at the data I don't see any, and the correlation factor indicates there isn't any if I am interpreting it correctly). Any other info that would be helpful to see? Charlie Tom Lane wrote: > Charlie Savage <cfis@interserv.com> writes: >> 1. Postgresql estimates the index scan will be 50 times more costly >> than the seq scan (112870376 vs 2229858) yet in fact it only takes 3 >> times longer to execute (2312426 s vs. 768403 s). My understanding is >> that postgresql assumes, via the random_page_cost parameter, that an >> index scan will take 4 times longer than a sequential scan. So why is >> the analyzer estimating it is 50 times slower? > > The other factors that are likely to affect this are index correlation > and effective cache size. It's fairly remarkable that a full-table > index scan only takes 3 times longer than a seqscan; you must have both > a high correlation and a reasonably large cache. You showed us your > effective_cache_size setting, but what's the pg_stats entry for > completechain.tlid contain? Can you quantify what the physical > ordering of tlid values is likely to be? > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 5: don't forget to increase your free space map settings >
В списке pgsql-performance по дате отправления: