Re: So, is COUNT(*) fast now?
От | Anssi Kääriäinen |
---|---|
Тема | Re: So, is COUNT(*) fast now? |
Дата | |
Msg-id | 4EAEA7EB.2090501@thl.fi обсуждение исходный текст |
Ответ на | Re: So, is COUNT(*) fast now? (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: So, is COUNT(*) fast now?
|
Список | pgsql-hackers |
On 10/31/2011 02:44 PM, Robert Haas wrote: > What I think you're probably measuring here (oprofile would tell us > for sure) is that once the size of the table goes beyond about half a > gigabyte, it will have more than one page in the visibility map. The > index-only scan code keeps the most recently used visibility map page > pinned to save on overhead, but if you're bouncing back and forth > between data in the first ~500MB of the table and data in the last > ~100MB, each switch will result in dropping the current pin and > getting a new one, which figures to be fairly expensive. With the > table is only a little over 500GB, you're probably only changing VM > pages every couple of tuples, but with a 6GB table just about every > tuple will switch to a new VM page. > > Now, maybe you're right and the CPU caches are the more significant > effect. But I wouldn't like to bet on it without seeing how much the > drop-and-get-new-pin operations are costing us. > Maybe I should have left the analysis part out of the post, I don't know the internals, so my analysis is likely to be wrong. Now that I think of it, claiming that the cache effect is 50% of the runtime is likely a little wrong... However the part about clustering being important is still correct. According to the test, you can get 50% overhead because of random access to the VM. Stupid question, but why not keep the whole VM pinned? - Anssi
В списке pgsql-hackers по дате отправления: