Re: big select is resulting in a large amount of disk writing by kjournald
From | Joseph S
---|---
Subject | Re: big select is resulting in a large amount of disk writing by kjournald
Date |
Msg-id | hfp83f$1dcd$1@news.hub.org
In reply to | Re: big select is resulting in a large amount of disk writing by kjournald (Greg Smith <greg@2ndquadrant.com>)
List | pgsql-performance
Greg Smith wrote:
> Joseph S wrote:
>> Greg Smith wrote:
>>> Joseph S wrote:
>>>> So I run "select count(*) from large_table" and I see in xosview a
>>>> solid block of write activity. Runtime is 28125.644 ms for the first
>>>> run. The second run does not show a block of write activity and
>>>> takes 3327.441 ms
>>> http://wiki.postgresql.org/wiki/Hint_Bits
>>
>> Hmm. A large select results in a lot of writes? This seems broken.
>> And if we are setting these hint bits then what do we need VACUUM
>> for? Is there any way to tune this behavior? Is there any way to get
>> stats on how many rows/pages would need hint bits set?
>
> Basically, the idea is that if you're pulling a page in for something
> else that requires you to compute the hint bits, just do it now so
> VACUUM doesn't have to later, while you're in there anyway. Why make
> VACUUM do the work later if you're doing part of it now anyway? If you

Then why not do all the work that VACUUM does? What does VACUUM do anyway?

> reorganize your test to VACUUM first *before* running the "select (*)
> from...", you'll discover the writes during SELECT go away. You're
> running into the worst-case behavior. For example, if you inserted a
> bunch of things more slowly, you might discover that autovacuum would do
> this cleanup before you even got to looking at the data.

I think autovacuum did hit these tables after Slony copied them (I
remember seeing it running). Would the hint bits be set during a
reindex? For example, the indexing Slony does after the initial copy?
I'm not sure if Slony commits the transaction before it does the
reindex. It probably doesn't.

> downsides. This situation shows up a lot when you're benchmarking
> things, but not as much in the real world, so it's hard to justify
> improving.

Actually, I think I have been running into this situation. There were
many reports that ran much faster the second time around than the
first, and I assumed it was just because the data was in the memory
cache. Now I'm thinking I was running into this.
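For reference, Greg's suggestion from the thread can be sketched as the following psql session. This is only an illustration: `large_table` is the name used earlier in the thread, and the exact timings will vary; the point is that the first read-only scan of freshly loaded data dirties pages by setting hint bits, while a VACUUM run first absorbs those writes.

```sql
-- Illustrative sketch (table name from the thread; not a benchmark).
\timing on

-- Worst case: first SELECT after a bulk load sets hint bits,
-- dirtying pages that kjournald/the checkpointer must write out.
SELECT count(*) FROM large_table;

-- Doing the cleanup up front instead: VACUUM visits every page,
-- sets the hint bits, and does its other maintenance work...
VACUUM large_table;

-- ...so a subsequent SELECT is a pure read, with no write burst.
SELECT count(*) FROM large_table;
```

In the original test, the second SELECT was faster both because the data was cached and because the first SELECT had already paid the hint-bit cost.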