Обсуждение: strange fsm issues
The DB with the large objects that I had trouble dumping two weeks ago is now exhibiting some interesting fsm issues. The DB stores lots of large objects used for medical research statistics and the data is generally input during the day (9am-3pm pacific time) and evening (7pm-10pm pacific time). I noticed a fsm warning when vacuum verbose last week, so I had scheduled to increase max_fsm_pages to 50000. This was the warning I was receiving: Jun 20 09:22:58 newmars postgres[25754]: [2-2] HINT: Consider increasing the configuration parameter "max_fsm_pages" to a value over 42784. I increased the setting to 50000, restarted postgres and reran the vacuum verbose. I was greeted with the warning once again. :-( Jun 21 07:46:42 newmars postgres[4329]: [2-2] HINT: Consider increasing the configuration parameter "max_fsm_pages" to a value over 52128. Ok, I must not have increased it enough to accomodate the changes from yesterday to today...so, I increased it again to 60000 and re-ran the vacuum verbose: Jun 21 08:15:36 newmars postgres[4724]: [2-2] HINT: Consider increasing the configuration parameter "max_fsm_pages" to a value over 62608. What the heck? Nobody is accessing the DB but me....so I decided to just go overboard and set it to 100000. Changed it, restarted postgres, vacuum verbose: INFO: free space map contains 98441 pages in 125 relations DETAIL: A total of 100000 page slots are in use (including overhead). 102608 page slots are required to track all free space. Current limits are: 100000 page slots, 2000 relations, using 713 KB. NOTICE: number of page slots needed (102608) exceeds max_fsm_pages (100000) HINT: Consider increasing the configuration parameter "max_fsm_pages" to a value over 102608. Unfortunately my screen back buffer didn't have the other vacuum verbose outputs, so I had to pull the warnings out of the log file. Note that it's again exactly 2608 above the setting. That seems oddly coincidental. Any suggestions on this one? It's postgresql-8.1.4 compiled from the source tarball. Autovacuum is turned on and I'd love for it to be able to keep up. -- Jeff Frost, Owner <jeff@frostconsultingllc.com> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954
On Wed, 21 Jun 2006, Jeff Frost wrote: > The DB with the large objects that I had trouble dumping two weeks ago is now > exhibiting some interesting fsm issues. The DB stores lots of large objects > used for medical research statistics and the data is generally input during > the day (9am-3pm pacific time) and evening (7pm-10pm pacific time). I > noticed a fsm warning when vacuum verbose last week, so I had scheduled to > increase max_fsm_pages to 50000. This was the warning I was receiving: > > Jun 20 09:22:58 newmars postgres[25754]: [2-2] HINT: Consider increasing the > configuration parameter "max_fsm_pages" to a value over 42784. > > I increased the setting to 50000, restarted postgres and reran the vacuum > verbose. I was greeted with the warning once again. :-( > > Jun 21 07:46:42 newmars postgres[4329]: [2-2] HINT: Consider increasing the > configuration parameter "max_fsm_pages" to a value over 52128. > > Ok, I must not have increased it enough to accomodate the changes from > yesterday to today...so, I increased it again to 60000 and re-ran the vacuum > verbose: > > Jun 21 08:15:36 newmars postgres[4724]: [2-2] HINT: Consider increasing the > configuration parameter "max_fsm_pages" to a value over 62608. > > What the heck? Nobody is accessing the DB but me....so I decided to just go > overboard and set it to 100000. Changed it, restarted postgres, vacuum > verbose: > > INFO: free space map contains 98441 pages in 125 relations > DETAIL: A total of 100000 page slots are in use (including overhead). > 102608 page slots are required to track all free space. > Current limits are: 100000 page slots, 2000 relations, using 713 KB. > NOTICE: number of page slots needed (102608) exceeds max_fsm_pages (100000) > HINT: Consider increasing the configuration parameter "max_fsm_pages" to a > value over 102608. > > Unfortunately my screen back buffer didn't have the other vacuum verbose > outputs, so I had to pull the warnings out of the log file. > > Note that it's again exactly 2608 above the setting. That seems oddly > coincidental. Any suggestions on this one? It's postgresql-8.1.4 compiled > from the source tarball. Autovacuum is turned on and I'd love for it to be > able to keep up. So, I ran vacuumlo on the DB and it removed a few orphaned LOBs, but still vacuum verbose yields the same. Connected to vsl_cs Checking datafile in public.study_action_history Removed 15 large objects from vsl_cs. INFO: free space map contains 98443 pages in 125 relations DETAIL: A total of 100000 page slots are in use (including overhead). 102608 page slots are required to track all free space. Current limits are: 100000 page slots, 2000 relations, using 713 KB. NOTICE: number of page slots needed (102608) exceeds max_fsm_pages (100000) HINT: Consider increasing the configuration parameter "max_fsm_pages" to a value over 102608. The DB is actually in active use now but the FSM suggestion is still 102608. Very strange. -- Jeff Frost, Owner <jeff@frostconsultingllc.com> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954
On Wed, 21 Jun 2006, Jeff Frost wrote: >> Current limits are: 100000 page slots, 2000 relations, using 713 KB. >> NOTICE: number of page slots needed (102608) exceeds max_fsm_pages >> (100000) >> HINT: Consider increasing the configuration parameter "max_fsm_pages" to a >> value over 102608. >> >> Unfortunately my screen back buffer didn't have the other vacuum verbose >> outputs, so I had to pull the warnings out of the log file. >> So, I set it to 3366499. I came up with this number by: select sum(relpages) from pg_class where relkind in ('r','t'); and adding that up for all the DBs (there is just the one plus postgres and template1 and now I get th efollowing, so it appears this was actually a good settings. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: free space map contains 3339998 pages in 125 relations DETAIL: A total of 3341600 page slots are in use (including overhead). 3341600 page slots are required to track all free space. Current limits are: 3366499 page slots, 2000 relations, using 19853 KB. So then the question is..why was vacuum suggesting such low settings previously? -- Jeff Frost, Owner <jeff@frostconsultingllc.com> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954
On Jun 22, 2006, at 9:52 AM, Jeff Frost wrote: > So, I set it to 3366499. I came up with this number by: > select sum(relpages) from pg_class where relkind in ('r','t'); > and adding that up for all the DBs (there is just the one plus > postgres and template1 and now I get th efollowing, so it appears > this was actually a good settings. > > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: free space map contains 3339998 pages in 125 relations > DETAIL: A total of 3341600 page slots are in use (including > overhead). > 3341600 page slots are required to track all free space. > Current limits are: 3366499 page slots, 2000 relations, using > 19853 KB. > > So then the question is..why was vacuum suggesting such low > settings previously? Take a look at the source, but I think the issue is that it stops trying to keep track after some point. In any case, for this database I doubt it matters, since there's free space on every page... afaik any update will first try and add the new tuple on the current page, so as long as you have enough in the FSM to handle new inserts it won't matter. Of course, the fact that you have that much free space on every page makes me think it's not getting vacuumed enough... -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Thu, 22 Jun 2006, Jim Nasby wrote: >> CPU 0.00s/0.00u sec elapsed 0.00 sec. >> INFO: free space map contains 3339998 pages in 125 relations >> DETAIL: A total of 3341600 page slots are in use (including overhead). >> 3341600 page slots are required to track all free space. >> Current limits are: 3366499 page slots, 2000 relations, using 19853 KB. >> >> So then the question is..why was vacuum suggesting such low settings >> previously? > > In any case, for this database I doubt it matters, since there's free space > on every page... afaik any update will first try and add the new tuple on the > current page, so as long as you have enough in the FSM to handle new inserts > it won't matter. > > Of course, the fact that you have that much free space on every page makes me > think it's not getting vacuumed enough... Interestingly, it was just restored from a dump a week ago to do the 8.1.4 upgrade and autovacuum has been turned on with rather aggressive settings the entire time. Now that we've got a workable FSM pages number, we're going to vacuum full the db this weekend to get it back on track then watch the vacuum verbose output and see how it's going. pg_stat_database shows extremely limited activity: vsl_cs=# select * from pg_stat_database ; datid | datname | numbackends | xact_commit | xact_rollback | blks_read | blks_hit -------+-----------+-------------+-------------+---------------+-----------+---------- 10793 | postgres | 0 | 527 | 0 | 896 | 42774 1 | template1 | 0 | 525 | 0 | 869 | 41190 16404 | vsl_cs | 1 | 2305 | 0 | 3623144 | 3709983 only 2,305 commits since 7:30 a.m. this morning when the db was restarted. That's not too much..I wonder if having large objects makes for special vacuuming fun..or maybe autovacuum can't deal with large objects? I didn't find anything like that in the docs. -- Jeff Frost, Owner <jeff@frostconsultingllc.com> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954
On Thu, 22 Jun 2006, Jeff Frost wrote: > On Thu, 22 Jun 2006, Jim Nasby wrote: > >>> CPU 0.00s/0.00u sec elapsed 0.00 sec. >>> INFO: free space map contains 3339998 pages in 125 relations >>> DETAIL: A total of 3341600 page slots are in use (including overhead). >>> 3341600 page slots are required to track all free space. >>> Current limits are: 3366499 page slots, 2000 relations, using 19853 KB. So, now after setting it so high and seeing the above output, we ran vacuum verbose again today to check it and it looks like this now: CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: free space map contains 27804 pages in 125 relations DETAIL: A total of 29600 page slots are in use (including overhead). 29600 page slots are required to track all free space. Current limits are: 3366499 page slots, 2000 relations, using 19853 KB. What the heck? Why would it have shrunk down so much? The vacuum full isn't scheduled until tonight. :-/ -- Jeff Frost, Owner <jeff@frostconsultingllc.com> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954
On Jun 23, 2006, at 5:31 PM, Jeff Frost wrote: > On Thu, 22 Jun 2006, Jeff Frost wrote: >> On Thu, 22 Jun 2006, Jim Nasby wrote: >>>> CPU 0.00s/0.00u sec elapsed 0.00 sec. >>>> INFO: free space map contains 3339998 pages in 125 relations >>>> DETAIL: A total of 3341600 page slots are in use (including >>>> overhead). >>>> 3341600 page slots are required to track all free space. >>>> Current limits are: 3366499 page slots, 2000 relations, using >>>> 19853 KB. > > So, now after setting it so high and seeing the above output, we > ran vacuum verbose again today to check it and it looks like this now: > > CPU 0.00s/0.00u sec elapsed 0.00 sec. > INFO: free space map contains 27804 pages in 125 relations > DETAIL: A total of 29600 page slots are in use (including overhead). > 29600 page slots are required to track all free space. > Current limits are: 3366499 page slots, 2000 relations, using > 19853 KB. > > What the heck? Why would it have shrunk down so much? The vacuum > full isn't scheduled until tonight. :-/ Take a look at the source for autovacuum... I'm guessing you're right and that it doesn't make any special considerations for toast tables, which it probably should (since AFAIK you have to first vacuum the base table, then commit, then vacuum the toast table). BTW, tracking pg_class.relpages or actual file size over time would probably be helpful. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Tue, 27 Jun 2006, Jim Nasby wrote: >> What the heck? Why would it have shrunk down so much? The vacuum full >> isn't scheduled until tonight. :-/ > > Take a look at the source for autovacuum... I'm guessing you're right and > that it doesn't make any special considerations for toast tables, which it > probably should (since AFAIK you have to first vacuum the base table, then > commit, then vacuum the toast table). > > BTW, tracking pg_class.relpages or actual file size over time would probably > be helpful. Actually, it says this: /* * Scan pg_class and determine which tables to vacuum. * * The stats subsystem collects stats for toast tables independently of * the stats for their parent tables. We need to check those stats since * in cases with short, wide tables there might be proportionally much * more activity in the toast table than in its parent. * * Since we can only issue VACUUM against the parent table, we need to * transpose a decision to vacuum a toast table into a decision to vacuum * its parent. There's no point in considering ANALYZE on a toast table, * either. To support this, we keep a list of OIDs of toast tables that * need vacuuming alongside the list of regular tables. Regular tables * will be entered into the table list even if they appear not to need * vacuuming; we go back and re-mark them after finding all the vacuumable * toast tables. */ So I guess it does take toast tables into account. -- Jeff Frost, Owner <jeff@frostconsultingllc.com> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954
On Tue, Jun 27, 2006 at 08:02:33PM -0700, Jeff Frost wrote: > On Tue, 27 Jun 2006, Jim Nasby wrote: > > >>What the heck? Why would it have shrunk down so much? The vacuum full > >>isn't scheduled until tonight. :-/ > > > >Take a look at the source for autovacuum... I'm guessing you're right and > >that it doesn't make any special considerations for toast tables, which it > >probably should (since AFAIK you have to first vacuum the base table, then > >commit, then vacuum the toast table). > > > >BTW, tracking pg_class.relpages or actual file size over time would > >probably be helpful. > > Actually, it says this: > > /* > * Scan pg_class and determine which tables to vacuum. > * > * The stats subsystem collects stats for toast tables independently of > * the stats for their parent tables. We need to check those stats since > * in cases with short, wide tables there might be proportionally much > * more activity in the toast table than in its parent. > * > * Since we can only issue VACUUM against the parent table, we need to > * transpose a decision to vacuum a toast table into a decision to vacuum > * its parent. There's no point in considering ANALYZE on a toast table, > * either. To support this, we keep a list of OIDs of toast tables that > * need vacuuming alongside the list of regular tables. Regular tables > * will be entered into the table list even if they appear not to need > * vacuuming; we go back and re-mark them after finding all the vacuumable > * toast tables. > */ > > So I guess it does take toast tables into account. Only if it issues 2 vacuums on the base tables... -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461