Обсуждение: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Alexandre Leclerc
Дата:
Hi all,
I'm sorry for the urgency of the question. (We have a customer whose DB is "down" since 36 hours and business operations are compromised. Thank you for your help.)
Background:
- PostgreSQL 8.1 on Windows Server
- The customer has disabled regular VACUUM jobs for backup to be taken, a year or two ago.
- They didn't told us (as far as I can remember).
- Wednesday morning at 10:55:50: database is shut down to avoid wraparound data loss in database *db*
What has been done:
- The message requested a VACUUM FULL so we stopped the postmaster and started postgres.exe to launch a VACUUM FULL.
- During the night an employee of our client has stop (CTRL+C) and restarted many many times the VACUUM FULL (trying to see the progress of it).
- So yesterday morning, knowing that we gave instructions to let the job go without interruptions, which they did.
- It worked for about 24 hours now, and we don't see the end of it. The DB folder is now 38 GB (original DB was probably around 7GB of real data - but these were the numbers two ago).
What can we do?
- 1.1 Can we stop the VACUUM FULL (CTRL+C in the shell) and start postmaster again? So the company can continue working and then continue the VACUUM FULL during the weekend?
- 1.2 Will the transactions to avoid warparound data loss be available (in part at least) even if we stop the vacuum?
- 2. Could we stop VACUUM FULL and simply restart postmaster and starting a normal VACUUM even if it's slow?
- 3. Is it possible to increase the transactions limit to something bigger as a temporary solution so that the customer can continue its work?
I feel the pain of my client and understand that they need access to their data, but I would not like to loose all the cleaning that has been done. They have couple tables that are couple GB in size. And if stopping the VACUUM FULL does not give access to available transactions before the warparound shut down security, well it give nothing to stop it: that would be worst.
I would need an expert's advice on the question.
Being very grateful for your help,
I'm sorry for the urgency of the question. (We have a customer whose DB is "down" since 36 hours and business operations are compromised. Thank you for your help.)
Background:
- PostgreSQL 8.1 on Windows Server
- The customer has disabled regular VACUUM jobs for backup to be taken, a year or two ago.
- They didn't told us (as far as I can remember).
- Wednesday morning at 10:55:50: database is shut down to avoid wraparound data loss in database *db*
What has been done:
- The message requested a VACUUM FULL so we stopped the postmaster and started postgres.exe to launch a VACUUM FULL.
- During the night an employee of our client has stop (CTRL+C) and restarted many many times the VACUUM FULL (trying to see the progress of it).
- So yesterday morning, knowing that we gave instructions to let the job go without interruptions, which they did.
- It worked for about 24 hours now, and we don't see the end of it. The DB folder is now 38 GB (original DB was probably around 7GB of real data - but these were the numbers two ago).
What can we do?
- 1.1 Can we stop the VACUUM FULL (CTRL+C in the shell) and start postmaster again? So the company can continue working and then continue the VACUUM FULL during the weekend?
- 1.2 Will the transactions to avoid warparound data loss be available (in part at least) even if we stop the vacuum?
- 2. Could we stop VACUUM FULL and simply restart postmaster and starting a normal VACUUM even if it's slow?
- 3. Is it possible to increase the transactions limit to something bigger as a temporary solution so that the customer can continue its work?
I feel the pain of my client and understand that they need access to their data, but I would not like to loose all the cleaning that has been done. They have couple tables that are couple GB in size. And if stopping the VACUUM FULL does not give access to available transactions before the warparound shut down security, well it give nothing to stop it: that would be worst.
I would need an expert's advice on the question.
Being very grateful for your help,
-- Alexandre Leclerc
Alexandre Leclerc wrote: > - 2. Could we stop VACUUM FULL and simply restart postmaster and > starting a normal VACUUM even if it's slow? This is what you want to do. VACUUM FULL is the slow way--much, much slower--and it is not needed to clean up from wraparound issues. Here's two more opinions on that: http://archives.postgresql.org/message-id/5224.1255142294@sss.pgh.pa.us And here's some notes about why you should avoid VACUUM FULL: http://wiki.postgresql.org/wiki/VACUUM_FULL (Note that you don't have a good CLUSTER implementation in your 8.1 server, so using that to clean up tables isn't an option) -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
"Kevin Grittner"
Дата:
Alexandre Leclerc <aleclerc@ipso.ca> wrote: > - PostgreSQL 8.1 on Windows Server That's not a supported environment. http://www.postgresql.org/about/news.865 They should really be looking at upgrading. > - The customer has disabled regular VACUUM jobs for backup to be > taken, a year or two ago. Ouch. Once you recover from the immediate emergency, you they need to institute a sane vacuum/analyze policy, probably using some combination of autovacuum and scheduled database vacuums. > - Wednesday morning at 10:55:50: database is shut down to avoid > wraparound data loss in database *db* > - The message requested a VACUUM FULL No. It didn't. It said "execute a full-database VACUUM". > so we stopped the postmaster and started postgres.exe to launch a > VACUUM FULL. It was a mistake to use FULL, since that can run for days, and will bloat indexes. It's almost never the right thing to do. > - During the night an employee of our client has stop (CTRL+C) > and restarted many many times the VACUUM FULL (trying to see the > progress of it). Each time that's done it will add bloat, making things worse. > - So yesterday morning, knowing that we gave instructions to let > the job go without interruptions, which they did. > - It worked for about 24 hours now, and we don't see the end of > it. The DB folder is now 38 GB (original DB was probably around > 7GB of real data - but these were the numbers two ago). Yeah, you're going to want to clean up all the bloat from these mis-steps, but you have more immediate issues. > - 2. Could we stop VACUUM FULL and simply restart postmaster and > starting a normal VACUUM even if it's slow? I would do that, but I'm not at all sure it would be safe for anyone to try to use the database before the VACUUM completes. Once the database VACUUM completes, they can use the database, but they're likely to notice it's a bit slow. In the first available maintenance window after that (e.g., a weekend), I would recommend that they do a pg_dump of the database and restore it, followed by VACUUM ANALYZE (again, not FULL). And then they should work out a plan for an upgrade to a supported version. > - 3. Is it possible to increase the transactions limit to > something bigger as a temporary solution so that the customer can > continue its work? No. -Kevin
Alexandre Leclerc <aleclerc@ipso.ca> writes: > *Background:* > - PostgreSQL 8.1 on Windows Server > - The customer has disabled regular VACUUM jobs for backup to be taken, > a year or two ago. > - They didn't told us (as far as I can remember). > - Wednesday morning at 10:55:50: database is shut down to avoid > wraparound data loss in database *db* > *What has been done:* > - The message requested a VACUUM FULL so we stopped the postmaster and > started postgres.exe to launch a VACUUM FULL. You misread it. You do NOT need a VACUUM FULL here, you should just run a plain VACUUM (across the whole database, ie not one per table). It will take a lot less time. After you get out of this, you really need to do something about upgrading from 8.1, or else moving the server off Windows. 8.1 on Windows is unsupported and is full of known problems for that platform. regards, tom lane
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Alexandre Leclerc
Дата:
Hi all, I might have a problem of a greater order, but I can't see how to get an answer. (Indeed the message didn't say anything about VACUUM FULL... I miss interpreted the message.) The messages says to VACUUM the database postgres. When I execute: postgres -D "D:\my\path" postgres VACUUM; I'm always getting: WARNING: db "template1" must be vacuumed within 999593 transactions HINT: To avoid... execute a full-database VACUUM in "template1" ... (repeated many times until 999568) Then I try the same in template1: postgres -D "D:\my\path" postgres VACUUM; WARNING: db "postgres" must be vacuumed within 999*** transactions HINT: To avoid... execute a full-database VACUUM in "postgres" I tried VACUUM FREEZE / VACUUM FULL... At some point I got: ERROR: xlog flush request AC/FBEEF148 is not satisfied --- flushed only to AC/FB9224A8 CONTEXT: writing block 0 of relation 1664/0/1214 WARNING: could not writing block 0 of 1664/0/1214 DETAIL: Multiple failures --- write error may be permanent. Now my customer tells me they had a power outage last Sunday. Their might be a HD problem (?). Is the best solution to dump all / recreate the DB from scratch? What else can I do? It looks like the vacuum command does not want to be execute. The other VACUUM is still progressing on the main database in another postgres.exe shell. (If it's the only solution, is it possible to migration from one DB directly to the other under windows... I don't know about the | command under "windows cmd".) Thank you for your help. Le 2010-04-16 10:46, Tom Lane a écrit : > Alexandre Leclerc<aleclerc@ipso.ca> writes: > >> *Background:* >> - PostgreSQL 8.1 on Windows Server >> - The customer has disabled regular VACUUM jobs for backup to be taken, >> a year or two ago. >> - They didn't told us (as far as I can remember). >> - Wednesday morning at 10:55:50: database is shut down to avoid >> wraparound data loss in database *db* >> > >> *What has been done:* >> - The message requested a VACUUM FULL so we stopped the postmaster and >> started postgres.exe to launch a VACUUM FULL. >> > You misread it. You do NOT need a VACUUM FULL here, you should just run > a plain VACUUM (across the whole database, ie not one per table). It > will take a lot less time. > > After you get out of this, you really need to do something about > upgrading from 8.1, or else moving the server off Windows. 8.1 on > Windows is unsupported and is full of known problems for that platform. > > regards, tom lane > > -- Alexandre Leclerc
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Alexandre Leclerc
Дата:
Hi again, I also want to mention that maybe I'm not doing it properly. I started "postgres.exe" and it is inside that "session", "backend>" prompt, that I did run the VACUUM command. Is it that way or should I use psql to connect to anything "postgres.exe" would have "done" (like listening to a port?). (And by the way, what is the command to quit the backend? I'm doing Ctrl+C but the DB does not like that...) Best regards, Alexandre Leclerc Le 2010-04-16 10:46, Tom Lane a écrit : > Alexandre Leclerc<aleclerc@ipso.ca> writes: > >> *Background:* >> - PostgreSQL 8.1 on Windows Server >> - The customer has disabled regular VACUUM jobs for backup to be taken, >> a year or two ago. >> - They didn't told us (as far as I can remember). >> - Wednesday morning at 10:55:50: database is shut down to avoid >> wraparound data loss in database *db* >> > >> *What has been done:* >> - The message requested a VACUUM FULL so we stopped the postmaster and >> started postgres.exe to launch a VACUUM FULL. >> > You misread it. You do NOT need a VACUUM FULL here, you should just run > a plain VACUUM (across the whole database, ie not one per table). It > will take a lot less time. > > After you get out of this, you really need to do something about > upgrading from 8.1, or else moving the server off Windows. 8.1 on > Windows is unsupported and is full of known problems for that platform. > > regards, tom lane > > -- Alexandre Leclerc
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
"Kevin Grittner"
Дата:
Alexandre Leclerc <aleclerc@ipso.ca> wrote: > At some point I got: > ERROR: xlog flush request AC/FBEEF148 is not satisfied --- flushed > only to AC/FB9224A8 > CONTEXT: writing block 0 of relation 1664/0/1214 > WARNING: could not writing block 0 of 1664/0/1214 > DETAIL: Multiple failures --- write error may be permanent. You're not running out of disk space where that writes, are you? > It looks like the vacuum command does not want to be execute. The > other VACUUM is still progressing on the main database in another > postgres.exe shell. Wait -- are you saying you're running two postgres instances against the same data directory at the same time? (I sure hope not.) > (If it's the only solution, is it possible to migration from one > DB directly to the other under windows... I don't know about the | > command under "windows cmd".) Yeah, that should work as long as you have the disk space for both copies. -Kevin
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
"Kevin Grittner"
Дата:
Alexandre Leclerc <aleclerc@ipso.ca> wrote: > I also want to mention that maybe I'm not doing it properly. > > I started "postgres.exe" and it is inside that "session", > "backend>" prompt, that I did run the VACUUM command. Is it that > way Yes, that's the single-user mode. Just don't run more than one with the same -D value at the same time; more than one single-user backend would not be good. > (And by the way, what is the command to quit the backend? I'm > doing Ctrl+C but the DB does not like that...) In Linux it would be Ctrl+D but in Windows I think it's Ctrl+Z -Kevin
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Alexandre Leclerc
Дата:
Le 2010-04-16 14:18, Kevin Grittner a écrit : > Alexandre Leclerc<aleclerc@ipso.ca> wrote: > > >> At some point I got: >> ERROR: xlog flush request AC/FBEEF148 is not satisfied --- flushed >> only to AC/FB9224A8 >> CONTEXT: writing block 0 of relation 1664/0/1214 >> WARNING: could not writing block 0 of 1664/0/1214 >> DETAIL: Multiple failures --- write error may be permanent. >> > > You're not running out of disk space where that writes, are you? > There is 32 GB free. > > >> It looks like the vacuum command does not want to be execute. The >> other VACUUM is still progressing on the main database in another >> postgres.exe shell. >> > > Wait -- are you saying you're running two postgres instances against > the same data directory at the same time? (I sure hope not.) > I did. :( Shame on me. I just realised while reading doc on postgres that it is not made for that but only for a single instance at the time. I hope I did not break anything. >> (If it's the only solution, is it possible to migration from one >> DB directly to the other under windows... I don't know about the | >> command under "windows cmd".) >> > > Yeah, that should work as long as you have the disk space for both > copies. > > -Kevin > Thank you Kevin for you time, it's most appreciated. Right now I'll leave the standard VACUUM complete it's job. Then I'll vacuum the postgres and template1 DB if required. Best regards, -- Alexandre Leclerc
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
"Joshua D. Drake"
Дата:
On Fri, 2010-04-16 at 14:47 -0400, Alexandre Leclerc wrote: > I did. :( Shame on me. I just realised while reading doc on postgres > that it is not made for that but only for a single instance at the time. > I hope I did not break anything. How in the world did you pull that off? PostgreSQL checks for that. Either way, if you actually managed to start two services against the same data directory, I hope you have a backup, you can restore from. Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564 Consulting, Training, Support, Custom Development, Engineering
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
"Kevin Grittner"
Дата:
"Joshua D. Drake" <jd@commandprompt.com> wrote: > if you actually managed to start two services against the > same data directory, I hope you have a backup, you can restore > from. This is 8.1 under Windows, and he connected to a different database with each backend. He got errors writing the WAL files, and it apparently wouldn't let him start a second VACUUM on the other database. I'm hoping that the initial VACUUM (of the big database) can continue and the WAL problems will cycle out without corrupting anything. Is that overly optimistic? It did occur to me that maybe the manual (and maybe event the error's hint) should say right up front that once the service stops a file-based copy of the database should be made (if at all possible) before proceeding to attempt recovery. Also, the "full-database vacuum" terminology seems too likely to be interpreted as VACUUM FULL for best results. Perhaps it's worth changing that to just "database vacuum" or "vacuum of the entire database"? -Kevin
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Scott Marlowe
Дата:
On Fri, Apr 16, 2010 at 12:47 PM, Alexandre Leclerc <aleclerc@ipso.ca> wrote: > Le 2010-04-16 14:18, Kevin Grittner a écrit : >> >> Alexandre Leclerc<aleclerc@ipso.ca> wrote: >> >> >>> >>> At some point I got: >>> ERROR: xlog flush request AC/FBEEF148 is not satisfied --- flushed >>> only to AC/FB9224A8 >>> CONTEXT: writing block 0 of relation 1664/0/1214 >>> WARNING: could not writing block 0 of 1664/0/1214 >>> DETAIL: Multiple failures --- write error may be permanent. >>> >> >> You're not running out of disk space where that writes, are you? >> > > There is 32 GB free. > >> >> >>> >>> It looks like the vacuum command does not want to be execute. The >>> other VACUUM is still progressing on the main database in another >>> postgres.exe shell. >>> >> >> Wait -- are you saying you're running two postgres instances against >> the same data directory at the same time? (I sure hope not.) >> > > I did. :( Shame on me. I just realised while reading doc on postgres that it > is not made for that but only for a single instance at the time. I hope I > did not break anything. You've almost certainly corrupted the data store. This is why you should always make a complete file system backup with the postmaster stopped before you start doing rescue work.
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Alexandre Leclerc
Дата:
Le 2010-04-16 15:20, Scott Marlowe a écrit : > On Fri, Apr 16, 2010 at 12:47 PM, Alexandre Leclerc<aleclerc@ipso.ca> wrote: > >> Le 2010-04-16 14:18, Kevin Grittner a écrit : >> >>> Alexandre Leclerc<aleclerc@ipso.ca> wrote: >>> >>> >>> >>>> At some point I got: >>>> ERROR: xlog flush request AC/FBEEF148 is not satisfied --- flushed >>>> only to AC/FB9224A8 >>>> CONTEXT: writing block 0 of relation 1664/0/1214 >>>> WARNING: could not writing block 0 of 1664/0/1214 >>>> DETAIL: Multiple failures --- write error may be permanent. >>>> >>>> >>> You're not running out of disk space where that writes, are you? >>> >>> >> There is 32 GB free. >> >> >>> >>> >>>> It looks like the vacuum command does not want to be execute. The >>>> other VACUUM is still progressing on the main database in another >>>> postgres.exe shell. >>>> >>>> >>> Wait -- are you saying you're running two postgres instances against >>> the same data directory at the same time? (I sure hope not.) >>> >>> >> I did. :( Shame on me. I just realised while reading doc on postgres that it >> is not made for that but only for a single instance at the time. I hope I >> did not break anything. >> > You've almost certainly corrupted the data store. This is why you > should always make a complete file system backup with the postmaster > stopped before you start doing rescue work. > > Hi Scott and Kevin, Thank you for these comments. But I'm learning the hard way right now since all this has already been done. I appreaciate the suggestions of Kevin on the terminology (since I'm not a native English speaking guy) and complementing the help. I can assure that it would have been clear if worded like that, for me at least. Hopefully, our customer is supposed to have a full file backup from the evening. So they would have loose only 3 hours of work in that respect. So the plan is that after the vacuum if the DB is not working well, we start from the backup, 3 hours before the problem, and we run a VACUUM using pgAdmin with the normal postmaster during the weekend. Again, thank you very much for all your time and help. It really is helping and supportive for me (even if I'm making mistakes). Best regards, -- Alexandre Leclerc
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: > "Joshua D. Drake" <jd@commandprompt.com> wrote: >> if you actually managed to start two services against the >> same data directory, I hope you have a backup, you can restore >> from. > This is 8.1 under Windows, and he connected to a different database > with each backend. He got errors writing the WAL files, and it > apparently wouldn't let him start a second VACUUM on the other > database. I'm hoping that the initial VACUUM (of the big database) > can continue and the WAL problems will cycle out without corrupting > anything. Is that overly optimistic? Maybe, but if he doesn't have a recent backup then that's probably the best thing to try. I'm not actually sure how he would've started two standalone backends though --- there *is* an interlock against that, just as there is for two postmasters in the same data directory. Maybe if he was bullheaded enough to remove the lock file manually :-( > Also, the > "full-database vacuum" terminology seems too likely to be > interpreted as VACUUM FULL for best results. Perhaps it's worth > changing that to just "database vacuum" or "vacuum of the entire > database"? We did change that ... http://archives.postgresql.org/pgsql-committers/2008-12/msg00096.php regards, tom lane
Kevin Grittner wrote: > Also, the "full-database vacuum" terminology seems too likely to be > interpreted as VACUUM FULL for best results. Perhaps it's worth > changing that to just "database vacuum" or "vacuum of the entire > database" http://archives.postgresql.org/pgsql-committers/2008-12/msg00096.php It's "database-wide" now instead of "full-database". -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
"Kevin Grittner"
Дата:
Alexandre Leclerc <aleclerc@ipso.ca> wrote: > our customer is supposed to have a full file backup from the > evening. That's very good news, but given that they've not been going "by the book" in all respects, it pays to be cautious here. Did they make the copy while the database service was shut down? If not, did they follow all the rules for a PITR backup? http://www.postgresql.org/docs/8.1/interactive/backup.html In any event, take great care that the backup is not overwritten, deleted, or *moved* to be used for recovery. You really want to protect that and keep it until the database has been up and running for a few weeks, at least. -Kevin
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Alexandre Leclerc
Дата:
Le 2010-04-16 15:44, Tom Lane a écrit : > "Kevin Grittner"<Kevin.Grittner@wicourts.gov> writes: > >> "Joshua D. Drake"<jd@commandprompt.com> wrote: >> >>> if you actually managed to start two services against the >>> same data directory, I hope you have a backup, you can restore >>> from. >>> > > >> This is 8.1 under Windows, and he connected to a different database >> with each backend. He got errors writing the WAL files, and it >> apparently wouldn't let him start a second VACUUM on the other >> database. I'm hoping that the initial VACUUM (of the big database) >> can continue and the WAL problems will cycle out without corrupting >> anything. Is that overly optimistic? >> > Maybe, but if he doesn't have a recent backup then that's probably the > best thing to try. I'm not actually sure how he would've started two > standalone backends though --- there *is* an interlock against that, > just as there is for two postmasters in the same data directory. > Maybe if he was bullheaded enough to remove the lock file manually :-( > > The backup should work ok. The postmaster was closed every night for file-backup. The vacuum raised a "max_fsm_pages" of 142000 not enought and stopped. Is increasing the number enought to have it continue or other parameters are required? (Or is there a way in 8.1 to increate the memory for maintenance?) (Is there a quick hint to calculate the size required?) Spec of the Server: - Windows Server 2003 / 32 bits - 3 GB ram (Now I understand why an initial DB of 6 GB is now 38 GB: vacuuming has been stopped and space wasted since!) As a side question, is it possible to make a pg_dumpall on a DB that could have been potentially damaged by the two postgres.exe executions at the same time? (We did play arround with file read-only state in the /base folder but not in this purpose: it was to make sure the DB was not read only. Maybe the error message arrived after this manipulation, I can't remember. But yes the two postgres program executed on the same "base" folder, but not the same DB.) Maybe our best solution is start over from the backup. >> Also, the >> "full-database vacuum" terminology seems too likely to be >> interpreted as VACUUM FULL for best results. Perhaps it's worth >> changing that to just "database vacuum" or "vacuum of the entire >> database"? >> > We did change that ... > http://archives.postgresql.org/pgsql-committers/2008-12/msg00096.php > > That is great. -- Alexandre Leclerc
Alexandre Leclerc <aleclerc@ipso.ca> writes: > I'm always getting: > WARNING: db "template1" must be vacuumed within 999593 transactions > HINT: To avoid... execute a full-database VACUUM in "template1" > ... (repeated many times until 999568) Yeah, I think you will get that bleat once per table processed, until you've resolved the problem (which will likely take vacuuming each database in the installation). Don't sweat it. regards, tom lane
You could temporarily increase the fsm size in the postgres configuration so as to be able to properly map all the free space. I think you're going to do a dump/restore in due course in order to return the database to something like it's normal size, at which point (if you're RAM constrained) you might want to revert that parameter. Cheers, Robin On Fri, 16 Apr 2010 15:56:03 -0400, Alexandre Leclerc <aleclerc@ipso.ca> wrote: > Le 2010-04-16 15:44, Tom Lane a écrit : >> "Kevin Grittner"<Kevin.Grittner@wicourts.gov> writes: >> >>> "Joshua D. Drake"<jd@commandprompt.com> wrote: >>> >>>> if you actually managed to start two services against the >>>> same data directory, I hope you have a backup, you can restore >>>> from. >>>> >> >> >>> This is 8.1 under Windows, and he connected to a different database >>> with each backend. He got errors writing the WAL files, and it >>> apparently wouldn't let him start a second VACUUM on the other >>> database. I'm hoping that the initial VACUUM (of the big database) >>> can continue and the WAL problems will cycle out without corrupting >>> anything. Is that overly optimistic? >>> >> Maybe, but if he doesn't have a recent backup then that's probably the >> best thing to try. I'm not actually sure how he would've started two >> standalone backends though --- there *is* an interlock against that, >> just as there is for two postmasters in the same data directory. >> Maybe if he was bullheaded enough to remove the lock file manually :-( >> >> > > The backup should work ok. The postmaster was closed every night for > file-backup. > > The vacuum raised a "max_fsm_pages" of 142000 not enought and stopped. > > Is increasing the number enought to have it continue or other parameters > are required? (Or is there a way in 8.1 to increate the memory for > maintenance?) (Is there a quick hint to calculate the size required?) > > Spec of the Server: > - Windows Server 2003 / 32 bits > - 3 GB ram > > (Now I understand why an initial DB of 6 GB is now 38 GB: vacuuming has > been stopped and space wasted since!) > > As a side question, is it possible to make a pg_dumpall on a DB that > could have been potentially damaged by the two postgres.exe executions > at the same time? (We did play arround with file read-only state in the > /base folder but not in this purpose: it was to make sure the DB was not > read only. Maybe the error message arrived after this manipulation, I > can't remember. But yes the two postgres program executed on the same > "base" folder, but not the same DB.) > > Maybe our best solution is start over from the backup. > >>> Also, the >>> "full-database vacuum" terminology seems too likely to be >>> interpreted as VACUUM FULL for best results. Perhaps it's worth >>> changing that to just "database vacuum" or "vacuum of the entire >>> database"? >>> >> We did change that ... >> http://archives.postgresql.org/pgsql-committers/2008-12/msg00096.php >> >> > > That is great. > > -- > Alexandre Leclerc
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
"Kevin Grittner"
Дата:
Alexandre Leclerc <aleclerc@ipso.ca> wrote: > The vacuum raised a "max_fsm_pages" of 142000 not enought and > stopped. That's probably just a warning that it wasn't able to track all the dead space -- I would expect that. You're going to want to clean up the bloat anyway. I would try a pg_dumpall at this point. If it works, you might just be in good shape. I would try hard to keep the old database when restoring it, and I recommend VACUUM ANALYZE after restoring it. Then I would do some sanity checks to make sure it looks like all the data is there and sound. If it doesn't let you in with normal connections because other databases are close to wrap-around, you'll have to vacuum those, but that should be fast. However, before any other vacuums on that cluster, I would try to copy the database off to some backup medium (with the service stopped). If you have time, it's a very good idea anyway. I hope it all works out. -Kevin
Alexandre Leclerc <aleclerc@ipso.ca> writes: > The vacuum raised a "max_fsm_pages" of 142000 not enought and stopped. That's just a warning that gets put out at the end of the run. Go on with vacuuming your other databases. Right now is no time to be worrying about FSM too small --- you need to get back to a running DB. regards, tom lane
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Alexandre Leclerc
Дата:
Le 2010-04-16 16:14, Tom Lane a écrit : > Alexandre Leclerc<aleclerc@ipso.ca> writes: > >> The vacuum raised a "max_fsm_pages" of 142000 not enought and stopped. >> > That's just a warning that gets put out at the end of the run. Go on > with vacuuming your other databases. Right now is no time to be > worrying about FSM too small --- you need to get back to a running DB. > > regards, tom lane > > Robin, Tom, Kevin, Thank you guys. I wanted to rush and vacuum the other tables and try, but I decided to make a copy. This is actually running. (Enough mistakes in one day to not take the time to do it.) After that we try to launch the DB and hopefully it will be working good enough before the next maintenance. Else, a big week-end is coming. As for restoring the old DB, if we must go there, we will keep a copy for sure (the copy we are making right now). Guys, thank you very much! Your help is invaluable. Best regards, -- Alexandre Leclerc
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Scott Marlowe
Дата:
On Fri, Apr 16, 2010 at 2:57 PM, Alexandre Leclerc <aleclerc@ipso.ca> wrote: > Thank you guys. I wanted to rush and vacuum the other tables and try, but I > decided to make a copy. This is actually running. (Enough mistakes in one > day to not take the time to do it.) > > After that we try to launch the DB and hopefully it will be working good > enough before the next maintenance. Else, a big week-end is coming. Don't forget to schedule an upgrade to at least 8.2 since 8.1 is considered broken and unsupported / unsupportable on windows. 8.3 changes some casting behaviour so you might want to wait until you can test / fix code to go there, but 8.2 is usually a painless upgrade from 8.1
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel &transaction "liberation"
От
"Joshua D. Drake"
Дата:
On Fri, 2010-04-16 at 14:47 -0400, Alexandre Leclerc wrote: > I did. :( Shame on me. I just realised while reading doc on postgres > that it is not made for that but only for a single instance at the time. > I hope I did not break anything. How in the world did you pull that off? PostgreSQL checks for that. Either way, if you actually managed to start two services against the same data directory, I hope you have a backup, you can restore from. Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564 Consulting, Training, Support, Custom Development, Engineering
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Alexandre Leclerc
Дата:
Le 2010-04-16 19:19, Scott Marlowe a écrit : > On Fri, Apr 16, 2010 at 2:57 PM, Alexandre Leclerc<aleclerc@ipso.ca> wrote: > >> Thank you guys. I wanted to rush and vacuum the other tables and try, but I >> decided to make a copy. This is actually running. (Enough mistakes in one >> day to not take the time to do it.) >> >> After that we try to launch the DB and hopefully it will be working good >> enough before the next maintenance. Else, a big week-end is coming. >> > Don't forget to schedule an upgrade to at least 8.2 since 8.1 is > considered broken and unsupported / unsupportable on windows. 8.3 > changes some casting behaviour so you might want to wait until you can > test / fix code to go there, but 8.2 is usually a painless upgrade > from 8.1 > Thank you Scott for this note. I wanted to give some feedback on the situation: - The vacuum completed after about 6-7 hours (by 16:15 PM). We immediately made a backup before proceeding further. Then we vacuumed the postgres et template1 databases. - We were then successful at restarting postmaster and working with the database. Everything works as expected. Among all the help received (thank you all) I want to specially thank Kevin and Tom for their excellent help and time through this urgent problem we had to fix. I want to underline their detailed and precise posts that were of great value for acting and taking decisions to recover the DB. Thank you guys! Again, thank you to all. Best regards, -- Alexandre Leclerc
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Alexandre Leclerc
Дата:
Le 2010-04-19 08:59, Alexandre Leclerc a écrit : > Le 2010-04-16 19:19, Scott Marlowe a écrit : >> On Fri, Apr 16, 2010 at 2:57 PM, Alexandre Leclerc<aleclerc@ipso.ca> >> wrote: >>> Thank you guys. I wanted to rush and vacuum the other tables and >>> try, but I >>> decided to make a copy. This is actually running. (Enough mistakes >>> in one >>> day to not take the time to do it.) >>> >>> After that we try to launch the DB and hopefully it will be working >>> good >>> enough before the next maintenance. Else, a big week-end is coming. >> Don't forget to schedule an upgrade to at least 8.2 since 8.1 is >> considered broken and unsupported / unsupportable on windows. 8.3 >> changes some casting behaviour so you might want to wait until you can >> test / fix code to go there, but 8.2 is usually a painless upgrade >> from 8.1 > > Thank you Scott for this note. > > I wanted to give some feedback on the situation: > > - The vacuum completed after about 6-7 hours (by 16:15 PM). We > immediately made a backup before proceeding further. Then we vacuumed > the postgres et template1 databases. > > - We were then successful at restarting postmaster and working with > the database. Everything works as expected. > > Among all the help received (thank you all) I want to specially thank > Kevin and Tom for their excellent help and time through this urgent > problem we had to fix. I want to underline their detailed and precise > posts that were of great value for acting and taking decisions to > recover the DB. Thank you guys! > > Again, thank you to all. > > Best regards, > I knew I missed one -- yet not forgotten! **Thank you Greg!** You were the first to post and it guided me for the immediate action I had to take. Best regards, -- Alexandre Leclerc
Re: Vacuum Full (PG 8.1) - Urgent help needed - Cancel & transaction "liberation"
От
Bruce Momjian
Дата:
Alexandre Leclerc wrote: > Thank you Scott for this note. > > I wanted to give some feedback on the situation: > > - The vacuum completed after about 6-7 hours (by 16:15 PM). We > immediately made a backup before proceeding further. Then we vacuumed > the postgres et template1 databases. > > - We were then successful at restarting postmaster and working with the > database. Everything works as expected. > > Among all the help received (thank you all) I want to specially thank > Kevin and Tom for their excellent help and time through this urgent > problem we had to fix. I want to underline their detailed and precise > posts that were of great value for acting and taking decisions to > recover the DB. Thank you guys! As another followup item, it would be good for your organization to get some Postgres training and a commercial support contract from someone who is paid to help you in such a crisis, or help you long before it becomes a crisis. These kind of problems make everyone look bad. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com