Re: Accidental removal of a file causing various problems
From | Tom Lane |
---|---|
Subject | Re: Accidental removal of a file causing various problems |
Date | |
Msg-id | 23318.1535136399@sss.pgh.pa.us |
In reply to | Accidental removal of a file causing various problems (Pavan Deolasee <pavan.deolasee@gmail.com>) |
Responses | Re: Accidental removal of a file causing various problems |
List | pgsql-hackers |
Pavan Deolasee <pavan.deolasee@gmail.com> writes:
> 1. The user soon found out that they can no longer connect to any database
> in the cluster. Not just the one to which the affected table belonged, but
> no other database in the cluster. The affected table is a regular user
> table (actually a toast table).

Please define "can no longer connect". What happened *exactly*?
How long did it take to start failing like that (was this perhaps a
shutdown-because-of-impending-wraparound situation)?

> 2. So they restarted the database server. While that fixed the connection
> problem, they started seeing toast errors on the table to which the missing
> file belonged to. The missing file was recreated at the database restart,
> but of course it was filled in with all zeroes, causing data corruption.

Doesn't seem exactly surprising, if some toast data went missing.

> 3. To make things worse, the corruption then got propagated to the standbys
> too. We don't know if the original file removal was replicated to the
> standby, but it seems unlikely.

This is certainly unsurprising.

> I've a test case that reproduce all of these effects if a backend file is
> forcefully removed,

Let's see it.

Note that this:

> WARNING: could not write block 27094010 of base/56972584/56980980
> DETAIL: Multiple failures --- write error might be permanent.
> ERROR: could not open file "base/56972584/56980980.69" (target block
> 27094010): previous segment is only 12641 blocks
> CONTEXT: writing block 27094010 of relation base/56972584/56980980

does not say that the .69 file is missing. It says that .68 (or, maybe,
some even-earlier segment) was smaller than 1GB, which is a different
matter. Still data corruption, but I don't think I believe it was a
stray "rm".

Oh, and what PG version are we talking about?

regards, tom lane
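
The segment arithmetic behind that reading of the error can be sketched as follows. This is a minimal illustration, not the server's actual code: it assumes the default 8192-byte block size (the thread does not state the build settings), takes the 1GB segment size from the message above, and uses hypothetical helper names `locate_block` and `first_short_segment`.

```python
# Assumption: default block size of 8192 bytes; the build in question may differ.
BLCKSZ = 8192
# Segment size of 1GB, as mentioned in the message.
SEGMENT_BYTES = 1 << 30
# Number of blocks in a full segment under these assumptions.
RELSEG_SIZE = SEGMENT_BYTES // BLCKSZ


def locate_block(blkno, relseg_size=RELSEG_SIZE):
    """Return (segment number, block offset within that segment) for a block."""
    return divmod(blkno, relseg_size)


def first_short_segment(segment_blocks, target_segno, relseg_size=RELSEG_SIZE):
    """Return the first segment before target_segno holding fewer than
    relseg_size blocks, or None if every earlier segment is full.

    segment_blocks is a hypothetical mapping of segment number to the
    number of blocks present on disk; a missing entry counts as 0 blocks.
    """
    for segno in range(target_segno):
        if segment_blocks.get(segno, 0) < relseg_size:
            return segno
    return None


if __name__ == "__main__":
    target_segno, offset = locate_block(27094010)
    print("blocks per segment:", RELSEG_SIZE)
    print("target segment:", target_segno, "offset:", offset)

    # Model a relation whose segment 68 holds only 12641 blocks.
    on_disk = {segno: RELSEG_SIZE for segno in range(target_segno)}
    on_disk[68] = 12641
    print("first short segment:", first_short_segment(on_disk, target_segno))
```

Under these assumptions the target block lies far beyond segment 69, so a complaint naming the .69 file points at a short segment earlier in the chain rather than at a missing .69 file.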