Thread: PostgreSQL Data Loss


PostgreSQL Data Loss

From
BluDes
Date:
Hi everyone, I have a problem with one of my customers.
I made a program that uses a PostgreSQL (win32) database to save its data.
My customer claims that he lost a lot of data regarding his own clients
and that the data had definitely been saved to the database.
My first guess is that he deleted the data himself but wants to
blame someone else. Obviously, I can't prove it.

Is it possible for PostgreSQL to lose data, perhaps through file
corruption? And is it possible to restore the data?

My program does not modify or delete data, since it's more like a log that
only adds information. It is of course possible to delete these logs, but
doing so requires answering "yes" to 2 different warnings, so the data can't
be deleted accidentally.

I have other customers with as much as 10 times the amount of data of the one
who claimed the loss, and they have had no problems.
He obviously made no backups (and claims we never told him to make them,
so we are responsible even for this), though the program has a dedicated
Backup section.

Any suggestion?

Daniele


Re: PostgreSQL Data Loss

From
Heikki Linnakangas
Date:
BluDes wrote:
> I made a program that uses a PostgreSQL (win32) database to save its data.

What version of PostgreSQL is this?

> My customer claims that he lost lots of data regarding his own clients
> and that those data had surely been saved on the database.
> My first guess is that he is the one who deleted the data but wants to 
> blame someone else, obviously I can't prove it.

Did he lose all data in one table, or just some rows? Or is there some 
other pattern?

> Could it be possible for PostgreSQL to lose its data? 

Not when properly installed.

> Maybe with a file corruption? 

I doubt it. You'd almost certainly get warnings or errors if there's 
corruption.

> Could it be possible to restore these data?

The first thing to do is to take a filesystem-level physical copy of the 
data directory to prevent further damage. Copy the data directory to 
another system for forensics.
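
A minimal sketch of that copy step, assuming the PostgreSQL service has been stopped first (a copy taken from a running server may be inconsistent). The function names and the timestamped folder name are illustrative, not part of any PostgreSQL tool:

```python
import hashlib
import shutil
from datetime import datetime
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of one file."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot_data_dir(data_dir: Path, dest_root: Path) -> Path:
    """Copy data_dir into a new timestamped folder under dest_root and verify it.

    Assumption: the database server is stopped while this runs.
    """
    dest = dest_root / f"pgdata-copy-{datetime.now():%Y%m%d-%H%M%S}"
    shutil.copytree(data_dir, dest)
    # Compare every source file against its copy so a silent copy error is caught.
    for src in data_dir.rglob("*"):
        if src.is_file():
            copy = dest / src.relative_to(data_dir)
            if file_digest(src) != file_digest(copy):
                raise RuntimeError(f"copy differs from source: {src}")
    return dest
```

All further experiments should then be run against the copy, never against the customer's original directory.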

You might be able to get a picture of what happened by looking at the 
WAL logs using the xlogviewer tool in pgfoundry.

You can also modify the PostgreSQL source code so that it also shows row
versions marked as deleted, and recover the deleted data that way. I can't
remember exactly how to do it; maybe others who have done it can fill
in. A row stays physically in the file until the table is vacuumed;
hopefully it hasn't been.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: PostgreSQL Data Loss

From
Zdenek Kotala
Date:
If data are deleted, they are still stored in the database until VACUUM
cleans them up. You can look with a hex viewer to see whether some known text
data is there. I think there is also a tool that dumps the tuple list
from pages.
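
As a rough stand-in for paging through a hex viewer, here is a sketch that scans every file under a copy of the data directory for a known byte string. It assumes the value is stored uncompressed and that you know its encoding; `find_text_in_files` is a hypothetical helper, not an existing tool:

```python
from pathlib import Path


def find_text_in_files(data_dir: Path, needle: bytes, chunk_size: int = 1 << 20):
    """Yield (file, byte offset) for every occurrence of needle under data_dir."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        with path.open("rb") as f:
            base = 0   # file offset of buf[0]
            buf = b""
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                buf += chunk
                start = 0
                while (hit := buf.find(needle, start)) != -1:
                    yield path, base + hit
                    start = hit + 1
                # Keep a short tail so a match straddling two chunks is still found.
                keep = buf[-overlap:] if overlap else b""
                base += len(buf) - len(keep)
                buf = keep
```

A hit only shows that the bytes are still on disk; it does not tell you whether the row is live or deleted.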

You can also see deleted data if you change the current transaction ID, but
I am not sure whether that is easy to do.

Before experimenting, do not forget to back up the database files.

Zdenek




Re: PostgreSQL Data Loss

From
"J. Andrew Rogers"
Date:
On Jan 26, 2007, at 2:22 AM, BluDes wrote:
>  I have a problem with one of my customers.
> I made a program that uses a PostgreSQL (win32) database to save  
> its data.
> My customer claims that he lost lots of data regarding his own
> clients and that those data had surely been saved on the database.
> My first guess is that he is the one who deleted the data but wants  
> to blame someone else, obviously I can't prove it.
>
> Could it be possible for PostgreSQL to lose its data? Maybe with a  
> file corruption? Could it be possible to restore these data?
>
> My program does not modify or delete data since it's more like a log
> that only adds information. It is obviously possible to delete  
> these logs but it requires to answer "yes" to 2 different warnings,  
> so the data can't be deleted accidentally.
>
> I have other customers with even 10 times the amount of data of the  
> one who claimed the loss but no problems with them.
> He obviously made no backups (and claims we never told him to do
> them so we are responsible even for this) though the program has a  
> dedicated Backup-section.


I have seen this data loss pattern many, many times, and on Oracle  
too.  The most frequent culprits in my experience:

1.)  The customer screwed up big time and does not want to admit that  
they made a mistake, hoping you can somehow pull their butt out of  
the fire for free.

2.)  Someone else sabotaged or messed up the customer's database, and
the customer is not aware of it.

3.)  The customer deleted their own data and is oblivious to the fact  
that they are responsible.

4.)  There is some rare edge case in your application that generates  
SQL that deletes all the data.


There is always the possibility that there is in fact some data loss  
due to a failure of the database, but it is a rare kind of corruption  
that deletes a person's data but leaves everything else intact with  
no error messages, warnings, or other indications that something is  
not right.  Given the description of the problem, I find an internal  
failure of the database to be a low probability reason for the data  
loss.


Having run many database systems that had various levels of pervasive  
internal change auditing/versioning, often unbeknownst to the casual  
user, virtually all of the "data loss" cases I've seen with a
description like the one above clearly fit cases #1-3 once
we went into the audit logs, i.e. someone explicitly did the
deleting.  I cannot tell you how many times people have tried to  
pretend that the database "lost" or "messed up" their data and then  
been embarrassed when they discover that I can step through every  
single action they took to destroy their own data.  I've never seen a  
single case like the one described above that was due to an internal  
database failure; when there is an internal database failure, it is  
usually ugly and obvious.

Cheers,

J. Andrew Rogers
jrogers@neopolitan.com





Re: PostgreSQL Data Loss

From
Andrew Dunstan
Date:
BluDes wrote:
> Hi everyone,
>  I have a problem with one of my customers.
> I made a program that uses a PostgreSQL (win32) database to save its 
> data.
> My customer claims that he lost lots of data regarding his own
> clients and that those data had surely been saved on the database.
> My first guess is that he is the one who deleted the data but wants to 
> blame someone else, obviously I can't prove it.
>
> Could it be possible for PostgreSQL to lose its data? Maybe with a 
> file corruption? Could it be possible to restore these data?
>
> My program does not modify or delete data since it's more like a log
> that only adds information. It is obviously possible to delete these 
> logs but it requires to answer "yes" to 2 different warnings, so the 
> data can't be deleted accidentally.
>
> I have other customers with even 10 times the amount of data of the 
> one who claimed the loss but no problems with them.
> He obviously made no backups (and claims we never told him to do them
> so we are responsible even for this) though the program has a 
> dedicated Backup-section.
>
> Any suggestion?
>
>

This isn't any sort of report that can be responded to. We need to know 
what has happened to the machine, what is in the server logs, what are 
the symptoms of data loss. The most likely explanations are pilot error 
and hardware error.

cheers

andrew



Re: PostgreSQL Data Loss

From
Gregory Stark
Date:
"BluDes" <DESPAMMAMIdarocchi@PERFAVOREtiscali.it> writes:

> My customer claims that he lost lots of data regarding his own clients and
> that those data had surely been saved on the database.

Has this Postgres database been running for a long time? There is a regular
job called VACUUM that has to be run on every table periodically to recover
free space. 

If this isn't run for a very long time (how long depends on how busy the
database is, but even on extremely large databases it's usually a matter of
months; on more normal databases it would be years), then very old records seem
to suddenly disappear. There is a way to recover data that this has happened
to, though, as long as you don't run vacuum after the data has disappeared.

To repeat: If you think this may have happened DO NOT run vacuum now. 

Do you think this may have happened? How long ago was this database created?
Does your system periodically run VACUUM? Is the missing data in every table
or just a particular table?

Incidentally, recent versions of Postgres don't allow this to occur; they stop
running with a message insisting you run vacuum before continuing.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Re: PostgreSQL Data Loss

From
Martijn van Oosterhout
Date:
On Sat, Jan 27, 2007 at 12:11:59AM +0000, Gregory Stark wrote:
> If this isn't run for a very long time (how long depends on how busy the
> database is, but even on extremely large databases it's usually a matter of
> months, on more normal databases it would be years) then very old records seem
> to suddenly disappear. There is a way to recover data that this has happened
> to though as long as you don't run vacuum after the data has disappeared.
>
> To repeat: If you think this may have happened DO NOT run vacuum now.

Actually, for XID wraparound a VACUUM may be the right thing.
I looked at this (with guidance from Tom) and we came to the conclusion
that XID wraparound will hide tuples older than 2 billion transactions,
but VACUUM will mark as frozen anything newer than 3 billion
transactions, so for 1 billion transactions you can actually get your
data back.

Except for things like uniqueness guarantees, but they're solvable.
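
To illustrate why rows vanish at the 2 billion mark: transaction IDs are 32-bit counters compared modulo 2^32, so an ID more than 2^31 transactions behind the current one starts to look as if it were in the future. The sketch below is a toy model of that comparison only; it ignores frozen rows and the reserved XID values, and `row_visible` is a deliberately simplified, hypothetical rule rather than PostgreSQL's real visibility check:

```python
XID_SPACE = 1 << 32   # transaction IDs are 32-bit counters
HALF = 1 << 31        # the "2 billion" horizon


def xid_precedes(a: int, b: int) -> bool:
    """True if XID a counts as older than XID b under modulo-2^32 comparison."""
    diff = (a - b) % XID_SPACE
    return diff >= HALF   # same as: signed 32-bit (a - b) is negative


def row_visible(row_xmin: int, current_xid: int) -> bool:
    """Toy rule: a committed row is visible if its inserting XID is in the past."""
    return xid_precedes(row_xmin, current_xid)


row_xmin = 1_000                                          # row inserted long ago
recent = 1_000_000                                        # well within 2^31 transactions
past_horizon = (row_xmin + HALF + 5) % XID_SPACE          # just over 2^31 later

print(row_visible(row_xmin, recent))        # True: the row is in the past
print(row_visible(row_xmin, past_horizon))  # False: the row now looks "future"
```

The row's bytes are untouched in both cases; only the result of the comparison changes.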

Not that I'm saying that the OP has this issue...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: PostgreSQL Data Loss

From
desrocchi@gmail.com
Date:

On 27 Jan, 06:31, klep...@svana.org (Martijn van Oosterhout) wrote:

> > To repeat: If you think this may have happened DO NOT run vacuum now.
>
> Actually, for XID wraparound a VACUUM may actually be the right thing.
> I looked at this (with guidance from Tom) and we came to the conclusion
> that XID wraparound will hide tuples older than 2 billion transactions,
> but VACUUM will mark as frozen anything newer than 3 billion
> transactions, so for 1 billion transactions you can actually get your
> data back.
>
> Except for things like uniqueness guarantees, but they're solvable.

Hello, thank you all for the help.
@Andrew Dunstan: this is the first time I've had this kind of
problem with PostgreSQL; I'm sorry I didn't provide all the needed
information.
Let me try to fill in some of it:
- the PostgreSQL version is 8.1.4-1
- as far as I know, nothing happened to the machine. I work near
Milan, and my customer is somewhere between Rome and Tuscany. It
would be a long journey to retrieve a PC that he surely won't give us.
- The server logs... huh? Never heard of them... or rather, never
needed them. Where can I find them?

There is an even more foolish explanation for all of this, but my
customer denied that it happened:
in my program it is possible to deactivate the auto-save function for
the work done. Without this option the user has to click the button
himself to store the data in the database... so it could even be that
I'm trying to find data that was never saved in the first place.

Anyway this teaches me that I have to put logs in my programs to trace 
every single time the users change settings.
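
A minimal sketch of what such a settings trail could look like, using Python's standard `logging` module. The class name, the log line format and the `auto_save` setting are illustrative assumptions, not taken from the actual program:

```python
import logging


def file_audit_logger(path: str) -> logging.Logger:
    """Build a logger that appends timestamped lines to path (call once at startup)."""
    logger = logging.getLogger("settings_audit")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    return logger


class AuditedSettings:
    """Settings store that records who changed which setting, and from what to what."""

    def __init__(self, audit_log: logging.Logger, user: str, **initial):
        self._audit_log = audit_log
        self._user = user
        self._values = dict(initial)

    def get(self, key):
        return self._values[key]

    def set(self, key, new_value):
        old_value = self._values.get(key)
        if old_value == new_value:
            return  # nothing changed, nothing to record
        self._values[key] = new_value
        self._audit_log.info(
            "user=%s setting=%s old=%r new=%r",
            self._user, key, old_value, new_value,
        )
```

With something like `AuditedSettings(file_audit_logger("settings.log"), user="mario", auto_save=True)`, a later `set("auto_save", False)` would leave a dated line showing exactly when auto-save was switched off.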

Bye,
Daniele