Thread: Forensic recovery deleted pgdump custom format file

Forensic recovery deleted pgdump custom format file

From
David Guimaraes
Date:
Hello. I need some help.

I have the following situation. My client deleted a number of old backups from a disk drive. They were made by pg_dump with the custom format flag enabled. I could not find any program that recovers backup files in pg_dump's custom (binary) format, so I decided to try to understand the file format that pg_dump generates. Analyzing the source code of pg_dump/pg_restore, I concluded a few things:

The header of the file always starts with "PGDMP", followed by the pg_dump version number used, followed by the int size, offset size, etc., followed by the TOC entries.
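
A minimal sketch of checking one candidate offset against that layout, assuming the byte order shown by WriteHead in pg_backup_archiver.c (magic, three version bytes, int size, offset size, format). The struct and function names are hypothetical, and the size limits are assumed sanity checks rather than anything pg_dump defines:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fixed-size leading fields, in the order WriteHead emits them. */
typedef struct {
    unsigned char vmaj, vmin, vrev; /* archive version bytes */
    unsigned char int_size;         /* bytes per stored integer */
    unsigned char off_size;         /* bytes per stored file offset */
    unsigned char format;           /* archive format code */
} DumpPrefix;

/*
 * Returns 1 and fills *out if buf starts with a plausible custom-format
 * prefix, 0 otherwise. Needs at least 5 magic bytes + 6 single bytes.
 */
int parse_dump_prefix(const unsigned char *buf, size_t len, DumpPrefix *out)
{
    assert(buf != NULL && out != NULL);

    if (len < 11)
        return 0;
    if (memcmp(buf, "PGDMP", 5) != 0)
        return 0;

    out->vmaj = buf[5];
    out->vmin = buf[6];
    out->vrev = buf[7];
    out->int_size = buf[8];
    out->off_size = buf[9];
    out->format = buf[10];

    /* Assumed limits, used only to reject stray "PGDMP" text on disk. */
    if (out->int_size == 0 || out->int_size > 8)
        return 0;
    if (out->off_size == 0 || out->off_size > 8)
        return 0;
    return 1;
}
```

A hit only says that a dump may start at that offset; it says nothing about where the file ends.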

My question is: how can I find the end of the file? Is there any signature that I can use? Or would I have to analyze the whole file?

Thank you.

--
David

Re: Forensic recovery deleted pgdump custom format file

From
Michael Paquier
Date:
On Tue, Jul 14, 2015 at 9:28 AM, David Guimaraes <skysbsb@gmail.com> wrote:
> So I decided to try to understand the file format generated by
> pgdump. Analyzing the source code of pgdump/recovery, i concluded a few
> things:
>
> The header of the file always starts with "PGDMP" followed by pgdump version
> number used, followed by int size, offset, etc. followed by TOCs.
>
> My question is how to know the end of the file? Are there any signature that
> I can use? Or would have to analyze the whole file?

Why are you trying to reinvent the wheel? pg_restore is not available?
-- 
Michael



Re: Forensic recovery deleted pgdump custom format file

From
David Guimaraes
Date:
The backups were deleted. I need them to use pg_restore.

On 13/07/2015 21:18, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> On Tue, Jul 14, 2015 at 9:28 AM, David Guimaraes <skysbsb@gmail.com> wrote:
> > So I decided to try to understand the file format generated by
> > pgdump. Analyzing the source code of pgdump/recovery, i concluded a few
> > things:
> >
> > The header of the file always starts with "PGDMP" followed by pgdump version
> > number used, followed by int size, offset, etc. followed by TOCs.
> >
> > My question is how to know the end of the file? Are there any signature that
> > I can use? Or would have to analyze the whole file?
>
> Why are you trying to reinvent the wheel? pg_restore is not available?
> --
> Michael

Re: Forensic recovery deleted pgdump custom format file

From
Michael Paquier
Date:
On Tue, Jul 14, 2015 at 10:58 AM, David Guimaraes <skysbsb@gmail.com> wrote:
> The backups were deleted. I need them to use pg_restore.

So what you mean is that you are looking at your disk at FS level to
find traces of those deleted backups by analyzing their binary
format... Am I missing something?
-- 
Michael



Re: Forensic recovery deleted pgdump custom format file

From
David Guimaraes
Date:
Yeah bingo

On 13/07/2015 22:11, "Michael Paquier" <michael.paquier@gmail.com> wrote:
> On Tue, Jul 14, 2015 at 10:58 AM, David Guimaraes <skysbsb@gmail.com> wrote:
> > The backups were deleted. I need them to use pg_restore.
>
> So what you mean is that you are looking at your disk at FS level to
> find traces of those deleted backups by analyzing their binary
> format... Am I missing something?
> --
> Michael

Re: Forensic recovery deleted pgdump custom format file

From
Michael Paquier
Date:
On Tue, Jul 14, 2015 at 11:20 AM, David Guimaraes <skysbsb@gmail.com> wrote:
> Yeah bingo

Hm. While there is a magic-code header for the custom format, by
looking at the code I am not seeing any trace of a similar thing at
the end of the dump file (_CloseArchive in pg_backup_custom.c), and I
don't recall that there is an estimate of the size of the dump in the
header either. If those files were stored close to each other, one
idea may be to look for the next header present, or to attempt to
roughly estimate the size that they would have, I am afraid.
In any case, applying reverse engineering methods seems like the most
reliable method to reconstitute an archive handler that could be used
by pg_restore or pg_dump, but perhaps others have other ideas.
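
The "look for the next header" idea could be sketched roughly as below. This is only an illustration under a strong assumption: the deleted dumps were stored contiguously and unfragmented, so the next "PGDMP" magic bounds the previous file. The function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Stores up to max_hits offsets at which the 5-byte magic "PGDMP"
 * occurs in img[0..len) and returns how many were stored.
 */
size_t find_dump_starts(const unsigned char *img, size_t len,
                        size_t *hits, size_t max_hits)
{
    const size_t mlen = 5;
    size_t n = 0;

    assert(img != NULL && hits != NULL);

    for (size_t i = 0; i + mlen <= len && n < max_hits; i++) {
        if (img[i] == 'P' && memcmp(img + i, "PGDMP", mlen) == 0)
            hits[n++] = i;
    }
    return n;
}

/*
 * Upper bound on the length of candidate k: the distance to the next
 * header, or to the end of the image for the last candidate.
 */
size_t candidate_max_len(const size_t *hits, size_t nhits,
                         size_t k, size_t len)
{
    assert(hits != NULL && k < nhits);
    return (k + 1 < nhits ? hits[k + 1] : len) - hits[k];
}
```

Each bound is an upper limit only: the real file may end earlier, and the magic string can also occur by chance inside unrelated data.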
-- 
Michael



Re: Forensic recovery deleted pgdump custom format file

From
David Guimaraes
Date:
Yes Michael, I agree.

This is the CloseArchive function in pg_backup_custom.c:

WriteHead(AH);
tpos = ftello(AH->FH);
WriteToc(AH);
ctx->dataStart = _getFilePos(AH, ctx);
WriteDataChunks(AH);

This is the WriteHead function in pg_backup_archiver.c:

(*AH->WriteBufPtr) (AH, "PGDMP", 5); /* Magic code */
(*AH->WriteBytePtr) (AH, AH->vmaj);
(*AH->WriteBytePtr) (AH, AH->vmin);
(*AH->WriteBytePtr) (AH, AH->vrev);
(*AH->WriteBytePtr) (AH, AH->intSize);
(*AH->WriteBytePtr) (AH, AH->offSize);
(*AH->WriteBytePtr) (AH, AH->format);
WriteInt(AH, AH->compression);
crtm = *localtime(&AH->createDate);
WriteInt(AH, crtm.tm_sec);
WriteInt(AH, crtm.tm_min);
WriteInt(AH, crtm.tm_hour);
WriteInt(AH, crtm.tm_mday);
WriteInt(AH, crtm.tm_mon);
WriteInt(AH, crtm.tm_year);
WriteInt(AH, crtm.tm_isdst);
WriteStr(AH, PQdb(AH->connection));
WriteStr(AH, AH->public.remoteVersionStr);
WriteStr(AH, PG_VERSION);

There is no mention of the file size or anything like it in the header.
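
To read the fields that follow the six single bytes, the integer encoding matters. A hedged sketch, assuming WriteInt stores one sign byte followed by intSize magnitude bytes, least significant first; that is my reading of pg_backup_archiver.c and should be verified against the source of the pg_dump version that wrote the files. read_archive_int is a hypothetical helper name:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Decodes one integer at buf[*pos], assuming the encoding is one sign
 * byte (0 = non-negative, nonzero = negative) followed by int_size
 * magnitude bytes, least significant first. On success stores the value
 * in *out, advances *pos past the field, and returns 1. Returns 0 if the
 * buffer is too short or the value does not fit.
 */
int read_archive_int(const unsigned char *buf, size_t len, size_t *pos,
                     unsigned int_size, long long *out)
{
    assert(buf != NULL && pos != NULL && out != NULL);

    if (int_size == 0 || int_size > sizeof(long long))
        return 0;
    if (*pos > len || len - *pos < (size_t)int_size + 1)
        return 0;

    int negative = buf[*pos] != 0;
    unsigned long long mag = 0;

    for (unsigned b = 0; b < int_size; b++)
        mag |= (unsigned long long)buf[*pos + 1 + b] << (8 * b);

    if (mag > (unsigned long long)LLONG_MAX)
        return 0;

    *out = negative ? -(long long)mag : (long long)mag;
    *pos += (size_t)int_size + 1;
    return 1;
}
```

Under the same assumption, the compression value and the seven tm fields in the WriteHead listing would each occupy intSize + 1 bytes, starting right after the six single-byte fields.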

WriteToc, however, writes the number of TOC structs at the beginning:

void WriteToc(ArchiveHandle *AH) {
...
WriteInt(AH, tocCount);

But these structs are dynamic (a linked list), so there is no way to know the size of each one.

In the definition of the tocEntry struct, there is no reference to a size or anything like that. It is a linked list with a count.

And at the end, the CloseArchive function calls WriteDataChunks to write the blob information. I don't understand what this function is doing. Does it save the size of the blob data at the beginning?

(*te->dataDumper) ((Archive *) AH, te->dataDumperArg);

What does this function do?



David


--
David Gomes Guimarães