Re: pg_dump additional options for performance
| From | Simon Riggs |
|---|---|
| Subject | Re: pg_dump additional options for performance |
| Date | |
| Msg-id | 1204031111.4252.255.camel@ebony.site |
| In reply to | Re: pg_dump additional options for performance ("Tom Dunstan" <pgsql@tomd.cc>) |
| Responses | Re: pg_dump additional options for performance |
| List | pgsql-hackers |
On Tue, 2008-02-26 at 18:19 +0530, Tom Dunstan wrote:
> On Tue, Feb 26, 2008 at 5:35 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> > On Tue, 2008-02-26 at 12:46 +0100, Dimitri Fontaine wrote:
> > > As a user I'd really prefer all of this to be much more transparent, and could
> > > well imagine the -Fc format to be some kind of TOC + zip of table data + post
> > > load instructions (organized per table), or something like this.
> > > In fact just what you described, all embedded in a single file.
> >
> > If it's in a single file then it won't perform as well as if it's separate
> > files. We can put separate files on separate drives. We can begin
> > reloading one table while another is still unloading. The OS will
> > perform readahead for us on separate files, whereas on one file it will
> > look like random I/O, etc.
>
> Yeah, writing multiple unknown-length streams to a single file in
> parallel is going to be all kinds of painful, and this use case seems
> to be the biggest complaint against a zip file kind of approach. I
> didn't know about the custom file format when I suggested the zip file
> one yesterday*, but a zip or equivalent has the major benefit of
> allowing the user to do manual inspection / tweaking of the dump,
> because the file format is one that can be manipulated by standard
> tools. And zip wins over tar because it's indexed - if you want to
> extract just the schema and hack on it, you don't need to touch your
> multi-GBs of data.
>
> Perhaps a compromise: we specify a file system layout for table data
> files, pre/post scripts and other metadata that we want to be made
> available to pg_restore. By default, it gets dumped into a zip file /
> whatever, but a user who wants to get parallel unloads can pass a flag
> that tells pg_dump to stick it into a directory instead, with exactly
> the same file layout. Or how about this: if the filename given to
> pg_dump is a directory, spit out files in there; otherwise
> create/overwrite a single file.
>
> While it's a bit fiddly, putting data on separate drives would then
> involve something like symlinking the table name inside the dump dir
> off to an appropriate mount point, but that's probably not much worse
> than running n different pg_dump commands specifying different files.
> Heck, if you've got lots of data and want very particular behavior,
> you've got to specify it somehow. :)

Separate files seems much simpler...

--
Simon Riggs
2ndQuadrant  http://www.2ndQuadrant.com
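For readers following the thread, here is a minimal sketch of the "running n different pg_dump commands specifying different files" baseline that Tom mentions: one pg_dump per table, each writing its data to its own file so the files can sit on separate drives. The table names, output paths, and connection setup are illustrative assumptions, not anything from the thread; only standard pg_dump flags (`--data-only`, `--table`, `--file`) are used, and this is not the proposed directory layout itself.

```python
# Sketch: run one pg_dump process per table in parallel, each writing its own file.
# Assumes pg_dump is on PATH and libpq environment variables (PGHOST, PGUSER,
# PGDATABASE, ...) point at the source database; table and path names are hypothetical.
import subprocess
from concurrent.futures import ThreadPoolExecutor

TABLES_TO_PATHS = {
    "orders":     "/mnt/disk1/orders.sql",      # place big tables on separate spindles
    "line_items": "/mnt/disk2/line_items.sql",
    "customers":  "/mnt/disk3/customers.sql",
}

def dump_table(table: str, path: str) -> None:
    """Dump a single table's data to its own plain-text file."""
    subprocess.run(
        ["pg_dump", "--data-only", "--table", table, "--file", path],
        check=True,
    )

# One pg_dump per table: the OS can do sequential readahead/writeback on each
# file independently, which is the performance point argued above.
with ThreadPoolExecutor(max_workers=len(TABLES_TO_PATHS)) as pool:
    futures = [pool.submit(dump_table, t, p) for t, p in TABLES_TO_PATHS.items()]
    for f in futures:
        f.result()  # re-raise any pg_dump failure
```

Note that independent pg_dump processes each take their own snapshot, so the per-table files are not guaranteed to be mutually consistent; that is one reason to prefer teaching a single pg_dump run to emit a directory of per-table files, as discussed in the thread.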