Re: reducing IO and memory usage: sending the content of a table to multiple files
From | Sam Mason |
---|---|
Subject | Re: reducing IO and memory usage: sending the content of a table to multiple files |
Date | |
Msg-id | 20090402162755.GM12225@frubble.xen.chris-lamb.co.uk |
In reply to | reducing IO and memory usage: sending the content of a table to multiple files (Ivan Sergio Borgonovo <mail@webthatworks.it>) |
Responses | Re: reducing IO and memory usage: sending the content of a table to multiple files |
List | pgsql-general |
On Thu, Apr 02, 2009 at 11:20:02AM +0200, Ivan Sergio Borgonovo wrote:
> This is the work-flow I've in mind:
>
> 1a) take out *all* data from a table in chunks (M record for each
> file, one big file?) (\copy??, from inside a scripting language?)

What about using cursors here?

> 2a) process each file with awk to produce N files very similar each
> other (substantially turn them into very simple xml)
> 3a) gzip them

GZIP uses significant CPU time; there are various lighter-weight
schemes available that may be better depending on where this data is
going.

> 2b) use any scripting language to process and gzip them avoiding a
> bit of disk IO

What disk IO are you trying to save and why?

> Does PostgreSQL offer me any contrib, module, technique... to save
> some IO (and maybe disk space for temporary results?).
>
> Are there any memory usage implication if I'm doing a:
> pg_query("select a,b,c from verylargetable; --no where clause");
> vs.
> the \copy equivalent
> any way to avoid them?

As far as I understand it, pg_query() will pull all the data from the
database into memory first, and only then does your code get a chance
to run.  For large datasets this obviously doesn't work well.  CURSORs
are your friend here.

--
  Sam  http://samason.me.uk/
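As an illustration of the cursor approach suggested above, here is a
minimal sketch in plain SQL. The table and column names are taken from
Ivan's pg_query() example; the cursor name and the chunk size of 10000
rows are arbitrary assumptions for the sketch:

    -- Run inside a transaction; non-holdable cursors only live until COMMIT.
    BEGIN;

    DECLARE big_cur NO SCROLL CURSOR FOR
        SELECT a, b, c FROM verylargetable;

    -- Repeat the FETCH until it returns no rows; each call hands the
    -- client one chunk that can be transformed and gzipped before the
    -- next chunk is requested, so the whole table never sits in memory.
    FETCH FORWARD 10000 FROM big_cur;

    CLOSE big_cur;
    COMMIT;

Alternatively, COPY (SELECT a, b, c FROM verylargetable) TO STDOUT (or
\copy from psql) streams rows as they are produced rather than
materialising the full result set on the client, which may be simpler
if the splitting into N files is done afterwards in the script.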