Re: huge price database question..
| From | David Kerr |
|---|---|
| Subject | Re: huge price database question.. |
| Date | |
| Msg-id | 4F6936E8.3040507@mr-paradox.net |
| In response to | Re: huge price database question.. (Jim Green <student.northwestern@gmail.com>) |
| Responses | Re: huge price database question.. |
| List | pgsql-general |
On 03/20/2012 06:50 PM, Jim Green wrote:
> On 20 March 2012 21:40, David Kerr <dmk@mr-paradox.net> wrote:
>> On 03/20/2012 04:27 PM, Jim Green wrote:
>>
>> Greetings list!
>> I am pretty new to PostgreSQL, coming from MySQL, and did a fairly
>> extensive search of the list. I came up with a few good threads, but
>> didn't find the exact situation I have now, so I am venturing to ask here.
>>
>> I have daily minute stock price data from 2005 on, each day with the
>> columns timestamp, open, high, low, close, volume and a few more. Each
>> day's data is about 1.2 million rows. I want to import all of it into
>> PostgreSQL and analyze it in R with the help of RPostgreSQL.
>>
>> Right now I have about 7000 tables, one per stock, and I use Perl to do
>> the inserts; it's very slow. I would like to use COPY or another bulk
>> loading tool to load the daily raw gz data, but I would need to split the
>> file into per-stock files first before bulk loading, which I consider a
>> bit messy.
>>
>> I would like advice on the following idea: store everything in one big
>> table, partitioned by month (this gives a reasonable number of
>> partitions), and bulk load the daily file. My queries would consist
>> mostly of selects on a particular symbol on a particular day.
>>
>> Also, in the future I will import daily data into the db every day.
>>
>> My hardware is 16G RAM, 4x 5400rpm RAID 10 with enough space.
>>
>> Thanks!
>>
>> Jim.
>>
>>
>> Seems like you'd want to do this?
>> http://search.cpan.org/~turnstep/DBD-Pg-2.19.2/Pg.pm#pg_putcopydata
>>
>> COPY support
>>
>> DBD::Pg allows for quick (bulk) reading and storing of data by using the
>> COPY command. The basic process is to use $dbh->do to issue a COPY
>> command, and then either add rows using "pg_putcopydata" or read them
>> using "pg_getcopydata".
>
> Thanks! Would you comment on the table setup as well?
>
> Jim.

\copy on 1.2 million rows should only take a minute or two. You could also
make that table "unlogged" to speed it up further. If you can
truncate / drop / create / load / then index the table each time, you'll
get the best throughput.

Dave
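On the table setup Jim asks about, here is a minimal sketch of the "one big
table, partitioned by month" layout he describes, using the inheritance-based
partitioning available in the PostgreSQL releases current at the time (9.1-era).
The database, table, column, and index names (quotes, minute_bars, etc.) are
illustrative assumptions, not from the thread. Note that inheritance does not
route inserts automatically, so each month's rows have to be loaded into the
matching child table (or a routing trigger has to be added).

```perl
#!/usr/bin/perl
# Hypothetical sketch of a monthly-partitioned layout for minute bars.
# All object names and the connection string are assumptions.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=quotes', 'jim', '',
                       { AutoCommit => 1, RaiseError => 1 });

# Parent table; with inheritance-based partitioning it holds no rows itself.
$dbh->do(q{
    CREATE TABLE IF NOT EXISTS minute_bars (
        symbol text        NOT NULL,
        ts     timestamptz NOT NULL,
        open   numeric,
        high   numeric,
        low    numeric,
        close  numeric,
        volume bigint
    )
});

# One child table per month. The CHECK constraint lets constraint_exclusion
# skip the other months when a query filters on a particular day.
# (CREATE UNLOGGED TABLE is an option here for faster loading, per David's
# note, at the cost of crash safety.)
$dbh->do(q{
    CREATE TABLE IF NOT EXISTS minute_bars_2012_03 (
        CHECK (ts >= '2012-03-01' AND ts < '2012-04-01')
    ) INHERITS (minute_bars)
});

# Index matching the typical query (one symbol on one day). Per David's
# advice, build it after the month's data is loaded, not before.
$dbh->do(q{
    CREATE INDEX minute_bars_2012_03_symbol_ts_idx
        ON minute_bars_2012_03 (symbol, ts)
});

$dbh->disconnect;
```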
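And a companion sketch of the bulk load David points to: issue COPY ... FROM
STDIN through DBD::Pg, then stream the raw gzipped daily file with
pg_putcopydata so it never has to be split into per-stock files. The file
name, CSV layout, and column order are assumptions for illustration.

```perl
#!/usr/bin/perl
# Hypothetical sketch: bulk-load one day's file via DBD::Pg's COPY support.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=quotes', 'jim', '',
                       { AutoCommit => 0, RaiseError => 1 });

# Start a COPY ... FROM STDIN. With inheritance partitioning the rows go
# straight into the child table for the right month.
$dbh->do(q{
    COPY minute_bars_2012_03 (symbol, ts, open, high, low, close, volume)
    FROM STDIN WITH (FORMAT csv)
});

# Stream the gzipped daily file; assumes one CSV row per line in this order.
open my $fh, '-|', 'gzip', '-dc', '20120320.csv.gz'
    or die "cannot spawn gzip: $!";
while (my $row = <$fh>) {
    $dbh->pg_putcopydata($row);   # send one CSV line per call
}
close $fh;

$dbh->pg_putcopyend;              # finish the COPY
$dbh->commit;
$dbh->disconnect;
```

Combined with David's suggestions (an unlogged or freshly created child table,
index built after the load), each daily import then comes down to a single COPY.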