Loading speed
From: Cath Lawrence
Subject: Loading speed
Date:
Msg-id: FC0064C3-0436-11D8-B7DB-00039390F614@anu.edu.au
Responses: Re: Loading speed, Re: Loading speed
List: pgsql-novice
Hi again,

Is there an FAQ or something I can read about optimising the speed of insertions and queries? I am doing some simple-minded stuff at the moment with pgsql, and my insertions are very, very slow.

A rough idea of the problem here for background: a protein has a small number of chains ('A', 'B', etc.), each of which has N2 amino acids, each of which has N3 atoms. Now I've got lots and lots of atoms. Hundreds of thousands most likely, maybe a million or two, I haven't counted. How can I most efficiently load these?

The atom table is a bit like this, with some language changes:

    CREATE TABLE atom (
        id            INTEGER PRIMARY KEY DEFAULT nextval('atom_id_seq'),
        protein_id    CHAR(4) REFERENCES protein,
        chain_id      CHAR(1) NOT NULL,
        amino_acid_id INTEGER REFERENCES amino_acid,
        atom_type     TEXT,
        element       TEXT,
        x             FLOAT,
        y             FLOAT,
        z             FLOAT
    );

Now the REFERENCES bits build triggers, and in each case the column refers to the primary key of the other table. This is always indexed, yes? When I construct the table it says:

    CREATE TABLE / PRIMARY KEY will create implicit index 'amino_acid_pkey' for table 'amino_acid'

This IS an index on the column I named as the PRIMARY KEY, not on some oid, isn't it?

With my simple-minded pygresql script, I have one INSERT statement per atom, and of course it crawls very feebly. Perhaps there's an insert_many method I haven't found yet (the docs suck, I will keep looking). Or perhaps I should write the rows out to a text file and use COPY?

Any advice?

cheers
Cath

Cath Lawrence, Cath.Lawrence@anu.edu.au
Senior Scientific Programmer, Centre for Bioinformation Science,
John Curtin School of Medical Research (room 4088)
Australian National University, Canberra ACT 0200
ph: (02) 61257959  mobile: 0421-902694  fax: (02) 61252595
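One easy win before reaching for COPY: by default every statement runs in its own transaction, so each per-atom INSERT pays a full commit. Below is a minimal sketch of batching the inserts inside one explicit transaction, assuming the classic PyGreSQL `pg` module; the database name 'proteins' and the sample rows are invented for illustration.

    import pg  # classic PyGreSQL interface

    db = pg.connect(dbname='proteins')  # hypothetical database name

    # Sample rows (made up): (protein_id, chain_id, amino_acid_id,
    # atom_type, element, x, y, z); id comes from the sequence default.
    atoms = [
        ('1ABC', 'A', 42, 'CA', 'C', 12.3, 4.5, 6.7),
        ('1ABC', 'A', 42, 'CB', 'C', 13.1, 5.0, 7.2),
    ]

    # One transaction around the whole loop instead of one per INSERT
    db.query('BEGIN')
    for p, c, aa, at, el, x, y, z in atoms:
        db.query(
            "INSERT INTO atom (protein_id, chain_id, amino_acid_id, "
            "atom_type, element, x, y, z) "
            "VALUES ('%s', '%s', %d, '%s', '%s', %f, %f, %f)"
            % (p, c, aa, at, el, x, y, z)
        )
    db.query('COMMIT')
    db.close()

Even a single big transaction usually cuts load time substantially, though COPY (sketched next) is faster still because the server parses one command and streams the rows.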
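And a sketch of the text-file-plus-COPY route raised above, assuming COPY's default text format (tab-separated columns, one row per line, in the table's column order); the file name atoms.tsv and the sample rows are invented.

    # Dump rows to a tab-separated file, then bulk-load with COPY.
    atoms = [
        (1, '1ABC', 'A', 42, 'CA', 'C', 12.3, 4.5, 6.7),
        (2, '1ABC', 'A', 42, 'CB', 'C', 13.1, 5.0, 7.2),
    ]

    with open('atoms.tsv', 'w') as f:
        for row in atoms:
            # COPY's default text format: tab-separated, one row per line
            f.write('\t'.join(str(v) for v in row) + '\n')

    # Then load it from psql (client-side copy):
    #   \copy atom FROM 'atoms.tsv'
    # or server-side, if the backend can read the file:
    #   COPY atom FROM '/absolute/path/to/atoms.tsv';

Note that `\copy` in psql reads the file on the client side, so it avoids the superuser and server-filesystem requirements of a plain `COPY ... FROM 'file'`.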