Re: Importing Large Amounts of Data
От | Curt Sampson |
---|---|
Тема | Re: Importing Large Amounts of Data |
Дата | |
Msg-id | Pine.NEB.4.43.0204151655080.439-100000@angelic.cynic.net обсуждение исходный текст |
Ответ на | Re: Importing Large Amounts of Data ("Christopher Kings-Lynne" <chriskl@familyhealth.com.au>) |
Ответы |
Re: Importing Large Amounts of Data
|
Список | pgsql-hackers |
On Mon, 15 Apr 2002, Christopher Kings-Lynne wrote: > OK, well now it depends on what kind of selects you're doing. Do you > regularly select over a certain subset of the data, in which case using > partial indices might give you significant speedup. I believe from the information I've been given that we will indeed be regularly selecting over certain subsets, based on day. (One of the test queries I've been asked to use selects based on user_id and a date range.) But I was intending to partition the tables based on date range (to keep the index rebuild time from getting completely out of hand), so that will handily take care of that requirement anyway. > Do you select functions of columns? No. > It depends on your definition. You have to accept a certain overhead if > you're to have data integrity and MVCC. If you can't handle that overhead, > then you can't have data integrity and vice versa. Well, a few points: a) I am not convinced that data integrity should cost a five-fold decrease in performance, b) In fact, at times I don't need that data integrity. I'm prefectly happy to risk the loss of a table during import, ifit lets me do the import more quickly, especially if I'm taking the database off line to do the import anyway. MS SQL serverin fact allows me to specify relaxed integrity (with attendant risks) when doing a BULK IMPORT; it would be cool ifPostgres allowed that to. > BTW, instead of: > > CREATE UNIQUE INDEX bigone_pkey ON bigone (rec_no); > > do: > > ALTER TABLE bigone ADD PRIMARY KEY(rec_no); > > And remember to run "VACUUM ANALYZE bigone;" or just "ANALYZE bigone;" after > the COPY and before trying to use the table. I'm not sure if it's better to > analyze before or after the indexes are added, but it's definitely better to > vaccum before the indexes are added. Thanks. This is the kind of useful information I'm looking for. I was doing a vacuum after, rather than before, generating the indices. cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org Don't you know, in this new Dark Age, we're alllight. --XTC
В списке pgsql-hackers по дате отправления: