Re: ETL optimization
От | Bricklen Anderson |
---|---|
Тема | Re: ETL optimization |
Дата | |
Msg-id | 42BB31F1.7000303@PresiNET.com обсуждение исходный текст |
Ответ на | Re: ETL optimization (Jacques Caron <jc@directinfos.com>) |
Список | pgsql-performance |
Jacques Caron wrote: > > I have a similar situation, and the solution I use (though I haven't > really tested many different situations): > - have a trigger ON INSERT which does: > UPDATE set whatever_value=NEW.whatever_value,... WHERE > whatever_key=NEW.whatever.key AND... > IF FOUND THEN > RETURN NULL; > ELSE > RETURN NEW; > END IF; > - use COPY > > For optimal performance, a different trigger function is created for > each table, which allows the query plan of the UPDATE to be cached. > > Let us know how that works out for you and if you find a better solution! > > Jacques. > Hi Jacques, thanks for the suggestion. I've previously tested triggers under a variety of situations and there was no way that they would work under the load we currently have, and the much greater load that we will be expecting soon (~40x increase in data). I'm in the process of testing the delete scenario right now, and at first blush seems to perform fairly well. 2.5 million rows before aggregation, and 171000 after, in a little under 7 minutes. Currently testing again with about 18.5 million rows. A drawback by using the delete method is that we cannot do any of the aggregation incrementally, but so far that hasn't been a big help anyways. I still need to test the performance of concurrent querying against the destination table whilst the aggregation is occurring. -- _______________________________ This e-mail may be privileged and/or confidential, and the sender does not waive any related rights and obligations. Any distribution, use or copying of this e-mail or the information it contains by other than an intended recipient is unauthorized. If you received this e-mail in error, please advise me (by return e-mail or otherwise) immediately. _______________________________
В списке pgsql-performance по дате отправления: