Re: Duplicate deletion optimizations
From | Marc Eberhard |
---|---|
Subject | Re: Duplicate deletion optimizations |
Date | |
Msg-id | CAPaGL57q3h+Z7d=aRSzPrx4Uu0A9w8+eMZ+p=B1FkHKFe22KoA@mail.gmail.com |
In reply to | Re: Duplicate deletion optimizations (Samuel Gendler <sgendler@ideasculptor.com>) |
Responses | Re: Duplicate deletion optimizations |
List | pgsql-performance |
Hi Samuel!

On 6 January 2012 20:02, Samuel Gendler <sgendler@ideasculptor.com> wrote:
> Have you considered doing the insert by doing a bulk insert into a temp
> table and then pulling rows that don't exist across to the final table in
> one query and updating rows that do exist in another query? I did a very
> brief scan of the SO thread and didn't see it suggested. Something like
> this:
>
> update stats_5mn set count = count + t.count
> from temp_table t
> where stats_5mn.t_value = t.t_value and stats_5mn.t_record = t.t_record
> and stats_5mn.output_id = t.output_id;
>
> insert into stats_5mn
> select * from temp_table t
> where not exists (
>   select 1 from stats_5mn s
>   where s.t_value = t.t_value and s.t_record = t.t_record
>     and s.output_id = t.output_id
> );
>
> drop table temp_table;

Am I right to assume that the update/insert needs to be placed in a
begin/end transaction block if such batch uploads might happen
concurrently? That doesn't seem to be the case for the question at hand,
but I like the solution and wonder whether it works under more general
circumstances.

What's the overhead of creating and dropping a temporary table? Is it
only worth doing this for a large number of inserted/updated rows? What
if the number of inserts/updates is only a dozen at a time on a large
table (>10M rows)?

Thanks,
Marc
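P.S. For concreteness, here is a rough, untested sketch of how I picture
the whole batch wrapped in one transaction. The table and column names
come from Samuel's example; the column types, the ON COMMIT DROP clause
and the explicit lock are my own assumptions:

BEGIN;

-- Temp table lives only for this transaction; ON COMMIT DROP means we
-- never have to drop it explicitly (assumed types, adjust to the real schema).
CREATE TEMPORARY TABLE temp_table (
    t_value   integer,
    t_record  integer,
    output_id integer,
    count     bigint
) ON COMMIT DROP;

-- Bulk-load the batch here, e.g. with COPY temp_table FROM STDIN;

-- Serialise concurrent batch uploads: SHARE ROW EXCLUSIVE conflicts with
-- itself, so two batches cannot both pass the NOT EXISTS test below and
-- try to insert the same row.
LOCK TABLE stats_5mn IN SHARE ROW EXCLUSIVE MODE;

UPDATE stats_5mn
   SET count = count + t.count
  FROM temp_table t
 WHERE stats_5mn.t_value   = t.t_value
   AND stats_5mn.t_record  = t.t_record
   AND stats_5mn.output_id = t.output_id;

INSERT INTO stats_5mn
SELECT t.*
  FROM temp_table t
 WHERE NOT EXISTS (
        SELECT 1
          FROM stats_5mn s
         WHERE s.t_value   = t.t_value
           AND s.t_record  = t.t_record
           AND s.output_id = t.output_id);

COMMIT;

Is that roughly how you would do it, or is the explicit lock unnecessary
(or too heavy) when each batch is only a dozen rows?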