Re: Insert performance (OT?)
От | Richard Huxton |
---|---|
Тема | Re: Insert performance (OT?) |
Дата | |
Msg-id | 42DCCA3B.8070100@archonet.com обсуждение исходный текст |
Ответ на | Insert performance (OT?) (Yves Vindevogel <yves.vindevogel@implements.be>) |
Список | pgsql-performance |
Yves Vindevogel wrote: > Hi, > > Suppose I have a table with 4 fields (f1, f2, f3, f4) > I define 2 unique indexes u1 (f1, f2, f3) and u2 (f1, f2, f4) > > I have 3 records > A, B, C, D (this will be inserted) > A, B, C, E (this will pass u2, but not u1, thus not inserted) > A, B, F, D (this will pass u1, but not u2, thus not inserted) Are you saying you want to know whether they will be inserted before you try to do so? > Now, for performance ... > > I have tables like this with 500.000 records where there's a new upload > of approx. 20.000 records. > It is only now that we say index u2 to be necessary. So, until now, I > did something like insert into ... select f1, f2, f2, max(f4) group by > f1, f2, f3 > That is ok ... and also logically ok because of the data definition I'm confused here - assuming you meant "select f1,f2,f3", then I don't see how you guarantee the row doesn't alredy exist. > I cannot do this with 2 group by's. I tried this on paper and I'm not > succeeding. I don't see how you can have two group-by's, or what that would mean if you did. > So, I must use a function that will check against u1 and u2, and then > insert if it is ok. > I know that such a function is way slower that my insert query. So - you have a table, called something like "upload" with 20,000 rows and you'd like to know whether it is safe to insert them. Well, it's easy enough to identify which ones are duplicates. SELECT * FROM upload JOIN main_table ON u1=f1 AND u2=f2 AND u3=f3; SELECT * FROM upload JOIN main_table ON u1=f1 AND u2=f2 AND u3=f4; Are you saying that deleting these rows and then inserting takes too long? -- Richard Huxton Archonet Ltd
В списке pgsql-performance по дате отправления: