Re: Need to update all my 60 million rows at once without transactional integrity
От | Tom Allison |
---|---|
Тема | Re: Need to update all my 60 million rows at once without transactional integrity |
Дата | |
Msg-id | BB3A6D34-4DBD-4347-BAA5-9947FE132BA9@tacocat.net обсуждение исходный текст |
Ответ на | Re: Need to update all my 60 million rows at once without transactional integrity (Simon Riggs <simon@2ndquadrant.com>) |
Список | pgsql-general |
Far from being an expert on postgres, but there are two ideas-- assuming that you cannot afford the time it would take to simply UPDATE and wait... Write a script to update all the rows, one at a time. Lowest impact to operations but would take a very long time. Assuming you have a sequence primary key value on each row, update by ID blocks on the order of 10,000's or 100,000's at a time (or more). This is a balancing act between time to complete and immediate load on the server. I've used both options but I don't think I've exceeded 15 million rows. Sent from my iPhone. On Apr 23, 2008, at 2:15 PM, Simon Riggs <simon@2ndquadrant.com> wrote: > On Mon, 2008-04-21 at 00:19 +0200, christian_behrens@gmx.net wrote: > >> How can I make a Update of a column in a very large table for all >> rows >> without using the double amount of disc space and without any need >> for >> atomic operation? >> >> I have a very large table with about 60 million rows. I sometimes >> need >> to do a simple update to ALL rows that resets a status-flag to zero. >> >> I don't need to have transactional integrity (but of course if the >> system crashes, there should be no data corruption. > > No such thing. Without transactions you have no sensible definition of > what constitutes data corruption. > >> A separate flag in the file system can well save the fact that that >> bulk update was in progress) for this, I don't care or want a abort >> or >> "all or nothing" like SQL mandates. The idea is basically that either >> this update succeeds or it succeeds or - there is no "not". It must >> succeed. It must be tried until it works. If the update is halfway >> finished, that's okay. > > Don't reset them to zero, just redefine the meaning of the counter. > Take > the max value in the table and then have the app understand that > anything <= the previous max value means the same thing as whatever > "status = 0" means now. The app would need to read the new baseline > value before performing any work. > > This is roughly the technique used by Slony to avoid needing to update > every row in the log table to show that it has successfully replicated > it. It's also the technique MVCC relies upon internally. > > It's less work and crash safe in all cases. > > -- > Simon Riggs > 2ndQuadrant http://www.2ndQuadrant.com > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
В списке pgsql-general по дате отправления: