Re: large dataset with write vs read clients
From | Craig Ringer
Subject | Re: large dataset with write vs read clients
Date |
Msg-id | 4CB16080.1050406@postnewspapers.com.au
In reply to | Re: large dataset with write vs read clients (Mladen Gogala <mladen.gogala@vmsinfo.com>)
Responses | Re: large dataset with write vs read clients
List | pgsql-performance
On 10/10/2010 5:35 AM, Mladen Gogala wrote:
> I have a logical problem with asynchronous commit. The "commit" command
> should instruct the database to make the outcome of the transaction
> permanent. The application should wait to see whether the commit was
> successful or not. Asynchronous behavior in the commit statement breaks
> the ACID rules and should not be used in a RDBMS system. If you don't
> need ACID, you may not need RDBMS at all. You may try with MongoDB.
> MongoDB is web scale: http://www.youtube.com/watch?v=b2F-DItXtZs

That argument makes little sense to me. Because you can afford a clearly defined and bounded loosening of the durability guarantee provided by the database, such that you know and accept the possible loss of x seconds of work if your OS crashes or your UPS fails, this means you don't really need durability guarantees at all - let alone all that atomic commit silliness, transaction isolation, or the guarantee of a consistent on-disk state?

Some of the other flavours of non-SQL databases, both those that've been around forever (PICK/UniVerse/etc, Berkeley DB, Cache, etc) and those that're new and fashionable (Cassandra, CouchDB, etc), provide some ACID properties anyway. If you don't need/want an SQL interface to your database, you don't have to throw out all that other database-y goodness if you haven't been drinking too much of the NoSQL kool-aid.

There *are* situations in which it's necessary to switch to relying on distributed, eventually-consistent databases with non-traditional approaches to data management. It's awfully nice not to have to, though; being forced to can mean a lot of wheel reinvention when it comes to querying, analysing and reporting on your data.

FWIW, a common approach in this sort of situation has historically been - accepting that RDBMSs aren't great at continuous fast loading of individual records - to log the records in batches to a flat file, Berkeley DB, etc as a staging point. You periodically rotate that file out and bulk-load its contents into the RDBMS for analysis and reporting. This doesn't have to be every hour - every minute is usually pretty reasonable, and still gives your database a much easier time without forcing you to modify your app to batch inserts into transactions or anything like that.
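Coming back to the asynchronous commit point: in PostgreSQL that bounded loosening is a per-transaction knob, not an all-or-nothing design decision. A minimal sketch (the table and column names here are made up for illustration):

  -- Trade a bounded window of durability for commit latency, for this
  -- transaction only. Atomicity, isolation and consistency still hold.
  BEGIN;
  SET LOCAL synchronous_commit TO OFF;
  INSERT INTO sensor_log (sensor_id, reading) VALUES (42, 17.3);
  COMMIT;  -- returns without waiting for the WAL flush; an OS crash can
           -- lose this transaction, but cannot corrupt the database.

An OS crash can only cost you the last small multiple of wal_writer_delay worth of such transactions - exactly the "clearly defined and bounded" loss described above. Everything else about the transaction's behaviour is unchanged.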
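The staging-file approach needs nothing exotic on the database side either. Assuming the rotated file is in a COPY-friendly format (the file path and table are again invented for the example):

  -- Run every minute or so, after rotating the staging file out:
  COPY sensor_log (sensor_id, reading)
      FROM '/var/spool/myapp/batch.csv' CSV;

Server-side COPY needs appropriate privileges; psql's \copy does the same job from the client end. Either way, one bulk load of a few thousand rows is far cheaper than a few thousand individual INSERTs, each with its own commit.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/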