Re: Database Caching
От | F Harvell |
---|---|
Тема | Re: Database Caching |
Дата | |
Msg-id | 200203051635.g25GZW520476@odin.fts.net обсуждение исходный текст |
Ответ на | Re: Database Caching (Hannu Krosing <hannu@krosing.net>) |
Список | pgsql-hackers |
On 01 Mar 2002 10:24:39 +0500, Hannu Krosing wrote: > ... > > > I don't see how result caching can be a win, since it can be done when > > needed anyway, without adding complexity to the database engine. Just > > have the application cache the result set. Certainly a web server could > > do this, if needed. > > More advanced application server can do it all right. But you still need > sound cache invalidation mechanisms. > > > If there were a way to mark a database as read only (and this could not > > be changed unless the entire database is shut down and restarted in > > read/write mode) then there might be some utility to result set cache. > > Otherwise, I think it will be wasted effort. It might be worthwhile to > > do the same for individual tables (with the same sort of restrictions). > > But think of all the effort that would be needed to do this properly, > > and what sort of payback would be received from it? > > The payback will be blazingly fast slashdot type applications with very > little effort from end application programmer. > > > Again, the same goals can easily be accomplished without having to > > perform major surgery on the database system. I suspect that there is > > some logical reason that Oracle/Sybase/IBM/Microsoft have not bothered > > with it. > > I think they were designed/developed when WWW was nonexistent, and > client-server meant a system where client and server where separated by > a (slow) network connection that would negate most of the benefit from > server side cacheing. On todays application server scenario the client > (AS) and server (DB) are usually very close, if not on the same computer > and thus effectively managed cache can be kept on DB to avoid all the > cache invalidation logic going through DB-AS link. > > ... You have identified one of the most valuable reasons for query/result caching. As I read the threads going on about caching, it reminds me of the arguments within MySQL about the need for referential integrity and transactions. Yes, the application can do it, however, by having the database do it, it frees the application programmer to perform more important tasks (e.g., more features). Your insights about the caching, etc. in the older, legacy databases is very likely the case. With the development of high speed networking, etc., it is now very feasible to move the caching closer to the sources of change/invalidation. Quite frankly, while certainly there are literally millions of applications that would find minimal value in query/results caching, I know that, at least for web applications, that the query/results caching _would_ be very valuable. Certainly, as has been pointed out several times, the application can do the caching, however, utilizing todays "generic" tools such as Apache and PHP, it is relatively difficult. By moving caching to the database server, there is little that needs to be done by the application to realize the benefits. For example, with my typical application, as a straight dynamic application, the process would be approximately: 1) receive the request "http://www.xyz.com/index.php" 2) query the current page contents from the database: select * from table where dateline = '2002-03-05'; 3) begin the HTML output 4) loop through the select results printing the contents 5) complete the HTML output With caching within the database, this would likely achieve performance good enough to serve millions of requests per day. (This is the case for some of our sites using Intersystems Cache which does do query caching and serves over 2 million page views per day.) Without the database caching, it is necessary to have the application perform this function. Because of the short life of an Apache/PHP process, caching needs to be performed externally to the application: 1) receive the request "http://www.xyz.com/index.php" 2) check the age of the cache file (1min, 5min ???): 3) if the cache is not fresh: 3.1) lock the cache file 3.2) query the database 3.3) begin HTML output to the cache file 3.4) loop through select results 3.5) complete the HTML output to the cache file 3.6) unlock the cache file 4) open the cache file (i.e., wait for locked file) 5) read/passthrough cache contents (Please note that this is one, simplified way to do the caching. It assumes that the data becomes stale over time and needs to be refreshed. It also uses a file cache instead of a memory cache. Certainly things could be made more efficient through the use of notifications from the database and the use of shared memory. Another path would be to have an external program generating the pages and then placing the "static" pages into service (which requires changes to apache and/or the OS due to cache file invalidation during the generation process). Both paths make the application more complex requiring more programming time.) It is possible to have an application server do this caching for you. Of course, that in turn has its own complications. For applications that do not need the added "features" of an application server, database caching can be a big win in both processing/response time as well as programmer time. Thanks, F Harvell
В списке pgsql-hackers по дате отправления: