optimized counting of web statistics
От | Billy extyeightysix |
---|---|
Тема | optimized counting of web statistics |
Дата | |
Msg-id | 22455e7050628133978b17cd4@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: tricky query ("Merlin Moncure" <merlin.moncure@rcsonline.com>) |
Ответы |
Re: optimized counting of web statistics
Re: optimized counting of web statistics |
Список | pgsql-performance |
Hola folks, I have a web statistics Pg database (user agent, urls, referrer, etc) that is part of an online web survey system. All of the data derived from analyzing web server logs is stored in one large table with each record representing an analyzed webserver log entry. Currently all reports are generated when the logs are being analyzed and before the data ever goes into the large table I mention above. Well, the time has come to build an interface that will allow a user to make ad-hoc queries against the stats and that is why I am emailing the performance list. I need to allow the user to specify any fields and values in a query. For example, "I want to see a report about all users from Germany that have flash installed" or "I want to see a report about all users from Africa that are using FireFox 1" I do not believe that storing all of the data in one big table is the correct way to go about this. So, I am asking for suggestions, pointers and any kind of info that anyone can share on how to store this data set in an optimized manner. Also, I have created a prototype and done some testing using the colossal table. Actually finding all of the rows that satisfy the query is pretty fast and is not a problem. The bottleneck in the whole process is actually counting each data point (how many times a url was visited, or how many times a url referred the user to the website). So more specifically I am wondering if there is way to store and retrieve the data such that it speeds up the counting of the statistics. Lastly, this will become an open source tool that is akin to urchin, awstats, etc. The difference is that this software is part of a suite of tools for doing online web surveys and it maps web stats to the survey respondent data. This can give web site managers a very clear view of what type of people come to the site and how those types use the site. Thanks in advance, exty
В списке pgsql-performance по дате отправления: