PROPOSAL: tracking aggregated numbers from pg_stat_database
От | Tomas Vondra |
---|---|
Тема | PROPOSAL: tracking aggregated numbers from pg_stat_database |
Дата | |
Msg-id | 51607CC2.9080702@fuzzy.cz обсуждение исходный текст |
Ответы |
Re: PROPOSAL: tracking aggregated numbers from pg_stat_database
Re: PROPOSAL: tracking aggregated numbers from pg_stat_database |
Список | pgsql-hackers |
Hi, I'm regularly using pg_stat_database view to analyze various aspects of behavior of the cluster. The #1 issue I'm constantly running into is that to get cluster-level view (across all the databases), the table needs to be aggregated like this: SELECT SUM(blks_hit) blks_hit, SUM(blks_read) blks_read FROM pg_stat_database This more or less works in stable environments, but once you start dropping databases (think of hosting with shared DB server) it gets unusable because after DROP DATABASE the database suddenly disappears from the sum. Therefore I do propose tracking the aggregated stats, similar to the pg_stat_bgwriter view. This does not require new messages (thanks to reuse of the existing messages), and I expect the overhead to be negligible (a few bytes of storage, minimal CPU). I think it does not make sense to merge this into pg_stat_bgwriter, creating a new view (can't think of a good name though), seems like a much better choice to me. And now a bit more detailed explanation of the issues ... Analysis is usually based on comparing two snapshots (say a few minutes apart), and this makes is rather much more difficult because the dropped databases suddenly disappear from the second snapshot. Say for example there are two databases, A and B, with stats snapshotted at T1 and T2. The database B is dropped sometimes between the snapshots. So the snaphots look like this: time | db | blks_read | bkls_hit ----------------------------------- T1 | A | 1000000 | 500000 T1 | B | 1000000 | 500000 T2 | A | 1500000| 750000 Now, the aggregated data look like this: time | blks_read | bkls_hit ------------------------------ T1 | 2000000 | 1000000 T2 | 1500000 | 750000 So the difference (T2-T1) is blks_read | bkls_hit ---------------------- -500000 | -250000 Yes, negative values do not make much sense. It's very difficult to detect such behavior and account for that. It might be possible to solve (some of) the issues with elaborate snapshotting system, but it's awkward / difficult to use. Adding a new system view works much nicer. regards Tomas
В списке pgsql-hackers по дате отправления: