Re: RFC: Timing Events
От | Satoshi Nagayasu |
---|---|
Тема | Re: RFC: Timing Events |
Дата | |
Msg-id | 5096352B.4020108@uptime.jp обсуждение исходный текст |
Ответ на | Re: RFC: Timing Events (Josh Berkus <josh@agliodbs.com>) |
Ответы |
Re: RFC: Timing Events
Re: RFC: Timing Events |
Список | pgsql-hackers |
(2012/11/03 10:44), Josh Berkus wrote: > >> I don't see all that going into core without a much bigger push than I >> think people will buy. What people really want for all these is a >> proper trending system, and that means graphs and dashboards and >> bling--not a history table. > > Well, I'm particularly thinking for autoconfiguration. For example, to > set vacuum_freeze_min_age properly, you have to know the XID "burn rate" > of the server, which is only available via history. I really don't want > to be depending on a graphical monitoring utility to find these things out. > >> This whole approach has the assumption that things are going to fall off >> sometimes. To expand on that theme for a second, right now I'm more >> worried about the "99%" class of problems. Neither pg_stat_statements >> nor this idea are very good for tracking the rare rogue problem down. >> They're both aimed to make things that happen a lot more statistically >> likely to be seen, by giving an easier UI to glare at them frequently. >> That's not ideal, but I suspect really fleshing the whole queue consumer >> -> table idea needs to happen to do much better. > > I'm just concerned that for some types of incidents, it would be much > more than 1% *of what you want to look at* which fall off. For example, > consider a server which does 95% reads at a very high rate, but has 2% > of its writes cronically having lock waits. That's something you want > to solve, but it seems fairly probably that these relatively infrequent > queries would have fallen off the bottom of pg_stat_statements. Same > thing with the relative handful of queries which do large on-disk sorts. > > The problem I'm worried about is that pg_stat_statements is designed to > keep the most frequent queries, but sometimes the thing you really need > to look at is not in the list of most frequent queries. I think auto_explain would help you solve such rare incidents if it could dump several statistics into server log, including lock waits and block reads/writes statistic per-session, for example. Do we have something to add to auto_explain? Regards, -- Satoshi Nagayasu <snaga@uptime.jp> Uptime Technologies, LLC. http://www.uptime.jp
В списке pgsql-hackers по дате отправления: