VictorOps at AWS re:Invent 2014
Taming the Jungle of Tools
As resources grow, monitoring tools for both systems and applications proliferate. The typical enterprise uses anywhere from three to a dozen monitoring tools, so during an outage service personnel must open and watch multiple systems to figure out what is going on and craft a fix. Heavy automation compounds the problem: it can generate multiple false alarms and warnings, in a stream of information that may go to the wrong person, resulting in non-actionable noise. The VictorOps rules engine allows intelligent de-rating of alarms while preserving the complete stream for diagnostic purposes.