Monthly Archives

So maybe the consequences of an outage in your infrastructure will not be as calamitous as the BP oil spill in the Gulf, but the effect on your enterprise may feel like it. Which is why we could all use BP as a case study in how not to treat monitoring.

News reports detail allegations that some of the alarms on the failed rig were “inhibited” because “they did not want people to wake up at 3 a.m. due to false alarms”. The rig’s operator responded, “Repeated false alarms increase risk and decrease rig safety”.


World Cup Soccer Monitoring


You (or at least I) wouldn’t think World Cup fever would affect the more mundane world of data center monitoring, unless you work for Twitter or some other web presence overloaded by unprecedented surges in network traffic from people checking in on, and commenting about, the World Cup. Twitter found that having a network that “wasn’t appropriately being monitored” supporting two critical components was not a good idea, and it led to an outage (or several fail whales).

Twitter is addressing the issues, including its monitoring. (As we’ve noted before, if monitoring did not alert about an incident at the earliest possible time, the incident should never be closed until the monitoring is improved to warn of the issue in advance.)

In our view, the best monitoring approach is to have your monitoring software assume everything should be monitored to production levels, automatically, without needing your Ops team to tell it to.
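To make the "monitor everything by default" idea concrete, here is a minimal sketch in Python. It is not how any particular monitoring product works: the `Device` class, the `monitoring_disabled` flag, and the threshold values are all hypothetical, chosen only to show that a newly discovered device gets production-level alerting unless someone deliberately opts it out.

```python
from dataclasses import dataclass

# Hypothetical production thresholds; real values depend on your environment.
PRODUCTION_DEFAULTS = {
    "cpu_percent": 90,
    "disk_percent": 85,
    "packet_loss_percent": 1,
}


@dataclass
class Device:
    name: str
    # Opt-out flag (an assumption for this sketch): monitoring is ON unless
    # someone deliberately turns it off.
    monitoring_disabled: bool = False


def alert_thresholds(device: Device) -> dict:
    """Return the thresholds applied to a discovered device."""
    if device.monitoring_disabled:
        return {}
    # Copy so callers cannot mutate the shared defaults.
    return dict(PRODUCTION_DEFAULTS)


def breached(device: Device, metrics: dict) -> list:
    """List the metric names that meet or exceed their threshold."""
    thresholds = alert_thresholds(device)
    return sorted(
        name
        for name, limit in thresholds.items()
        if metrics.get(name, 0) >= limit
    )
```

The design choice is that the safe state is the default: forgetting to configure a new switch leaves it monitored, whereas an opt-in scheme would leave it silent.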

Repel your own flying whales by using data center monitoring automation as much as possible.  Let the humans go play soccer, or blow their vuvuzelas!
