All the things wrong with monitoring today – Part 1
Monitoring today sucks. Big time. It sucks so bad, it’s not even funny. The amount of time spent configuring stuff and dealing with problems when it’s already too late is staggering, and so is the number of things your monitoring system could be monitoring, but isn’t. I’ll be spending a couple of posts whining about this. Who knows? Maybe I’ll have a solution up my sleeve by the end of it.
Active vs. passive monitoring
Active monitoring. It sounds cool. Way cooler than passive. Most of the time, if you have a choice between an active and a passive something, you go with the active one, right? Well, not this time.
The number of times I’ve seen people set up their monitoring system to access an HTTP URL specially crafted to do nothing but respond to the probe as quickly as possible is ridiculous. It’s active, sure, but it’s almost entirely useless. If this is a service no one uses, it’s probably fine, but if this is a service that has almost any sort of real world use, then in the customary 5 minutes between each of these “pings” there will have been dozens, scores, if not hundreds or thousands of actual requests. Requests that actually did something. Requests that exercised your service at least to some extent. Sadly, this information is almost universally ignored.
Telling Apache to log the amount of time it took to serve a request is trivial. Collecting this information is trivial. Feeding that data to your monitoring system (if not on a per-request basis, just a maximum request time over the last 10 seconds would be a vast improvement) really shouldn’t be too hard. So why don’t you?
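To make that a bit more concrete, here is a rough sketch of the collecting part. It assumes Apache's LogFormat ends with `%D` (the time taken to serve the request, in microseconds), for example `LogFormat "%h %l %u %t \"%r\" %>s %b %D"`. The function names, the sample log lines and the 10 second window are all made up for illustration, and how you hand the resulting number to your monitoring system is left out, since that depends entirely on what you run.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: access log lines end with the %D field (microseconds), as in
#   LogFormat "%h %l %u %t \"%r\" %>s %b %D"
LINE_RE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "(?P<req>[^"]*)" (?P<status>\d{3}) \S+ (?P<usec>\d+)$'
)


def parse_line(line):
    """Return (timestamp, microseconds) for a log line, or None if it doesn't match."""
    match = LINE_RE.search(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return timestamp, int(match.group("usec"))


def max_request_time(lines, now, window_seconds=10):
    """Return the slowest request (in microseconds) seen in the last
    window_seconds before `now`, or None if there were no requests."""
    cutoff = now - timedelta(seconds=window_seconds)
    worst = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip lines that don't match the assumed format
        timestamp, usec = parsed
        if cutoff < timestamp <= now and (worst is None or usec > worst):
            worst = usec
    return worst


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real server.
    sample = [
        '127.0.0.1 - - [10/Oct/2011:13:55:20 +0000] "GET /c HTTP/1.1" 200 512 999999',
        '127.0.0.1 - - [10/Oct/2011:13:55:36 +0000] "GET /b HTTP/1.1" 200 512 98000',
    ]
    now = datetime(2011, 10, 10, 13, 55, 40, tzinfo=timezone.utc)
    print(max_request_time(sample, now))
```

In real life you would tail the log rather than re-read it, but the point stands: the number you want is already sitting in a file on your web server.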