MDI: Monitoring Driven Infrastructure?

Adam and I attended the London Infracoders Meetup last night, which featured a demo of serverspec and Beaker. When I asked Adam what he thought, he wasn’t impressed*. “I don’t see the point of this if you’re using Puppet unless you’re worried about typos or you don’t control your production monitoring. It duplicates what you already have with Puppet’s DSL, which is why we stopped using rspec-puppet in the first place.”

I realized that Adam was correct: the sort of automated tests I’m used to as a developer are functionally equivalent to the monitoring checks system administrators already rely on heavily. (An easy leap for me to make, since Michael Bolton already convinced me that automated tests are checks, not testing.) Over time I’ve seen testing migrate from unit tests on the developer desktop to tests further and further down the deployment pipeline. This led me to wonder how far monitoring could make the same march, but in reverse.

We already use our production monitoring in our test environments, but we don’t tend to use it in our local Puppet development environments. But why? Couldn’t we use our monitoring in TDD fashion? Write a monitoring check, see it fail, then write the Puppet code to make it pass? (This is the same motivation as using your tests as monitoring, as with Cucumber-Nagios, but working in the other direction.)
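To make that red/green loop concrete, here is a rough sketch of what the "check first" step might look like. It is purely illustrative and not something we run: the names `check_tcp` and `run_check`, and the idea of checking a TCP port, are made up for the example. The only real convention it leans on is the Nagios plugin exit codes (0 for OK, 2 for CRITICAL).

```python
import socket

# Nagios plugin exit-code convention: 0 = OK, 2 = CRITICAL.
OK, CRITICAL = 0, 2


def check_tcp(host, port, timeout=2.0):
    """Hypothetical check: can we open a TCP connection to host:port?"""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"OK - {host}:{port} is accepting connections"
    except OSError as exc:
        return CRITICAL, f"CRITICAL - {host}:{port} unreachable: {exc}"


def run_check(host, port):
    """Print the one-line status message and return the plugin exit code."""
    status, message = check_tcp(host, port)
    print(message)
    return status
```

Wired up as a script with something like `sys.exit(run_check("127.0.0.1", 80))`, this should report CRITICAL on a fresh development VM (red), and OK once the Puppet code that installs and starts the service has been applied (green).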

We haven’t tried this experiment yet, but I’m curious whether this is a pattern others have attempted. If so, how did it work for you?

* Adam did think serverspec might be a plausible replacement for our existing NRPE checks: a way to structure our monitoring uniformly and to make it easier to develop the checks independently.